FOREWORD


In bygone centuries, our physical world appeared to be filled to the brim with mysteries. Divine powers
could provide for genuine miracles; water and sunlight could turn arid land into fertile pastures, but the 
same powers could lead to miseries and disasters. The force of life, the vis vitalis, was assumed to be the 
special agent responsible for all living things. The heavens, whatever they were for, contained stars and other 
heavenly bodies that were the exclusive domain of the Gods. 

Mathematics did exist, of course. Indeed, there was one aspect of our physical world that was recognised to 
be controlled by precise, mathematical logic: the geometric structure of space, elaborated to become a genuine 
form of art by the ancient Greeks. From my perspective, the Greeks were the first practitioners of ‘mathematical 
physics’, when they discovered that all geometric features of space could be reduced to a small number of 
axioms. Today, these would be called ‘fundamental laws of physics’. The fact that the flow of time could be 
addressed with similar exactitude, and that it could be handled geometrically together with space, was only 
recognised much later. And, yes, there were a few crazy people who were interested in the magic of numbers, 
but the real world around us seemed to contain so much more that was way beyond our capacities of analysis. 

Gradually, all this changed. The Moon and the planets appeared to follow geometrical laws. Galilei and 
Newton managed to identify their logical rules of motion, and by noting that the concept of mass could be 
applied to things in the sky just like apples and cannon balls on Earth, they made the sky a little bit more 
accessible to us. Electricity, magnetism, light and sound were also found to behave in complete accordance 
with mathematical equations. 

Yet all of this was just a beginning. The real changes came with the twentieth century. A completely new 
way of thinking, by emphasizing mathematical, logical analysis rather than empirical evidence, was pioneered 
by Albert Einstein. Applying advanced mathematical concepts, only known to a few pure mathematicians, to 
notions as mundane as space and time, was new to the physicists of his time. Einstein himself had a hard 
time struggling through the logic of connections and curvatures, notions that were totally new to him, but are 
only too familiar to students of mathematical physics today. Indeed, there is no better testimony of Einstein’s 
deep insights at that time, than the fact that we now teach these things regularly in our university classrooms. 

Special and general relativity are only small corners of the realm of modern physics that is presently being 
studied using advanced mathematical methods. We have notoriously complex subjects such as phase transitions in 
condensed matter physics, superconductivity, Bose-Einstein condensation, the quantum Hall effect, particularly 
the fractional quantum Hall effect, and numerous topics from elementary particle physics, ranging from fibre 
bundles and renormalization groups to supergravity, algebraic topology, superstring theory, Calabi-Yau spaces 
and what not, all of which require the utmost of our mental skills to comprehend them. 

The most bewildering observation that we make today is that it seems that our entire physical world 
appears to be controlled by mathematical equations, and these are not just sloppy and debatable models, but 
precisely documented properties of materials, of systems, and of phenomena in all echelons of our universe. 

Does this really apply to our entire world, or only to parts of it? Do features, notions, entities exist that are 
emphatically not mathematical? What about intuition, or dreams, and what about consciousness? What
about religion? Here, most of us would say, one should not even try to apply mathematical analysis, although 
even here, some brave social scientists are making attempts at coordinating rational approaches. 
No, there are clear and important differences between the physical world and the mathematical world.
Where the physical world stands out is the fact that it refers to ‘reality’, whatever ‘reality’ is. Mathematics is 
the world of pure logic and pure reasoning. In physics, it is the experimental evidence that ultimately decides 
whether a theory is acceptable or not. Also, the methodology in physics is different. 

A beautiful example is the serendipitous discovery of superconductivity. In 1911, the Dutch physicist Heike 
Kamerlingh Onnes was the first to achieve the liquefaction of helium, for which a temperature below 4.25 K 
had to be realized. Heike decided to measure the specific conductivity of mercury, a metal that is frozen solid 
at such low temperatures. But something appeared to go wrong during the measurements, since the voltmeter
did not show any voltage at all. All experienced physicists in the team assumed that they were dealing
with a malfunction. It would not have been the first time for a short circuit to occur in the electrical 
equipment, but, this time, in spite of several efforts, they failed to locate it. One of the assistants was 
responsible for keeping the temperature of the sample well within that of liquid helium, a dull job, requiring 
nothing else than continuously watching some dials. During one of the many tests, however, he dozed off. 
The temperature rose, and suddenly the measurements showed the normal values again. It then occurred to 
the investigators that the effect and its temperature dependence were completely reproducible. Below 4.19 
degrees Kelvin the conductivity of mercury appeared to be strictly infinite. Above that temperature, it is 
finite, and the transition is a very sudden one. Superconductivity was discovered (D. van Delft, “Heike
Kamerlingh Onnes”, Uitgeverij Bert Bakker, Amsterdam, 2005 (in Dutch)).

This is not the way mathematical discoveries are made. Theorems are not produced by assistants falling 
asleep, even if examples do exist of incidents involving some miraculous fortune. 

The hybrid science of mathematical physics is a very curious one. Some of the topics in this Encyclopedia 
are undoubtedly physical. High T_c superconductivity, breaking water waves, and magneto-hydrodynamics,
are definitely topics of physics where experimental data are considered more decisive than any high-brow 
theory. Cohomology theory, Donaldson-Witten theory, and AdS/CFT correspondence, however, are examples 
of purely mathematical exercises, even if these subjects, like all of the others in this compilation, are strongly 
inspired by, and related to, questions posed in physics. 

It is inevitable, in a compilation of a large number of short articles with many different authors, to see quite a 
bit of variation in style and level. In this Encyclopedia, theoretical physicists as well as mathematicians together 
made a huge effort to present in a concise and understandable manner their vision on numerous important 
issues in advanced mathematical physics. All include references for further reading. We hope and expect that 
these efforts will serve a good purpose. 


Gerard 't Hooft, 
Spinoza Institute, 


Utrecht University, 
The Netherlands. 


PREFACE 


Mathematical Physics as a distinct discipline is relatively new. The International Association of
Mathematical Physics was founded only in 1976. The interaction between physics and mathematics
has, of course, existed since ancient times, but the recent decades, perhaps partly because we are living 
through them, appear to have witnessed tremendous progress, yielding new results and insights at a dizzying 
pace, so much so that an encyclopedia seems now needed to collate the gathered knowledge. 

Mathematical Physics brings together the two great disciplines of Mathematics and Physics to the benefit of 
both, the relationship between them being symbiotic. On the one hand, it uses mathematics as a tool to 
organize physical ideas of increasing precision and complexity, and on the other it draws on the questions 
that physicists pose as a source of inspiration to mathematicians. A classical example of this relationship 
exists in Einstein's theory of relativity, where differential geometry played an essential role in the formulation 
of the physical theory while the problems raised by the ensuing physics have in turn boosted the development 
of differential geometry. It is indeed a happy coincidence that we are writing now a preface to an 
encyclopedia of mathematical physics in the centenary of Einstein's annus mirabilis. 

The project of putting together an encyclopedia of mathematical physics looked, and still looks, to us a 
formidable enterprise. We would never have had the courage to undertake such a task if we did not believe, 
first, that it is worthwhile and of benefit to the community, and second, that we would get the much-needed 
support from our colleagues. And this support we did get, in the form of advice, encouragement, and 
practical help too, from members of our Editorial Advisory Board, from our authors, and from others as well, 
who have given unstintingly so much of their time to help us shape this Encyclopedia. 

Mathematical Physics being a relatively new subject, it is not yet clearly delineated and could mean 
different things to different people. In our choice of topics, we were guided in part by the programs of recent 
International Congresses on Mathematical Physics, but mainly by the advice from our Editorial Advisory 
Board and from our authors. The limitations of space and time, as well as our own limitations, necessitated 
the omission of certain topics, but we have tried to include all that we believe to be core subjects and to cover 
as much as possible the most active areas. 

Our subject being interdisciplinary, we think it appropriate that the Encyclopedia should have certain 
special features. Applications of the same mathematical theory, for instance, to different problems in physics 
will have different emphasis and treatment. By the same token, the same problem in physics can draw upon 
resources from different mathematical fields. This is why we divide the Encyclopedia into two broad sections: 
physics subjects and related mathematical subjects. Articles in either section are deliberately allowed a fair 
amount of overlap with one another and many articles will appear under more than one heading, but all are 
linked together by elaborate cross referencing. We think this gives a better picture of the subject as a whole 
and will serve better a community of researchers from widely scattered yet related fields. 

The Encyclopedia is intended primarily for experienced researchers but should be of use also to beginning 
graduate students. For the latter category of readers, we have included eight elementary introductory articles for easy 
reference, with those on mathematics aimed at physics graduates and those on physics aimed at mathematics 
graduates, so that these articles can serve as their first port of call to enable them to embark on any of the main 
articles without the need to consult other material beforehand. In fact, we think these articles may even form the 
foundation of advanced undergraduate courses, as we know that some authors have already made such use of them.

In addition to the printed version, an on-line version of the Encyclopedia is planned, which will allow both 
the contents and the articles themselves to be updated if and when the occasion arises. This is probably a 
necessary provision in such a rapidly advancing field. 

This project was some four years in the making. Our foremost thanks at its completion go to the members 
of our Editorial Advisory Board, who have advised, helped and encouraged us all along, and to all our 
authors who have so generously devoted so much of their time to writing these articles and given us much 
useful advice as well. We ourselves have learnt a lot from these colleagues, and made some wonderful 
contacts with some among them. Special thanks are due also to Arthur Greenspoon whose technical expertise 
was indispensable. 

The project was started with Academic Press, which was later taken over by Elsevier. We thank warmly 
members of their staff who have made this transition admirably seamless and gone on to assist us greatly in 
our task: both Carey Chapman and Anne Guillaume, who were in charge of the whole project and have been 
with us since the beginning, and Edward Taylor responsible for the copy-editing. And Martin Ruck, who 
manages to keep an overwhelming amount of details constantly at his fingertips, and who is never known to 
have lost a single email, deserves a very special mention. 

As a postscript, we would like to express our gratitude to the very large number of authors who generously 
agreed to donate their honorariums to support the Committee for Developing Countries of the European 
Mathematical Society in their work to help our less fortunate colleagues in the developing world. 


Jean-Pierre Françoise 
Gregory L. Naber 
Tsou Sheung Tsun 


GUIDE TO USE OF THE ENCYCLOPEDIA 


Structure of the Encyclopedia


The material in this Encyclopedia is organised into two sections. At the start of Volume 1 are eight Introductory Articles. 
The introductory articles on mathematics are aimed at physics graduates; those on physics are aimed at mathematics 
graduates. It is intended that these articles should serve as the first port of call for graduate students, to enable them to 
embark on any of the main entries without the need to consult other material beforehand. 

Following the Introductory Articles, the main body of the Encyclopedia is arranged as a series of entries in alphabetical 
order. These entries fill the remainder of Volume 1 and all of the subsequent volumes (2-5). 

To help you realize the full potential of the material in the Encyclopedia we have provided four features to help you find 
the topic of your choice: a contents list by subject, an alphabetical contents list, cross-references, and a full subject index. 


1. Contents List by Subject 


Your first point of reference will probably be the contents list by subject. This list appears at the front of each volume, 
and groups the entries under subject headings describing the broad themes of mathematical physics. This will enable the 
reader to make quick connections between entries and to locate the entry of interest. The contents list by subject is divided 
into two main sections: Physics Subjects and Related Mathematics Subjects. Under each main section heading, you will 
find several subject areas (such as GENERAL RELATIVITY in Physics Subjects or NONCOMMUTATIVE GEOMETRY 
in Related Mathematics Subjects). Under each subject area is a list of those entries that cover aspects of that subject, 
together with the volume and page numbers on which these entries may be found. 

Because mathematical physics is so highly interconnected, individual entries may appear under more than one subject 
area. For example, the entry GAUGE THEORY: MATHEMATICAL APPLICATIONS is listed under the Physics Subject 
GAUGE THEORY as well as in a broad range of Related Mathematics Subjects. 


2. Alphabetical Contents List 


The alphabetical contents list, which also appears at the front of each volume, lists the entries in the order in which they 
appear in the Encyclopedia. This list provides both the volume number and the page number of the entry. 

You will find “dummy entries” where obvious synonyms exist for entries or where we have grouped together related
topics. Dummy entries appear in both the contents list and the body of the text. 


Example 
If you were attempting to locate material on path integral methods via the alphabetical contents list: 


PATH INTEGRAL METHODS see Functional Integration in Quantum Physics; Feynman Path Integrals 


The dummy entry directs you to two other entries in which path integral methods are covered. At the appropriate 
locations in the contents list, the volume and page numbers for these entries are given. 

If you were trying to locate the material by browsing through the text and you had looked up Path Integral Methods, 
then the following information would be provided in the dummy entry: 


Path Integral Methods see Functional Integration in Quantum Physics; Feynman Path Integrals 


3. Cross-References


All of the articles in the Encyclopedia have been extensively cross-referenced. The cross-references, which appear at the 
end of an entry, serve three different functions: 


i. To indicate if a topic is discussed in greater detail elsewhere.
ii. To draw the reader's attention to parallel discussions in other entries.
iii. To indicate material that broadens the discussion.


Example 
The following list of cross-references appears at the end of the entry STOCHASTIC HYDRODYNAMICS 


See also: Cauchy Problem for Burgers-Type Equations; Hamiltonian 
Fluid Dynamics; Incompressible Euler Equations: Mathematical Theory; 
Malliavin Calculus; Non-Newtonian Fluids; Partial Differential Equations: 
Some Examples; Stochastic Differential Equations; Turbulence Theories; 
Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics 


Here you will find examples of all three functions of the cross-reference list: a topic discussed in greater detail elsewhere 
(e.g. Incompressible Euler Equations: Mathematical Theory), parallel discussion in other entries (e.g. Stochastic Differential
Equations) and reference to entries that broaden the discussion (e.g. Turbulence Theories).

The eight Introductory Articles are not cross-referenced from any of the main entries, as it is expected that introductory 
articles will be of general interest. As mentioned above, the Introductory Articles may be found at the start of Volume 1. 


4. Index 


The index will provide you with the volume and page number where the material is located. The index entries 
differentiate between material that is a whole entry, is part of an entry, or is data presented in a figure or table. Detailed 
notes are provided on the opening page of the index. 

A full list of contributors appears at the beginning of each volume.

Our society is often designated as being an “information society.” It could also be defined as an “image society.” This is not only because the image is a powerful and widely used medium of communication, but also because it is an easy, compact, and widespread way to represent the physical world. If we think about it, it is indeed striking to realize just how omnipresent images are in our lives through numerous applications such as medical and satellite imaging, video surveillance, cinema, robotics, etc.

Many approaches have been developed to process 
these digital images, and it is difficult to say which 
one is more natural than the other. Image processing 
has a long history. Maybe the oldest methods come 
from 1D signal processing techniques. They rely on 
filter theory (linear or not), on spectral analysis, or 
on some basic concepts of probability and statistics. 
For an overview, we refer the interested reader to 
the book by Gonzalez and Woods (1992). 

In this article, some recent mathematical concepts 
will be revisited and illustrated by the image 
restoration problem, which is presented below. We 
first discuss stochastic modeling which is widely 
based on Markov random field theory and deals 
directly with digital images. This is followed by a 
discussion of variational approaches where the 
general idea is to define some cost functions in a 
continuous setting. Next we show how the scale 
space theory is connected with partial differential 
equations (PDEs). Finally, we present the wavelet 
theory, which is inherited from signal processing 
and relies on decomposition techniques. 


Introduction 


As in the real world, a digital image is composed of a wide variety of structures. Figure 1 shows different kinds of “textures,” progressive or sharp contours, and fine objects. This gives an idea of the complexity of finding an approach that can cope with the different structures at the same time. It also highlights the discrete nature of images, which will be handled differently depending on the chosen mathematical tools. For instance, PDE-based approaches are written in a continuous setting, referring to analog images, and once the existence and the uniqueness of the solution have been proved, we need to discretize them in order to find a numerical solution. On the contrary, stochastic approaches will directly consider discrete images in the modeling of the cost functions.


The Image Restoration Problem 


It is well known that during formation, transmission, and recording processes images deteriorate. Classically, this degradation is the result of two phenomena. The first one is deterministic and is related to the image acquisition modality or to possible defects of the imaging system (e.g., blur created by an incorrect lens adjustment or by motion). The second phenomenon is random and corresponds to the noise coming from any signal transmission. It can also come from image quantization. It is important to choose a degradation model as close as possible to reality. The random noise is usually modeled by a probability distribution. In many cases, a Gaussian distribution is assumed. However, some applications require more specific ones, like the gamma distribution for radar images (speckle noise) or the Poisson distribution for tomography. Unfortunately, it is usually impossible to identify the kind of noise involved for a given real image.

A commonly used model is the following. Let $u: \Omega \subset \mathbb{R}^2 \to \mathbb{R}$ be an original image describing a real scene, and let f be the observed image of the same scene (i.e., a degradation of u). We assume that

$$f = Au + \eta \qquad [1]$$

where $\eta$ stands for a white additive Gaussian noise and A is a linear operator representing the blur (usually a convolution). Given f, the problem is then to reconstruct u knowing [1]. This problem is ill-posed, and we are able to carry out only an approximation of u. In this article, we will focus on the simplified model of pure denoising:

$$f = u + \eta \qquad [2]$$

Figure 1 Digital image example. The close-ups show (a) low resolution, (b) low contrasts, (c) graduated shadings, (d) sharp transitions, and (e) fine elements.


The Probabilistic Approach 
The Bayesian Framework 


In this section, we show how the problem of pure denoising, that is, recovering u from the equation $f = u + \eta$ knowing only some statistical information on $\eta$, can be solved by using a probabilistic approach. In this context, f, u, and $\eta$ are considered as random variables. The general idea for recovering u is to maximize some prior probability. Most models involve two parts: a prior model of possible restored images u and a data model expressing consistency with the observed data.


• The prior model is given by a probability space $(\Omega_u, p)$, where $\Omega_u$ is the set of all values of u. The model is specified by giving the probability p(u) on all these values.

• The data model is a larger probability space $(\Omega_{u,f}, p)$, where $\Omega_{u,f}$ is the set of all possible values of u and all possible values of the observed image f. This model is completed by giving the conditional probability p(f/u) of any image f given u, resulting in the joint probabilities $p(f,u) = p(f/u)\,p(u)$. Implicitly, we assume that the spaces $\Omega_u$ and $\Omega_{u,f}$ are finite although huge.


The next step is to use a Bayesian approach, introduced in image processing by Besag (1974) and Geman and Geman (1984). The probabilities p(u) and p(f/u) are supposed to be known and, given an observed image f, we seek the image u which maximizes the conditional a posteriori probability p(u/f) (MAP: Maximum A Posteriori). By Bayes' rule, we have

$$p(u/f) = \frac{p(f/u)\,p(u)}{p(f)} \qquad [3]$$


Let us explain the meaning of the different terms in [3]:

• The term p(f/u) expresses the probability, the likelihood, that an image u is realized in f. It also quantifies the lack of total precision of the model and the presence of noise.

• The term p(u) expresses our incomplete a priori information about the ideal image u (it is the probability of the model, i.e., the propensity that u be realized independently of the observation f).

• The term p(f), which is the probability to observe f, is a constant and does not play any role when maximizing the conditional probability p(u/f) with respect to u.


Let us remark that the problem $\max_u p(u/f)$ is equivalent to $\min_u E(u) = -\log p(f/u) - \log p(u)$. So Bayesian models lead to a minimization process.

Then the main question is how to assign these probabilities. The easiest probability to determine is p(f/u). If the images u and f consist of sets of values $u = (u_{i,j})$, $i,j = 1, \ldots, N$ and $f = (f_{i,j})$, $i,j = 1, \ldots, N$, we suppose the conditional independence of $(f_{i,j}/u_{i,j})$ in any pixel:

$$p(f/u) = \prod_{i,j=1}^{N} p(f_{i,j}/u_{i,j})$$

and if the restoration model is of the form $f = u + \eta$, where $\eta$ is a white Gaussian noise with variance $\sigma^2$, then

$$p(f_{i,j}/u_{i,j}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(f_{i,j} - u_{i,j})^2}{2\sigma^2}\right)$$

and

$$p(f/u) = \frac{1}{(2\pi\sigma^2)^{N^2/2}} \exp\left(-\sum_{i,j=1}^{N} \frac{(f_{i,j} - u_{i,j})^2}{2\sigma^2}\right)$$

Therefore, at this stage, the MAP reduces to minimizing

$$E(u) = K_\sigma \|f - u\|^2 - \log p(u) \qquad [4]$$

where $\|\cdot\|$ stands for the Euclidean norm on $\mathbb{R}^{N^2}$ and $K_\sigma$ is a constant. So, it remains now to assign a probability law p(u). To do that, the most common way is to use the theory of Markov random fields (MRFs).
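
The reduction from the Gaussian likelihood to the quadratic data term in [4] can be checked numerically. The following is a minimal sketch, assuming a flattened image stored as a Python list and taking $K_\sigma = 1/(2\sigma^2)$; the pixel values, the candidate images, and the function names are illustrative choices, not part of the original text.

```python
import math

def neg_log_likelihood(f, u, sigma):
    """-log p(f/u) for independent Gaussian noise of variance sigma**2 at each pixel."""
    total = 0.0
    for fi, ui in zip(f, u):
        density = math.exp(-(fi - ui) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
        total -= math.log(density)
    return total

def data_term(f, u, sigma):
    """K_sigma * ||f - u||^2 with the assumed constant K_sigma = 1 / (2 sigma^2)."""
    return sum((fi - ui) ** 2 for fi, ui in zip(f, u)) / (2 * sigma ** 2)

# Hypothetical observed image (flattened) and two hypothetical candidate restorations.
f = [0.9, 1.2, 3.1, 2.8]
sigma = 0.5
candidates = {"flat": [2.0, 2.0, 2.0, 2.0], "step": [1.0, 1.0, 3.0, 3.0]}

# Constant that does not depend on u: (number of pixels / 2) * log(2 pi sigma^2).
const = len(f) * 0.5 * math.log(2 * math.pi * sigma ** 2)

for name, u in candidates.items():
    gap = neg_log_likelihood(f, u, sigma) - data_term(f, u, sigma)
    # The gap is the same for every candidate, so it plays no role in the minimization.
    print(name, gap, const)
```

Because the two quantities differ only by a constant independent of u, minimizing $-\log p(f/u)$ and minimizing $K_\sigma \|f - u\|^2$ select the same image.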


The Theory of Markov Random Fields 


In this approach, an image is described as a finite set S of sites corresponding to the pixels. With each site, we associate a descriptor representing the state of the site, for example, its gray level. In order to take into account local interaction between sites, one needs to endow S with a system of neighborhoods $\mathcal{V}$.


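Definition 1 For each site s, we define its neighborhood $\mathcal{V}(s)$ as

$$\mathcal{V}(s) = \{t\} \ \text{such that} \ s \notin \mathcal{V}(s) \ \text{and} \ t \in \mathcal{V}(s) \Rightarrow s \in \mathcal{V}(t)$$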


Then we associate to this neighborhood system the notion of clique: a clique is either a singleton or a set of sites which are all neighbors of each other. Depending on the neighborhood system, the family of cliques will be different and involve more or fewer sites. We will denote by C the set of all the cliques relative to a neighborhood system $\mathcal{V}$ (see Figure 2).
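
To make the notion of clique concrete, the sketch below enumerates by brute force all cliques of a tiny pixel grid for two neighborhood systems. It is only an illustration: the function names, the grid size, and the 4- and 8-neighbor systems are assumptions chosen for the example.

```python
from itertools import combinations

def neighbors4(site, height, width):
    """4-neighborhood: pixels sharing an edge with `site`."""
    i, j = site
    cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return {(a, b) for a, b in cand if 0 <= a < height and 0 <= b < width}

def neighbors8(site, height, width):
    """8-neighborhood: pixels sharing an edge or a corner with `site`."""
    i, j = site
    cand = [(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]
    return {(a, b) for a, b in cand if 0 <= a < height and 0 <= b < width}

def cliques(height, width, neighborhood, max_order=4):
    """Return all singletons and all sets of mutually neighboring sites, up to max_order sites."""
    sites = [(i, j) for i in range(height) for j in range(width)]
    found = []
    for order in range(1, max_order + 1):
        for subset in combinations(sites, order):
            # A singleton has no pairs, so it is always accepted.
            if all(t in neighborhood(s, height, width) for s, t in combinations(subset, 2)):
                found.append(subset)
    return found

for name, system in (("4-neighbors", neighbors4), ("8-neighbors", neighbors8)):
    by_order = {}
    for c in cliques(2, 2, system):
        by_order[len(c)] = by_order.get(len(c), 0) + 1
    print(name, by_order)
```

On a grid this small the exhaustive search is cheap; it is meant to show how a richer neighborhood system produces cliques of higher order, not to be an efficient enumeration.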

Before introducing the general framework of MRFs, let us define some notations. For a site s, $X_s$ will stand for a random variable taking its values in some set E (e.g., $E = \{0, 1, \ldots, 255\}$), $x_s$ will be a realization of $X_s$, and $x^s = (x_t)_{t \neq s}$ will denote an image configuration where site s has been removed. Finally, we will denote by X the random variable $X = (X_s, X_t, \ldots)$ with values in $\Omega = E^{|S|}$.


Definition 2 We say that X is an MRF if the local conditional probability at a site s is only a function of $\mathcal{V}(s)$, that is,

$$p(X_s = x_s / X^s = x^s) = p(X_s = x_s / X_t = x_t,\ t \in \mathcal{V}(s))$$

Therefore, the gray level at a site depends only on gray levels of neighboring pixels. Now we give the following fundamental theorem due to Hammersley and Clifford (Besag 1974), which states the equivalence between MRFs and Gibbs fields.


Theorem 1 Let us suppose that S is finite, E is a discrete set and for all $x \in \Omega = E^{|S|}$, $p(X = x) > 0$; then X is an MRF relative to a system of neighborhoods $\mathcal{V}$ if and only if there exists a family of potential functions $(V_c)_{c \in C}$ such that

$$p(x) = \frac{1}{Z} \exp\left(-\sum_{c \in C} V_c(x)\right)$$

The function $V(x) = \sum_{c \in C} V_c(x)$ is called the energy potential or the Gibbs measure and Z is a normalizing constant: $Z = \sum_{x \in \Omega} \exp(-V(x))$.



Figure 2 Examples of neighborhood system and cliques. 

If, for example, the collection of neighborhoods is the set of 4-neighbors, then the theorem says that

$$V(x) = \sum_{c = \{s\} \in C_1} V_c(x_s) + \sum_{c = \{s,t\} \in C_2} V_c(x_s, x_t)$$


Application to the Denoising Problem 


Now, given this theorem, we can reformulate, thanks to [4], the restoration problem (with the change of notation $u = x$ and $u_s = x_s$): find u minimizing the global energy

$$E(u) = K_\sigma \|f - u\|^2 + V(u) \qquad [5]$$

The next step is now to specify the Gibbs measure. In restoration, the potential V(u) is often designed to impose local regularity constraints, for example, by penalizing differences between neighbors. This can be modeled using cliques of order 2 in the following manner:

$$V(u) = \sum_{(s,t) \in C_2} \phi(u_s - u_t)$$


where $\phi$ is a given real function. This term penalizes the difference of intensities between neighbors, which may come from an edge or some noise. This discrete cost function is very similar to the gradient penalty terms in the continuous framework (see the next section). The resulting final energy is (sometimes E(u) is written E(u/f))

$$E(u) = \sum_{s \in S} (f_s - u_s)^2 + \beta \sum_{(s,t) \in C_2} \phi(u_s - u_t)$$

where the constant $\beta$ is a weighting parameter which can be estimated.
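
As a minimal sketch of how this final energy can be evaluated, the code below computes the data term and the order-2 clique term on a small grid, assuming a 4-neighbor system and images stored as lists of rows. The sample values, the function name `energy`, and the choice $\phi(z) = |z|$ are hypothetical.

```python
def energy(u, f, beta, phi):
    """Data term sum_s (f_s - u_s)^2 plus beta * sum over order-2 cliques of phi(u_s - u_t).

    Assumes a 4-neighbor system, so each clique of order 2 is a pair of
    horizontally or vertically adjacent pixels, counted once.
    """
    h, w = len(u), len(u[0])
    data = sum((f[i][j] - u[i][j]) ** 2 for i in range(h) for j in range(w))
    reg = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:                      # vertical clique {(i, j), (i+1, j)}
                reg += phi(u[i][j] - u[i + 1][j])
            if j + 1 < w:                      # horizontal clique {(i, j), (i, j+1)}
                reg += phi(u[i][j] - u[i][j + 1])
    return data + beta * reg

# Hypothetical observation: a vertical edge between gray levels 0 and 4, with one perturbed pixel.
f = [[0.0, 0.0, 4.0],
     [0.0, 1.0, 4.0]]
# Hypothetical candidate restoration: the same edge without the perturbed pixel.
u_clean = [[0.0, 0.0, 4.0],
           [0.0, 0.0, 4.0]]

beta = 2.0
print("E(f)       =", energy(f, f, beta, abs))
print("E(u_clean) =", energy(u_clean, f, beta, abs))
```

With this value of $\beta$ the candidate without the perturbed pixel has the lower energy: the regularity gained outweighs the loss of fidelity to f. A smaller $\beta$ would reverse the ranking, which is why the weighting parameter matters.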

The difficulty in choosing the strength of the penalty term defined by $\phi$ is to be able to penalize the noise while keeping the most salient features, that is, edges. Historically, the function $\phi$ was first chosen as $\phi(z) = z^2$, but this choice is not good since the resulting regularization is too strong, introducing a blur in the image and loss of the edges. A better choice is $\phi(z) = |z|$ (Rudin et al. 1992) or a regularized version of this function. Of course, other choices are possible depending on the considered application and the desired degree of smoothness.
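
The effect of the two penalties can be seen on a one-dimensional example. The sketch below compares the regularity term for a sharp edge and for a blurred ramp with the same total rise; the two signals and the helper name `regularity` are illustrative assumptions.

```python
def regularity(u, phi):
    """Sum of phi over differences between consecutive samples (order-2 cliques in 1D)."""
    return sum(phi(u[k + 1] - u[k]) for k in range(len(u) - 1))

edge = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # sharp transition from 0 to 1
ramp = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]   # the same rise spread over five small steps

def quadratic(z):
    return z * z

for name, phi in (("phi(z) = z^2", quadratic), ("phi(z) = |z|", abs)):
    print(name, "edge:", regularity(edge, phi), "ramp:", regularity(ramp, phi))
```

The quadratic penalty charges the sharp edge about five times more than the ramp, so minimization favors spreading the jump out, which is the blur mentioned above. The absolute value charges both the same, so it does not penalize the edge for being sharp.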

In this section, it has been shown how to model the restoration problem through MRFs and the Bayesian framework. Numerically, two main types of algorithms can be used to minimize the energy: deterministic algorithms and stochastic algorithms. The former are generally used when the global energy is strictly convex (e.g., algorithms based on gradient descent). The latter are used when E(u) is not convex; they are stochastic minimization algorithms mainly based on simulated annealing. Their main advantage is that they always converge (almost surely) to a global minimizer, whereas deterministic algorithms give only local minimizers in that case, but they are often very time-consuming.
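
For the strictly convex case, a deterministic scheme can be sketched in a few lines. The example below applies plain gradient descent to a one-dimensional version of the energy with the quadratic choice $\phi(z) = z^2$; the signal, the value of $\beta$, the step size, and the iteration count are assumptions made for illustration, and this is not presented as the algorithm used by any particular author.

```python
def energy(u, f, beta):
    """E(u) = sum_s (f_s - u_s)^2 + beta * sum_s (u_s - u_{s+1})^2 on a 1D signal."""
    data = sum((fs - us) ** 2 for fs, us in zip(f, u))
    reg = sum((u[k] - u[k + 1]) ** 2 for k in range(len(u) - 1))
    return data + beta * reg

def gradient(u, f, beta):
    """Partial derivatives of E with respect to each sample u_k."""
    n = len(u)
    grad = []
    for k in range(n):
        g = 2.0 * (u[k] - f[k])
        if k > 0:
            g += 2.0 * beta * (u[k] - u[k - 1])
        if k < n - 1:
            g += 2.0 * beta * (u[k] - u[k + 1])
        grad.append(g)
    return grad

def gradient_descent(f, beta, step, iterations):
    """Start from the observation f and follow the negative gradient with a fixed step."""
    u = list(f)
    for _ in range(iterations):
        g = gradient(u, f, beta)
        u = [uk - step * gk for uk, gk in zip(u, g)]
    return u

f = [0.1, -0.2, 0.05, 1.1, 0.9, 1.05]   # hypothetical noisy step signal
beta = 1.0
step = 1.0 / (2.0 + 8.0 * beta)          # fixed step below the stability limit for this energy
u = gradient_descent(f, beta, step, 500)

print("initial energy:", energy(f, f, beta))
print("final energy:  ", energy(u, f, beta))
```

Because this energy is strictly convex, the iteration approaches its unique minimizer from any starting point. With a nonconvex $\phi$ the same scheme could stop at a local minimizer, which is the situation where simulated annealing is used instead.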

We refer the reader to Li (1995) for more details about MRFs and the Bayesian framework, and to Kirkpatrick et al. (1983) for more information on stochastic algorithms.


The Variational Approach 


Minimizing a Cost Function over a 
Functional Space 


One important issue in the previous section was the definition of p(u), which encodes some a priori information about the solution. In the variational approach, this idea is also present, but the way to express it is to define the most suitable functional space for describing images and their geometrical properties. The choice of a functional space sets a norm, which in turn will constrain the solution to a certain smoothness.

We illustrate this idea in this section on the denoising problem [2], which can be seen as a decomposition problem. This means that given the observation f, we look for u and $\eta$ such that $f = u + \eta$, where $\eta$ incorporates all oscillations, that is, noise, and also texture. Let us define a functional to be minimized which takes into account the data f and possibly some statistical information about $\eta$:

$$\min_{(u,\eta)} \{ \phi(|u|_E) \ \text{such that} \ \psi(|\eta|_G) = \sigma \ \text{with} \ f = u + \eta \} \qquad [6]$$


This formulation means that we look, among, all 
decompositions f =u + n, for the one which mini- 
mizes ó(|4|;) under the constraint w(|n|;)-—o. 
Banach spaces E and G, and functions @ and v 
will be discussed in the next subsection. Since a 
minimization problem under constraints can be 
expressed with an additional term weighted by a 


Lagrange multiplier, the formulation [6] can be 
rewritten as: 


$$\min_{(u,\eta)} \left\{ \phi(|u|_E) + \lambda\,\psi(|\eta|_G) : f = u + \eta \right\} \qquad [7]$$


Equivalently, replacing $\eta$ by $f - u$, [7] can be rewritten as

$$\min_{u} \left( \phi(|u|_E) + \lambda\,\psi(|f - u|_G) \right) \qquad [8]$$


which is the classical formulation in image restoration.
From a numerical point of view, the minimization
is usually carried out by solving the associated
Euler equations, but this may be a difficult task. The
main concern is the choice of E and G and their
norms (or seminorms). It is guided by the idea that
an image u is composed of various geometric
structures (homogeneous regions, edges), while
$\eta = f - u$ represents oscillations (noise and textures).


Examples of Functional Spaces 


In this section, we revisit some possible choices of 
functional spaces summarized in Table 1. 

The first case (a) was inspired by the classical
Tikhonov regularization. The functional space
$H^1(\Omega)$ ($\Omega \subset \mathbb{R}^2$) is the space of functions in $L^2(\Omega)$
such that the distributional gradient Du is in $L^2(\Omega)$.
Unfortunately, functions in $H^1(\Omega)$ do not admit
discontinuities across curves, and this is a major
problem with respect to image analysis since images
are made of smooth patches separated by sharp
variations.
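
The smoothing effect of the $H^1$ choice can be observed on a one-dimensional discrete analog of [8]. The sketch below is an illustration under stated assumptions, not the numerical method of the references: the energy is the hypothetical discretization $\sum_i (u_{i+1}-u_i)^2 + \lambda \sum_i (f_i-u_i)^2$, and it is minimized by plain gradient descent with an assumed step size. Applied to a noise-free step, the minimizer no longer contains the unit jump.

```python
def tikhonov_denoise(f, lam=0.1, step=0.2, iters=2000):
    """Gradient descent on E(u) = sum_i (u[i+1]-u[i])**2 + lam * sum_i (f[i]-u[i])**2.

    The discretization, lam, step and iters are illustrative assumptions.
    """
    u = list(f)
    n = len(u)
    for _ in range(iters):
        g = [0.0] * n
        for i in range(n):
            g[i] = 2.0 * lam * (u[i] - f[i])          # derivative of the data term
            if i > 0:
                g[i] += 2.0 * (u[i] - u[i - 1])       # derivative of (u[i]-u[i-1])**2
            if i < n - 1:
                g[i] -= 2.0 * (u[i + 1] - u[i])       # derivative of (u[i+1]-u[i])**2
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u

if __name__ == "__main__":
    f = [0.0] * 10 + [1.0] * 10      # ideal step edge, no noise
    u = tikhonov_denoise(f)
    print("jump in f:", f[10] - f[9])
    print("jump in u:", u[10] - u[9])
```

The edge is spread over several samples even though the data contained no noise at all, which is the behavior criticized above.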

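Considering the problem reported in (a), Rudin et al.
(1992) proposed to work on $BV(\Omega)$, the space of
functions of bounded variation (BV) (Ambrosio et al. 2000),
defined by

$$BV(\Omega) = \left\{ u \in L^1(\Omega);\ \int_\Omega |Du| < \infty \right\}$$

with

$$\int_\Omega |Du| = \sup \left\{ \int_\Omega u\,\mathrm{div}\,\varphi\,dx;\ \varphi = (\varphi_1, \varphi_2, \ldots, \varphi_N) \in C_0^1(\Omega)^N,\ |\varphi|_{L^\infty(\Omega)} \le 1 \right\} \qquad [9]$$

Table 1 Examples of functional spaces and their norm (see model [8])

Model   E and $|u|_E$                                                     $\phi(t)$   G and $|\eta|_G$                    $\psi(t)$
(a)     $H^1(\Omega)$, $|u|_E = (\int_\Omega |\nabla u|^2\,dx)^{1/2}$     $t^2$       $L^2(\Omega)$ with its usual norm    $t^2$
(b)     $BV(\Omega)$, $|u|_E = \int_\Omega |Du|$                          $t$         $L^2(\Omega)$ with its usual norm    $t^2$
(c)     $BV(\Omega)$, $|u|_E = \int_\Omega |Du|$                          $t$         $G$ as defined below                 $t$

For model (c), $G = \{ v \in L^2(\Omega);\ v = \mathrm{div}\,\xi,\ \xi \in L^\infty(\Omega)^2,\ \xi \cdot N|_{\partial\Omega} = 0 \}$.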


It is equivalent to define $BV(\Omega)$ as the space of
$L^1(\Omega)$ functions whose distributional gradient Du is
a bounded measure, and [9] is its total variation. The
space $BV(\Omega)$ has some interesting properties:

1. lower semicontinuity of the total variation
$\int_\Omega |Du|$ with respect to the $L^1(\Omega)$ topology,

2. if $u \in BV(\Omega)$, we can define, for $\mathcal{H}^1$ almost
every $x \in S_u$, the complement of Lebesgue
points (i.e., the jump set of u), a normal $n_u(x)$
and two approximate "right" and "left" limits
$u^+(x)$ and $u^-(x)$, and

3. Du can be decomposed as a sum of a regular
measure, a jump measure, and a Cantor measure:

$$Du = \nabla u\,dx + (u^+ - u^-)\,n_u\,\mathcal{H}^1_{|S_u} + C_u,$$

where $\nabla u$ is the approximate gradient and $\mathcal{H}^1$ the
one-dimensional Hausdorff measure.


This ability to describe functions with discontinuities
across a hypersurface $S_u$ makes $BV(\Omega)$ very
convenient to describe images with edges. In this
context, the image restoration problem is well
posed and suitable numerical tools can be proposed
(Chambolle and Lions 1997).
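
The difference between the two regularity classes can be made concrete on sampled data. The sketch below is only an illustration, with hypothetical helper names and a simple finite-difference discretization: it evaluates a discrete total variation and a discrete squared $H^1$ seminorm of a unit step sampled on finer and finer grids. The total variation stays equal to the height of the jump, whereas the $H^1$ quantity grows like the number of samples, which reflects the fact that a step belongs to BV but not to $H^1$.

```python
def discrete_tv(u):
    # Discrete total variation: sum of |u[i+1] - u[i]|.
    return sum(abs(b - a) for a, b in zip(u, u[1:]))

def discrete_h1_sq(u, h):
    # Discrete squared H^1 seminorm: sum of ((u[i+1]-u[i]) / h)**2 * h.
    return sum((b - a) ** 2 for a, b in zip(u, u[1:])) / h

def step_samples(n):
    # n samples of a unit step on [0, 1) with the jump at x = 1/2.
    return [0.0 if i < n // 2 else 1.0 for i in range(n)]

if __name__ == "__main__":
    for n in (10, 100, 1000):
        u = step_samples(n)
        print(n, discrete_tv(u), discrete_h1_sq(u, 1.0 / n))
```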

One criticism of the model (b) in Table 1, pointed
out by Meyer (2001), is that if f is a characteristic
function and if f is sufficiently small with respect to
a suitable norm, then the model (Rudin et al. 1992)
gives $u = 0$ and $\eta = f$, contrary to what one should
expect ($u = f$ and $\eta = 0$). The main reason for
this phenomenon is that the $L^2$-norm for the $\eta$
component is not the right one, since very oscillating
functions can have a large $L^2$-norm (e.g.,
$f_n(x) = \cos(nx)$). To better describe such oscillating
functions, Meyer (2001) introduced the space of
functions which can be expressed as a divergence
of $L^\infty$-fields. This work was developed in $\mathbb{R}^N$, and
this framework was adapted to bounded 2D
domains by Aubert and Aujol (2005) (see (c) in
Table 1). An example of image decomposition is
shown in Figure 3.

Figure 3 Example of image decomposition (see Aubert and
Aujol (2005)). Panels: original, $u$, $\eta$.

In this section, we have shown how the choice of
the functional spaces is closely related to the
definition of a variational formulation. The
functionals are written in a continuous setting and
they can usually be minimized by solving the
discretized Euler equations iteratively, until
convergence. These PDEs and the differential operators are
constrained by the energy definition, but it is also
possible to work directly on the equations,
forgetting the formal link with the energy. Such an
approach has also been much developed in the
computer vision community and it is illustrated in
the next section.

We refer the reader to Aubert and Kornprobst 
(2002) for a general review of variational 
approaches and PDEs as applied to image analysis. 


Scale Spaces and PDEs 


Another approach to nonlinear filtering
is to define a family of image smoothing operators
$T_t$, depending on a scale parameter t. Given an
image f(x), we can define the image $u(t,x) = (T_t f)(x)$,
which corresponds to the image f analyzed at scale t.
In this section, following Alvarez, Guichard, Lions, and
Morel (Alvarez et al. 1993), we show that u(t,x)
is the solution of a PDE provided that $T_t$ satisfies
some suitable assumptions.


Basic Principles of a Scale Space 


This section describes some natural assumptions to
be fulfilled by scale spaces. We first assume that the
output at scale t can be computed from the output at
a scale $t - h$ for very small h. This is natural, since a
coarser scale view of the original picture is likely to
be deduced from a finer one. $T_t$ is obtained by
composition of transition filters, denoted by $T_{t+h,t}$.
So the first axiom is

(A1) $T_{t+h} = T_{t+h,t}\,T_t$, $\quad T_0 = \mathrm{Id}$


Another assumption is that operators act locally,
that is, $(T_t f)(x)$ depends essentially upon the
values of f(y) with y in a small neighborhood of x.
Taking into account the fact that as the scale
increases, no new feature should be created by the
scale space, we have the local comparison principle:
if an image u is locally brighter than another image
v, then this order must be conserved by the analysis.
This is expressed by:

(A2) For all u and v such that $u(y) > v(y)$ in a
neighborhood of x and $y \neq x$, then for h small
enough, we have

$$(T_{t+h,t}\,u)(x) \ge (T_{t+h,t}\,v)(x)$$


The third assumption states that a very smooth
image must evolve in a smooth way with the scale
space. Denoting the scalar product of two vectors of
$\mathbb{R}^N$ by $\langle x, y \rangle$, this assumption can be written as

(A3) Let $u(y) = \frac{1}{2}\langle A(y - x), y - x \rangle + \langle p, y - x \rangle + c$
be a quadratic form of $\mathbb{R}^2$, x fixed
($A = \nabla^2 u(x) \in S^2$, the set of $2 \times 2$ symmetric
matrices, $p = \nabla u(x)$ a vector of $\mathbb{R}^2$, $c = u(x)$ a
constant). We shall say that a scale space is
regular if there exists a function F(t, x, c, p, A),
continuous with respect to A, such that

$$\frac{(T_{t+h,t}\,u - u)(x)}{h} \to F(t,x,c,p,A) \quad \text{when } h \to 0$$


Scale Spaces are Governed by PDEs 


The following theorem states that the former
assumptions are sufficient to prove that scale spaces
are in fact governed by PDEs.

Theorem 2 Under assumptions A1, A2, A3, there
exists a continuous function $F: [0,T] \times \Omega \times \mathbb{R} \times \mathbb{R}^2 \times S^2 \to \mathbb{R}$
satisfying $F(t,x,c,p,A) \ge F(t,x,c,p,B)$
for all $p \in \mathbb{R}^2$, A and B in $S^2$ with $A \ge B$, such that

$$\delta_t(u) = \lim_{h \to 0^+} \frac{T_{t+h,t}\,u - u}{h} = F(t,x,u,\nabla u,\nabla^2 u) \qquad [10]$$

uniformly for $x \in \mathbb{R}^2$, uniformly for u.


In eqn [10], the left-hand side can be
interpreted as the partial derivative with
respect to t, so that the notion of PDEs arises. More
precisely, if f is continuous and uniformly bounded,
then it can be established that $u(t,x) = (T_t f)(x)$ is the
viscosity solution (see Definition 3) of

$$\frac{\partial u}{\partial t} + H(t,x,u,\nabla u,\nabla^2 u) = 0 \quad (\text{here } H = -F), \qquad u(0,x) = f(x) \qquad [11]$$

The map $H: [0,T] \times \Omega \times \mathbb{R} \times \mathbb{R}^2 \times S^2 \to \mathbb{R}$ is called
a Hamiltonian, and the property that H is nonincreasing
with respect to its last (matrix) argument is called
degenerate ellipticity.

The theory of viscosity solutions was introduced
in the 1980s by Crandall and P L Lions (Crandall
and Lions 1981, Crandall et al. 1992). When strong
solutions of [11] do not exist, this theory makes it
possible to define solutions which are only continuous or
even discontinuous. Viscosity solutions are defined
as follows.


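Definition 3 Let $H: [0,T] \times \Omega \times \mathbb{R} \times \mathbb{R}^2 \times S^2 \to \mathbb{R}$ be
continuous and degenerate elliptic and let $u \in C^0([0,T] \times \Omega)$.
Then u is a viscosity solution of [11]
in $[0,T] \times \Omega$ if and only if

(i) u is a subsolution, that is, $\forall \phi \in C^2([0,T] \times \Omega)$,
$\forall (t_0,x_0)$ a local strict maximum point of $(u - \phi)(t,x)$,
we have

$$\frac{\partial \phi}{\partial t}(t_0,x_0) + H(t_0,x_0,u(t_0,x_0),\nabla\phi(t_0,x_0),\nabla^2\phi(t_0,x_0)) \le 0$$

(ii) u is a supersolution, that is, $\forall \phi \in C^2([0,T] \times \Omega)$,
$\forall (t_0,x_0)$ a local strict minimum point of $(u - \phi)(t,x)$,
we have

$$\frac{\partial \phi}{\partial t}(t_0,x_0) + H(t_0,x_0,u(t_0,x_0),\nabla\phi(t_0,x_0),\nabla^2\phi(t_0,x_0)) \ge 0$$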


In this definition, note that the derivatives of
u are replaced by the derivatives of the test functions
$\phi$. It can be verified that this notion of
weak solution coincides with that of classical solution
when u has enough regularity.


Diffusion Operators Coming from the Scale Space 


A step further is to assume additional properties on
the scale spaces and estimate the corresponding
operator. Invariance properties include geometric
invariance axioms, contrast invariance, or scale
invariance. For example, if we assume the axioms
A1-A3, gray-level shift invariance:

(I1) $T_t(0) = 0$, $T_t(u + c) = T_t(u) + c$ for all u and all
constants c,

and translation invariance:

(I2) $T_t(\tau_h.u) = \tau_h.(T_t u)$ for all h in $\mathbb{R}^2$, $t > 0$, where
$(\tau_h.u)(x) = u(x + h)$,

then it can be established that F in [10] is
independent of (x,u), that is, $u(t,x) = (T_t f)(x)$ is
the unique viscosity solution of

$$\frac{\partial u}{\partial t} = F(\nabla u, \nabla^2 u), \qquad u(0,x) = f(x)$$


With more precise assumptions, one can even
recover the operator F explicitly. As an example, if
we look for a linear scale space which satisfies the
isometry assumption

(I3) $T_t(R.u)(x) = R.(T_t u)(x)$ for all orthogonal
transformations R on $\mathbb{R}^2$, where $(R.u)(x) = u(Rx)$,

then it can be proved that the scale space is the
unique solution of the heat equation:

$$\frac{\partial u}{\partial t} - \Delta u = 0, \qquad u(0,x) = f(x) \qquad [12]$$
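
As an illustration of how such a linear scale space is computed in practice, the heat equation [12] can be discretized with an explicit finite-difference scheme, each iteration corresponding to one time step. The sketch below is a one-dimensional analog written for brevity (images would use the 2D five-point Laplacian); the unit grid spacing, the time step `dt`, and the reflecting boundary treatment are assumptions made for this example.

```python
def heat_step(u, dt=0.2):
    """One explicit Euler step of u_t = u_xx on a unit grid.

    Reflecting boundaries; the scheme is stable for dt <= 0.5 in 1D.
    """
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        out[i] = u[i] + dt * (left - 2.0 * u[i] + right)
    return out

def heat_scale_space(f, iterations, dt=0.2):
    # u at scale t = iterations * dt, starting from the signal f.
    u = list(f)
    for _ in range(iterations):
        u = heat_step(u, dt)
    return u

def max_jump(u):
    # Largest difference between neighboring samples (a crude edge-strength measure).
    return max(abs(b - a) for a, b in zip(u, u[1:]))

if __name__ == "__main__":
    f = [0.0] * 10 + [1.0] * 10
    for it in (0, 40, 90):
        print(it, max_jump(heat_scale_space(f, it)))
```

The mean gray level is preserved by this scheme, while the strength of the edge decreases steadily with the number of iterations.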


Figure 4 is an example of [12] applied to a noisy
image at different scales, that is, at different times.
Note that noise is quickly removed, but one has to
stop the evolution very early in order to
preserve some edges. In the nonlinear case, several
operators based on curvature have also been found.
For instance, under suitable axioms (Alvarez et al.
1993), including contrast, scale, and affine
invariance, the associated scale space is
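$$\frac{\partial u}{\partial t} - \operatorname{sign}(\kappa)\,(t|\kappa|)^{1/3}\,|\nabla u| = 0, \quad \text{where } \kappa = \mathrm{div}\left(\frac{\nabla u}{|\nabla u|}\right), \qquad u(0,x) = f(x) \qquad [13]$$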


This equation is called affine morphological scale 
space (AMSS) and three restored images are shown 
in Figure 5. Some qualitative differences are shown 
in Figure 6. 


Figure 4 Illustration of heat equation [12]. Panel titles: original image, 40 iterations, 90 iterations.

Figure 5 Illustration of the AMSS model [13]. Recoverable panel titles: 90 iterations, 150 iterations.

Figure 6 Some close-ups of Figures 4 and 5 showing
qualitative differences after 40 iterations. Panel titles: original image, 40 iterations (heat and AMSS).

Remark Scale space theory has shown the formal
link between some operators and PDEs. Note that
one may also propose PDEs which do
not come directly from the scale space framework.
Starting from [12], which performs isotropic
smoothing and smears edges, many nonlinear diffusion
models have been proposed to smooth images while
preserving edges (see, e.g., Perona and Malik
(1990)).


To know more on scale space and PDEs, we refer 
the reader to Weickert (1998) and Aubert and 
Kornprobst (2002). 


The Wavelet Approach 


Before the 1980s, the Fourier transform played a
major role in the analysis of oscillating signals. The
interest of such a transform for real applications
increased after the discovery of the fast Fourier
transform. However, the Fourier transform has
some limitations. It extracts from
the signal the details of its frequency content but loses
all information on the location of a particular
frequency. Moreover, for computing the Fourier
transform $\mathcal{F}f(\lambda)$, we need to know f(t) for all real
values of t. These difficulties can be overcome by
first windowing the signal, and then by taking its
Fourier transform:

$$\mathcal{F}^{\mathrm{win}} f(\lambda, t) = \int f(s)\,g(s - t)\,e^{-i\lambda s}\,ds$$

where g is a window function. The parameter $\lambda$
plays the role of a frequency localized around the
abscissa t of the temporal signal, and $\mathcal{F}^{\mathrm{win}} f(\lambda, t)$ gives
information about what is happening around
$s = t$ for the frequency $\lambda$. The main drawback of
this method is that the window has a fixed length,
which is a serious disadvantage when we want to
treat signals having variations of different orders of
magnitude. All these issues showed that a
mathematical theory of time-frequency representation
was necessary. This was achieved with the
wavelet representation. In this section, we first recall
some elements of this theory (for 1D signals) and
then we show how it can be applied to the restoration
of noisy images.


The Wavelet Decomposition 


The basic idea is to construct from a function $\psi$,
called the mother wavelet, an orthonormal basis $\{\psi_{j,k}\}$ of
$L^2(\mathbb{R})$ deduced from $\psi$ by translation and dilation.
It is required that $\psi$ be regular, oscillating (but not
too much), that $\psi$ and $\mathcal{F}\psi$ be well localized, and that
$\psi$ have some null moments. Once this function $\psi$ is
chosen, we set $\psi_{j,k}(t) = 2^{j/2}\psi(2^j t - k)$, $j,k \in \mathbb{Z}$. An
elegant and practical way of obtaining such a basis is
to construct a multiresolution analysis of $L^2(\mathbb{R})$
(Mallat 1989).


Definition 4 A multiresolution analysis of $L^2(\mathbb{R})$ is
a sequence $V_j$, $j \in \mathbb{Z}$, of subspaces of $L^2(\mathbb{R})$, with the
following properties:

(i) $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$,

(ii) $V_j \subset V_{j+1}$,

(iii) $\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R})$,

(iv) $f(t) \in V_j$ if and only if $f(2t) \in V_{j+1}$, and

(v) There exists a regular function $\phi$ with compact
support such that the family $\phi(t - k)$, $k \in \mathbb{Z}$, is
an orthonormal basis of $V_0$ for the scalar
product of $L^2(\mathbb{R})$. Such a function $\phi$ is called a
scaling function.

Then it is straightforward to check that the family
$\phi_{j,k}(t)$ defined by $\phi_{j,k}(t) = 2^{j/2}\phi(2^j t - k)$ is an
orthonormal basis of $V_j$.

A basic example of multiresolution analysis of
$L^2(\mathbb{R})$ is to choose $V_0$ as the set of functions of
$L^2(\mathbb{R})$ that are constant on each interval $[k, k+1)$,
$k \in \mathbb{Z}$, and to take $\phi$ as the
characteristic function of the interval [0, 1):
$\phi(t) = \chi_{[0,1)}(t)$.

Let us now look at the link between wavelet bases
and multiresolution analysis. We only give the main
ideas; all details can be found in the work of Mallat
(1989). Assume that we have a multiresolution
analysis, and let us define $W_0$ as the orthogonal
complement of $V_0$ in $V_1$. We build the mother
wavelet $\psi$ by imposing that the family $\psi(t - k)$,
$k \in \mathbb{Z}$, is an orthonormal basis of $W_0$. For example,
if $\phi(t) = \chi_{[0,1)}(t)$, it can be shown that
$\psi(t) = \chi_{[0,1/2)}(t) - \chi_{[1/2,1)}(t)$ (called the Haar wavelet). By
change of scale, one gets that the family
$\psi_{j,k}(t) = 2^{j/2}\psi(2^j t - k)$, $k \in \mathbb{Z}$, is an orthonormal
basis of $W_j$, the orthogonal complement of $V_j$ in
$V_{j+1}$, that is,

$$V_j \oplus W_j = V_{j+1} \qquad [14]$$

Since the $V_j$ form a multiresolution analysis, we have
$V_j = \bigoplus_{l < j} W_l$ and $L^2(\mathbb{R}) = \bigoplus_{j=-\infty}^{+\infty} W_j$. It is then clear
that $(\psi_{j,k})_{j,k \in \mathbb{Z}}$ is an orthonormal basis of $L^2(\mathbb{R})$, that is,
for each function $f \in L^2(\mathbb{R})$, we get the following
decomposition:

$$f = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} f_{j,k}\,\psi_{j,k} \quad \text{with} \quad f_{j,k} = \langle f, \psi_{j,k} \rangle_{L^2}$$

Let us see now how a multiresolution
analysis can be interpreted in practice. Let f be a function in
$L^2(\mathbb{R})$. We denote by $A_{2^j} f$ (resp. $D_{2^j} f$) the operator
which approximates f (resp. gives the details of f) at
resolution $2^j$. More precisely, $A_{2^j} f$ (resp. $D_{2^j} f$) is the
projection of f on $V_j$ (resp. on $W_j$):


$$A_{2^j} f(t) = \sum_{k=-\infty}^{+\infty} \langle f, \phi_{j,k} \rangle\,\phi_{j,k}(t)$$

$A_{2^j} f$ is characterized by the sequence of scalar
products $A^d_{2^j} f = (\langle f, \phi_{j,k} \rangle)_{k \in \mathbb{Z}}$. We call $A^d_{2^j} f$ the
discrete approximation of f at resolution $2^j$.

In the same way, we have

$$D_{2^j} f(t) = \sum_{k=-\infty}^{+\infty} \langle f, \psi_{j,k} \rangle\,\psi_{j,k}(t)$$

$D_{2^j} f$ is characterized by the sequence of scalar
products $D^d_{2^j} f = (\langle f, \psi_{j,k} \rangle)_{k \in \mathbb{Z}}$.

We call $D^d_{2^j} f$ the details of f at resolution $2^j$.
According to [14], approximation and detail are
linked by the relation

$$A_{2^{j+1}} f = A_{2^j} f + D_{2^j} f$$

This means that $D_{2^j} f$ represents the details to be
added to go from one level of approximation to
the next level of approximation.
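
For the Haar system, the relation between approximation and detail becomes a two-line computation on coefficient sequences, because $\phi_{j,k} = (\phi_{j+1,2k} + \phi_{j+1,2k+1})/\sqrt{2}$ and $\psi_{j,k} = (\phi_{j+1,2k} - \phi_{j+1,2k+1})/\sqrt{2}$. The sketch below is a minimal illustration with hypothetical function names: it splits a finite sequence of fine-scale approximation coefficients into coarse approximation and detail coefficients, then recombines them.

```python
import math

def haar_analysis(a):
    """Split fine coefficients a (even length) into coarse approximation and details."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (a[2 * k] + a[2 * k + 1]) for k in range(len(a) // 2)]
    detail = [s * (a[2 * k] - a[2 * k + 1]) for k in range(len(a) // 2)]
    return approx, detail

def haar_synthesis(approx, detail):
    """Inverse step: recover the fine coefficients from approximation and details."""
    s = 1.0 / math.sqrt(2.0)
    a = []
    for c, d in zip(approx, detail):
        a.append(s * (c + d))
        a.append(s * (c - d))
    return a

if __name__ == "__main__":
    fine = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]
    coarse, details = haar_analysis(fine)
    print("approximation:", coarse)
    print("details:", details)
    print("reconstruction:", haar_synthesis(coarse, details))
```

Applying `haar_analysis` repeatedly to the approximation part yields the coefficients at coarser and coarser resolutions; since the transform is orthonormal, the sum of squares of the coefficients is preserved at each step.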

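Finally, the decomposition of a signal f on a
wavelet basis is obtained as an accumulation of
details over all dyadic scales $2^j$:

$$f = \sum_{j=-\infty}^{+\infty} D_{2^j} f = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} \langle f, \psi_{j,k} \rangle\,\psi_{j,k} \qquad [15]$$

Instead of considering the sum over all dyadic
levels j, one can sum over $j \ge J$ for a fixed J; in this
case, we have

$$f = \sum_{j \ge J} \sum_{k=-\infty}^{+\infty} \langle f, \psi_{j,k} \rangle\,\psi_{j,k} + \sum_{k=-\infty}^{+\infty} \langle f, \phi_{J,k} \rangle\,\phi_{J,k}$$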

We conclude this section by showing how we can
construct a 2D wavelet basis from the 1D case. We
can simply use a tensor product. The scaling function
and the mother wavelets are given, respectively, as
follows:

$$\Phi(x,y) = \phi(x)\phi(y), \qquad \Psi = (\Psi^1, \Psi^2, \Psi^3)$$

with

$$\Psi^1(x,y) = \phi(x)\psi(y), \quad \Psi^2(x,y) = \psi(x)\phi(y), \quad \Psi^3(x,y) = \psi(x)\psi(y)$$

As for the 1D case, $A_{2^j} f$ denotes the projection of
f on $V_j$, $D^1_{2^j} f$ the horizontal details, $D^2_{2^j} f$ the vertical
details, and $D^3_{2^j} f$ the other details (the index l in $D^l_{2^j}$
is the same as in $\Psi^l$). For a 2D image f, we then have
the following decomposition (see Figure 7):

$$f = \sum_{l=1}^{3} \sum_{j \ge J} \sum_{k} \langle f, \Psi^l_{j,k} \rangle\,\Psi^l_{j,k} + \sum_{k} \langle f, \Phi_{J,k} \rangle\,\Phi_{J,k}$$

Figure 7 Illustration of the wavelet methodology.

Application to the Denoising Problem 


We go back to the denoising problem. Our goal is to
solve this problem by using a variational approach
and wavelets. We recall that we have an ideal image
u that has been corrupted by a white Gaussian noise
$\eta$, resulting in an observation f with $f = u + \eta$. As
seen in the section “The variational
approach,” this question can be tackled by solving
the variational problem

$$\min_{u} \left\{ \lambda\,\phi(|u|_E) + |f - u|_G^2 \right\} \qquad [16]$$

for suitable choices of E, G, and $\phi$. Here we propose to
choose $G = L^2(\Omega)$ ($\Omega$ is the image domain), for E
the Besov space $B^1_1(L^1(\Omega))$, and $\phi$ = Identity. Besov
spaces $B^\alpha_q(L^p(\Omega))$ are used in many domains of
mathematics, such as harmonic analysis or approximation
theory. There exist different ways of defining them.
Roughly speaking, they consist of functions having $\alpha$
derivatives in $L^p(\Omega)$; the third parameter q allows one
to make finer distinctions in smoothness. Here we are
only concerned with the Besov space $B^1_1(L^1(\Omega))$. One
important property needed here is that the norm of a
function in $E = B^1_1(L^1(\Omega))$ is equivalent to the $\ell^1$-norm
of its wavelet coefficients, that is, if $\{\psi_{j,k}\}$ is an
orthonormal basis of $L^2(\Omega)$ and if $u_{j,k,\psi}$ are the wavelet
coefficients of $u \in E$, then $|u|_E \approx \sum_{j,k,\psi} |u_{j,k,\psi}|$.


Remark When one is concerned with a finite
domain, then some changes must be made with
respect to the construction given in [15] to obtain an
orthonormal basis of $L^2(\Omega)$. To avoid further
technical complications, we ignore this question.

Figure 8 Illustration of two regularization methods.


Let us denote, respectively, by $\{u_{j,k,\psi}\}$ and $\{f_{j,k,\psi}\}$
the wavelet coefficients of u and f. Then solving [16]
amounts to finding the minimizer of the functional

$$F(u) = \lambda \sum_{j,k,\psi} |u_{j,k,\psi}| + \sum_{j,k,\psi} |u_{j,k,\psi} - f_{j,k,\psi}|^2 \qquad [17]$$

The functional [17] is a sum of independent terms, one
per coefficient, so minimizing it reduces to finding,
for a given t, the minimizer s of
$E(s) = |s - t|^2 + \lambda|s|$. This minimizer is
given by $s = t - \lambda/2$ if $t > \lambda/2$, $s = 0$ if $|t| \le \lambda/2$,
and $s = t + \lambda/2$ if $t < -\lambda/2$.

Thus, we shrink the wavelet coefficients $f_{j,k,\psi}$
toward zero by an amount of $\lambda/2$ to obtain the
minimizer. This is exactly the wavelet shrinkage
algorithm of Donoho and Johnstone (1994). It is
remarkable that the wavelet shrinkage algorithm,
which was found by using statistical tools, can
also be explained via a variational approach
(Chambolle et al. 1998). Figure 8 shows an example
of the result on a noisy image.
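
The coefficientwise rule can be written in a few lines. The following sketch is only an illustration with hypothetical function names: `soft_threshold` returns the minimizer of $E(s) = |s - t|^2 + \lambda|s|$ as derived in the text, and `wavelet_shrinkage` applies it to a list of wavelet coefficients that is assumed to have been computed beforehand by some wavelet transform.

```python
def soft_threshold(t, lam):
    """Minimizer of E(s) = (s - t)**2 + lam * abs(s)."""
    half = lam / 2.0
    if t > half:
        return t - half
    if t < -half:
        return t + half
    return 0.0

def wavelet_shrinkage(coeffs, lam):
    # Apply the shrinkage rule independently to every wavelet coefficient of f.
    return [soft_threshold(t, lam) for t in coeffs]

if __name__ == "__main__":
    noisy_coeffs = [3.0, -0.2, 0.4, -1.5, 0.1]
    print(wavelet_shrinkage(noisy_coeffs, lam=1.0))
```

Small coefficients, which are mostly noise, are set to zero, while large coefficients are kept up to a shift of $\lambda/2$; the denoised image is then obtained by the inverse wavelet transform of the shrunk coefficients.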

For more details, we refer the reader to Mallat 
(1998). 


Conclusion 


Image processing is a challenging domain of applied
mathematics which has to deal with discrete and
continuous representations. In this article, we have
covered the core mathematical tools used in the
area. The example of gray-scale image restoration
allowed us to illustrate and compare the different
methodologies. Naturally, as mentioned in the
introduction, image processing refers to a wide
variety of applications, and intensive research
has been carried out on the different topics using the
methodologies described here. The reader will find
in the references (and those cited therein) several
illustrations of challenging problems.


See also: Γ-Convergence and Homogenization; Convex
Analysis and Duality Methods; Elliptic Differential
Equations: Linear Theory; Evolution Equations: Linear
and Nonlinear; Fluid Mechanics: Numerical Methods;
Fractal Dimensions in Dynamics; Free Interfaces and
Free Discontinuities: Variational Problems; Geometric
Measure Theory; Ginzburg-Landau Equation;
Inequalities in Sobolev Spaces; Minimax Principle in the
Calculus of Variations; Optimal Transportation; Partial
Differential Equations: Some Examples; Stochastic
Differential Equations; Variational Techniques for
Ginzburg-Landau Energies; Wavelets: Applications;
Wavelets: Mathematical Theory.


Incompressible Euler Equations: Mathematical Theory


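The motion of a homogeneous incompressible ideal
fluid in a domain $\Omega \subset \mathbb{R}^n$ is described by the
following system of Euler equations:

$$\frac{\partial v}{\partial t} + (v \cdot \nabla)v = -\nabla p \qquad [1]$$
$$\mathrm{div}\, v = 0 \qquad [2]$$
$$v(x,0) = v_0(x) \qquad [3]$$

where $v = (v^1, v^2, \ldots, v^n)$, $v^j = v^j(x,t)$, $j = 1, 2, \ldots, n$,
is the velocity of the fluid flow, $p = p(x,t)$ is the
scalar pressure, and $v_0(x)$ is a given initial velocity
field satisfying $\mathrm{div}\, v_0 = 0$. Here we use the standard
notation of vector calculus, denoting

$$\nabla = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \ldots, \frac{\partial}{\partial x_n} \right), \qquad (v \cdot \nabla)v^j = \sum_{k=1}^{n} v^k \frac{\partial v^j}{\partial x_k}, \qquad \mathrm{div}\, v = \sum_{k=1}^{n} \frac{\partial v^k}{\partial x_k}$$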

Equation [1] represents the balance of momentum
for each portion of fluid, while eqn [2] represents
the conservation of mass of fluid during its motion,
combined with the homogeneity (constant density)
assumption on the fluid. Equations [1] and [2] were
first obtained by Euler in 1755. Although we could
consider, more generally, the inhomogeneous
incompressible Euler equations, in mathematical fluid
mechanics the incompressible Euler
equations usually mean the above system [1]-[2].
For a bounded domain with fixed boundary $\partial\Omega$, the
natural boundary condition is

$$v(x,t) \cdot \nu(x) = 0 \quad \forall (x,t) \in \partial\Omega \times [0,\infty) \qquad [4]$$

where $\nu(x)$ is the unit normal vector at the boundary
point $x \in \partial\Omega$. Several studies are concerned with the
Cauchy problem of the system [1]-[3], where we
consider the case

$$\Omega = \begin{cases} \mathbb{R}^n & \text{(whole domain of } \mathbb{R}^n\text{), or} \\ \mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n & \text{(periodic domain)} \end{cases} \qquad [5]$$
In this article, for simplicity, we suppose
$\Omega = \mathbb{R}^n$, $n = 2, 3$, unless otherwise stated. We note
that the Euler equations are obtained formally by
setting the viscosity $= 0$ or, equivalently, the Reynolds
number $= \infty$ in the Navier-Stokes equations. Thus,
we may view the Euler equations as
describing approximately turbulent flows at extremely high
Reynolds number. For detailed
mathematical studies on the finite Reynolds number
Navier-Stokes equations, see Temam (1984) and
Lions (1996). For a much shorter and more
comprehensive review, see Constantin (1995). In the study of
the Euler equations the notion of vorticity, $\omega = \mathrm{curl}\, v$,
plays a very important role. In particular, we can
reformulate the system in terms of vorticity fields
only as follows. We first suppose we are working in
three-dimensional (3D) space, and rewrite [1] as


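$$\frac{\partial v}{\partial t} - v \times \mathrm{curl}\, v = -\nabla\left( p + \frac{1}{2}|v|^2 \right) \qquad [6]$$

Taking the curl of [6], and using elementary vector
identities, we obtain the following vorticity formulation:

$$\frac{\partial \omega}{\partial t} + (v \cdot \nabla)\omega = (\omega \cdot \nabla)v \qquad [7]$$

$$\mathrm{div}\, v = 0, \qquad \mathrm{curl}\, v = \omega \qquad [8]$$

$$\omega(x,0) = \omega_0(x) \qquad [9]$$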
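
The elementary vector identities used in passing from [1] to [6] and from [6] to [7] can be spelled out as follows. This is only a worked sketch of the standard computation, written with $\omega = \mathrm{curl}\, v$.

```latex
% Step 1: rewrite the convection term, which turns [1] into [6].
(v \cdot \nabla) v = \nabla\!\left( \tfrac{1}{2} |v|^2 \right) - v \times \operatorname{curl} v .

% Step 2: the curl of a gradient vanishes, so taking the curl of [6] gives
\frac{\partial \omega}{\partial t} - \operatorname{curl}( v \times \omega ) = 0 .

% Step 3: expand the curl of the cross product.
\operatorname{curl}( v \times \omega )
  = (\omega \cdot \nabla) v - (v \cdot \nabla) \omega
    + v \, \operatorname{div} \omega - \omega \, \operatorname{div} v
  = (\omega \cdot \nabla) v - (v \cdot \nabla) \omega ,

% since div(omega) = div(curl v) = 0 and div v = 0. Substituting into Step 2:
\frac{\partial \omega}{\partial t} + (v \cdot \nabla) \omega = (\omega \cdot \nabla) v ,
% which is the vorticity equation [7].
```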


The linear elliptic system [8] for v can be solved
explicitly in terms of $\omega$ to give the Biot-Savart law

$$v(x,t) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\omega(y,t) \times (x - y)}{|x - y|^3}\,dy \qquad [10]$$

Substituting this v into [7] formally, we obtain an
integrodifferential system for $\omega$. The term on the
right-hand side of [7] is called the “vortex stretching
term,” and is regarded as the main source of
difficulties in the mathematical theory of the 3D
Euler equations. In the 2D case we take the vorticity
as the scalar $\omega = \partial v^2/\partial x_1 - \partial v^1/\partial x_2$, and the
evolution equation of $\omega$ becomes

$$\frac{\partial \omega}{\partial t} + (v \cdot \nabla)\omega = 0 \qquad [11]$$

combined with the 2D Biot-Savart law,

$$v(x,t) = \frac{1}{2\pi} \int_{\mathbb{R}^2} \frac{(-(x_2 - y_2),\ x_1 - y_1)}{|x - y|^2}\,\omega(y,t)\,dy \qquad [12]$$
In many studies of the Euler equations it is
convenient to introduce the notion of “particle
trajectory mapping,” $\Phi(\cdot,t)$, defined by

$$\frac{\partial \Phi(a,t)}{\partial t} = v(\Phi(a,t),t), \qquad \Phi(a,0) = a, \quad a \in \Omega \qquad [13]$$

The mapping $\Phi(\cdot,t)$ transforms the location of
the initial fluid particles to their location at time t,
and the parameter a is called the Lagrangian particle
marker. If we denote the Jacobian of the
transformation by $J(a,t) = \det(\nabla_a \Phi(a,t))$, then we can show
easily that

$$\frac{\partial J}{\partial t} = (\mathrm{div}\, v)\,J$$

which implies that the velocity field v
satisfies the incompressibility condition $\mathrm{div}\, v = 0$ if and only if
the mapping $\Phi(\cdot,t)$ is volume preserving. At this
point, we note that, although the Euler equations
were originally derived by applying the mass
conservation and the momentum balance principles, we
could also derive them by applying the principle of
least action to the action defined by

$$\mathcal{I}(\Phi) = \frac{1}{2} \int_{t_1}^{t_2} \int_\Omega \left| \frac{\partial \Phi(a,t)}{\partial t} \right|^2 da\,dt$$

Here, $\Phi(\cdot,t): \Omega \to \Omega$ is a parametrized family of
volume-preserving diffeomorphisms. This variational
approach to the Euler equations implies that we can
view solutions of the Euler equations as geodesic
curves in the $L^2$-metric on the infinite-dimensional
manifold of volume-preserving diffeomorphisms (see
for more details, e.g., Arnol'd and Khesin (1998)).
The 3D Euler equations have many conserved
quantities. We list some important ones below.


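1. Energy

$$E(t) = \frac{1}{2} \int_\Omega |v(x,t)|^2\,dx \qquad [14]$$

2. Helicity

$$H(t) = \int_\Omega v(x,t) \cdot \omega(x,t)\,dx \qquad [15]$$

3. Circulation

$$\Gamma_{C(t)} = \oint_{C(t)} v \cdot dl \qquad [16]$$

where $C(t) = \{\Phi(a,t) \mid a \in C\}$ is a curve moving
along with the fluid.

4. Impulse

$$I(t) = \frac{1}{2} \int_\Omega x \times \omega\,dx \qquad [17]$$

5. Moment of impulse

$$M(t) = \frac{1}{3} \int_\Omega x \times (x \times \omega)\,dx \qquad [18]$$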

The conservation of the above quantities
can be proved without difficulty by using
elementary vector calculus (for details see, e.g.,
Chorin and Marsden (1993), Majda and Bertozzi
(2002), Marchioro and Pulvirenti (1994)). The
helicity above, in particular, represents the degree
of knottedness of the vortex lines in the fluid, where
the vortex lines are the integral curves of the
vorticity field. Arnol'd and Khesin (1998) discuss
in detail helicity and other geometric
aspects of the Euler equations. For the 2D Euler
equations there is no analog of helicity, while the
circulation conservation is replaced by the conservation
of the vorticity flux integral,


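$$\int_{A(t)} \omega(x,t)\,dx \qquad [19]$$

where $A(t) = \{\Phi(a,t) \mid a \in A\}$ is a planar region
moving along with the fluid. The impulse and the
moment of impulse integrals are replaced by

$$\frac{1}{2} \int_\Omega (x_2, -x_1)\,\omega\,dx \qquad [20a]$$

and

$$-\frac{1}{3} \int_\Omega |x|^2\,\omega\,dx \qquad [20b]$$

respectively.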
In 2D ideal incompressible fluids we have
extra conserved quantities; namely, for any
$p \in [1,\infty]$ the integral

$$\int_\Omega |\omega(x,t)|^p\,dx \qquad [21]$$

is conserved, with the usual essential supremum for
$p = \infty$ (as a matter of fact we can extend this
statement by replacing the integral by $\int_\Omega f(\omega(x,t))\,dx$
for any continuous function f). There are many
known explicit solutions to the Euler equations (see,
e.g., Lamb (1932) and Majda and Bertozzi (2002)).


Local Existence and the Blow-Up 
Problem 


The Classical Results 


We first introduce some notation for function
spaces. The Lebesgue space $L^p(\Omega)$, $p \in [1,\infty]$, is the
Banach space defined by the norm

$$\|f\|_{L^p} = \begin{cases} \left( \int_\Omega |f(x)|^p\,dx \right)^{1/p}, & p \in [1,\infty) \\ \operatorname{ess\,sup}_{x \in \Omega} |f(x)|, & p = \infty \end{cases}$$

Let us set $\alpha := (\alpha_1, \alpha_2, \ldots, \alpha_n) \in (\mathbb{Z}_+ \cup \{0\})^n$ with
$|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n$. Then, $D^\alpha := D_1^{\alpha_1} D_2^{\alpha_2} \cdots D_n^{\alpha_n}$,
where $D_j = \partial/\partial x_j$, $j = 1, 2, \ldots, n$. For given $k \in \mathbb{Z}_+ \cup \{0\}$
and $p \in [1,\infty)$ the Sobolev space $W^{k,p}(\Omega)$ is the
Banach space of functions
$f \in L^p(\Omega)$ such that

$$\|f\|_{W^{k,p}} := \left( \sum_{|\alpha| \le k} \int_\Omega |D^\alpha f|^p\,dx \right)^{1/p} < \infty$$

where the derivatives are in the sense of distributions.
For $p = \infty$ we replace the $L^p$-norm by the $L^\infty$
norm. In order to handle fractional
derivatives of order $s \in \mathbb{R}$, we use the space $L^p_s(\Omega)$
defined by the Banach space norm

$$\|f\|_{L^p_s} := \|(1 - \Delta)^{s/2} f\|_{L^p}$$

where $(1 - \Delta)^{s/2} f = \mathcal{F}^{-1}[(1 + |\xi|^2)^{s/2} \mathcal{F}(f)(\xi)]$, with
$\mathcal{F}(\cdot)$ and $\mathcal{F}^{-1}(\cdot)$ denoting the Fourier transform
and its inverse. Below we outline the key ideas of
proving the local existence theorems for the Euler
equations. For more details we refer the reader to
Majda and Bertozzi (2002). For simplicity, we use
the function space $H^m(\mathbb{R}^n) = W^{m,2}(\mathbb{R}^n)$, $n = 2, 3$.
Applying the derivatives $D^\alpha$ to [1], then taking the
$L^2$ inner product with $D^\alpha v$, and summing over the
multi-indices $\alpha$ with $|\alpha| \le m$, we obtain


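$$\frac{1}{2}\frac{d}{dt}\|v\|_{H^m}^2 = -\sum_{|\alpha| \le m} \left( D^\alpha((v \cdot \nabla)v) - (v \cdot \nabla)D^\alpha v,\ D^\alpha v \right)_{L^2} - \sum_{|\alpha| \le m} \left( (v \cdot \nabla)D^\alpha v,\ D^\alpha v \right)_{L^2} - \sum_{|\alpha| \le m} \left( D^\alpha \nabla p,\ D^\alpha v \right)_{L^2} := I + II + III$$

By integration by parts, we obtain

$$III = \sum_{|\alpha| \le m} \left( D^\alpha p,\ D^\alpha \mathrm{div}\, v \right)_{L^2} = 0$$

Integrating by parts again, and using the fact that
$\mathrm{div}\, v = 0$, we have

$$II = -\frac{1}{2} \sum_{|\alpha| \le m} \int_{\mathbb{R}^n} (v \cdot \nabla)|D^\alpha v|^2\,dx = \frac{1}{2} \sum_{|\alpha| \le m} \int_{\mathbb{R}^n} \mathrm{div}\, v\,|D^\alpha v|^2\,dx = 0$$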


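We now use the so-called commutator type of
estimate,

$$\sum_{|\alpha| \le m} \|D^\alpha(fg) - f D^\alpha g\|_{L^2} \le C\left( \|\nabla f\|_{L^\infty} \|g\|_{H^{m-1}} + \|f\|_{H^m} \|g\|_{L^\infty} \right)$$

and obtain

$$|I| \le \sum_{|\alpha| \le m} \|D^\alpha((v \cdot \nabla)v) - (v \cdot \nabla)D^\alpha v\|_{L^2}\,\|v\|_{H^m} \le C\|\nabla v\|_{L^\infty} \|v\|_{H^m}^2$$

Summarizing the above estimates, I-III, we have

$$\frac{d}{dt}\|v\|_{H^m}^2 \le C\|\nabla v\|_{L^\infty} \|v\|_{H^m}^2 \qquad [22]$$

A further estimate, using the Sobolev inequality
$\|\nabla v\|_{L^\infty} \le C\|v\|_{H^m}$ for $m > 5/2$, gives

$$\frac{d}{dt}\|v\|_{H^m}^2 \le C\|v\|_{H^m}^3$$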

Thanks to Gronwall's lemma, we have the local-in-time
uniform estimate

$$\|v(t)\|_{H^m} \le \frac{\|v_0\|_{H^m}}{1 - Ct\|v_0\|_{H^m}}$$

for all $t \in [0, 1/(2C\|v_0\|_{H^m})]$. This is the key a priori
estimate for the construction of the local solutions.
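
The passage from the differential inequality to this bound is a comparison with a Riccati equation. The following worked sketch uses the shorthand $y(t) = \|v(t)\|_{H^m}$ and $y_0 = \|v_0\|_{H^m}$ (notation introduced here only for the computation) and assumes $y > 0$; C denotes a generic constant that may change from line to line.

```latex
% From d/dt (y^2) <= C y^3 and d/dt (y^2) = 2 y y':
y'(t) \le \tfrac{C}{2}\, y(t)^2 .

% Divide by y^2 and recognize the derivative of -1/y:
\frac{d}{dt}\left( -\frac{1}{y(t)} \right) = \frac{y'(t)}{y(t)^2} \le \frac{C}{2} .

% Integrate from 0 to t:
\frac{1}{y(t)} \ge \frac{1}{y_0} - \frac{C}{2}\, t
\quad\Longrightarrow\quad
y(t) \le \frac{y_0}{1 - \tfrac{C}{2}\, t\, y_0}
\qquad \text{as long as } \tfrac{C}{2}\, t\, y_0 < 1 .

% Renaming C/2 as C gives the stated estimate. On the interval
% 0 <= t <= 1/(2 C y_0) the denominator is at least 1/2, hence
y(t) \le 2\, y_0 .
```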
The local-in-time solution of the Euler equations in
the Sobolev space $H^m(\mathbb{R}^n)$ for $m > n/2 + 1$, $m \in \mathbb{Z}$,
was obtained by Kato (1972). For the above-constructed
local-in-time solutions, one of the
most outstanding open problems in mathematical
fluid mechanics is whether the solution can be
continued to any future time up to infinity, or whether the
solution loses regularity and blows up in finite
time. Even in terms of numerical experiments, the
answer is not yet settled. In the direction of
solving this problem there is a celebrated result,
called the Beale-Kato-Majda criterion (1984),
which states that

$$\limsup_{t \nearrow T_*} \|v(t)\|_{H^m} = \infty$$

if and only if

$$\int_0^{T_*} \|\omega(s)\|_{L^\infty}\,ds = \infty \qquad [23]$$


We outline the proof of this result below (for more
details see Majda and Bertozzi (2002)). We first
recall the Beale-Kato-Majda version of the
logarithmic Sobolev inequality,

$$\|\nabla v\|_{L^\infty} \le C\|\omega\|_{L^\infty}\left( 1 + \log(1 + \|v\|_{H^m}) \right) + C\|\omega\|_{L^2} \qquad [24]$$


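for $m > 5/2$. Now suppose $\int_0^{T_*} \|\omega(t)\|_{L^\infty}\,dt < \infty$.
Taking the $L^2$ inner product of [7] with $\omega$, then after
integration by parts we obtain

$$\frac{1}{2}\frac{d}{dt}\|\omega\|_{L^2}^2 = \left( (\omega \cdot \nabla)v,\ \omega \right)_{L^2} \le \|\omega\|_{L^\infty} \|\nabla v\|_{L^2} \|\omega\|_{L^2} = \|\omega\|_{L^\infty} \|\omega\|_{L^2}^2$$

where we used the identity $\|\nabla v\|_{L^2} = \|\omega\|_{L^2}$.
Applying the Gronwall lemma, we obtain

$$\|\omega(t)\|_{L^2} \le \|\omega_0\|_{L^2} \exp\left( \int_0^{T_*} \|\omega(s)\|_{L^\infty}\,ds \right) := C(\omega_0, T_*) \qquad [25]$$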


for all $t \in [0, T_*]$. Substituting [24] into [22], and
combining this with [25], we have

$$\frac{d}{dt}\|v\|_{H^m}^2 \le C\left[ 1 + \|\omega\|_{L^\infty} \right]\left[ 1 + \log(1 + \|v\|_{H^m}) \right]\|v\|_{H^m}^2$$


Applying Gronwall's lemma, we obtain

$$\|v(t)\|_{H^m} \le \|v_0\|_{H^m} \exp\left[ C_1 \exp\left( C_2 \int_0^{T_*} \|\omega(\tau)\|_{L^\infty}\,d\tau \right) \right] \le C(v_0, T_*)$$

for all $t \in [0, T_*]$ and for some constants $C_1$, $C_2$.
Thus, we have proved the “necessity part” of [23]. The
“sufficiency part” is an easy consequence of the
Sobolev inequality,

$$\int_0^{T_*} \|\omega(s)\|_{L^\infty}\,ds \le T_* \sup_{0 \le t \le T_*} \|\nabla v(t)\|_{L^\infty} \le C T_* \sup_{0 \le t \le T_*} \|v(t)\|_{H^m}$$

for $m > 5/2$.


Other Related Results 


The previous local existence result in $H^m(\mathbb{R}^n)$, $m > n/2 + 1$, is basically due to T Kato in 1972. He and G Ponce extended this existence result using the fractional Sobolev space $L^p_s(\mathbb{R}^n)$, $s > n/2 + 1$, $s \in \mathbb{R}$, in 1986. These results were further extended, using the Besov and the Triebel-Lizorkin spaces, by the present author in 2001.

For a bounded domain $\Omega \subset \mathbb{R}^n$, R Temam obtained the local-existence result using the space $W^{s,p}(\Omega)$ in 1975. On the other hand, in the setting of the Hölder space $C^{1,\gamma}(\mathbb{R}^n)$, L Lichtenstein (1925) and W Wolibner (1933) obtained local existence of solutions of the Euler equations. More recently, J-Y Chemin considered the Zygmund space $C^s_*(\mathbb{R}^n)$, which is identical to the Hölder space $C^{[s],\,s-[s]}(\mathbb{R}^n)$ for noninteger $s$, where $[s]$ means the largest integer not greater than $s$, but is different from $C^{s-1,1}(\mathbb{R}^n)$ for integer $s$. He proved, in 1992, local existence of solutions to the 3D Euler equations in this space. See Chemin (1998) for details of this proof.

The Beale-Kato-Majda criterion for the finite-time blow-up of the classical solutions of the 3D Euler equations has been refined recently by many authors: the $L^\infty$-norm of the vorticity $\omega(x, t)$ has been replaced by the weaker BMO (the space of functions with bounded mean oscillations) norm (H Kozono and Y Taniuchi, 2000), and by the even weaker Besov space or Triebel-Lizorkin space norms by the present author in 2001 (see Triebel (1983) for more details on those spaces). Here we just note that these spaces are refinements of the usual Sobolev spaces. For the bounded domain case, there is a result by A Ferrari in 1993. The blow-up problem is still open even in the case of the axisymmetric 3D Euler equations if there is a nonzero swirl (angular velocity). In this case, the blow-up is controlled only by the angular component of the vorticity, as shown by the present author (1996). In the region off the axis, in particular, the axisymmetric 3D Euler equations have the same form as the 2D Boussinesq equations.

Some researchers have also tried to approach the regularity/singularity problem of the 3D Euler equations by investigating the geometric structure of the vortex stretching term, and obtained a geometric type of blow-up criterion (P Constantin, C Fefferman, and A Majda, 1996). For a more detailed review of studies in this direction see Constantin (1995).

Since the blow-up problem of the 3D Euler equations itself looks too difficult to solve, it has also been studied on simplified model problems. In 1985, P Constantin, PD Lax, and A Majda considered the following 1D model problem of the 3D Euler equations:

$$\theta_t + (H(\theta)\theta)_x = 0,$$
$$\theta(x, 0) = \theta_0(x)$$

where $H(\cdot)$ is the Hilbert transform defined by

$$H(f)(x) = \frac{1}{\pi}\,\mathrm{PV}\int_{-\infty}^{\infty}\frac{f(y)}{x - y}\,dy$$
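
The Hilbert transform above can be explored numerically. The following is a minimal sketch, assuming a $2\pi$-periodic setting in which $H$ acts on Fourier modes as the multiplier $-i\,\mathrm{sgn}(k)$ (which is what the principal-value formula gives for $e^{ikx}$). The helper names `dft`, `idft` and `hilbert_periodic` are illustrative and not part of the text; a plain $O(n^2)$ discrete Fourier transform is used to keep the example dependency-free.

```python
import cmath
import math


def dft(values):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(coeffs):
    """Inverse of dft (includes the 1/n normalisation)."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def hilbert_periodic(samples):
    """Periodic Hilbert transform: multiply the k-th Fourier mode by -i*sgn(k)."""
    n = len(samples)
    coeffs = dft(samples)
    out = []
    for k, c in enumerate(coeffs):
        freq = k if k <= n // 2 else k - n        # signed frequency of mode k
        if freq == 0 or (n % 2 == 0 and k == n // 2):
            out.append(0.0)                       # mean and Nyquist modes are dropped
        else:
            out.append(-1j * (1 if freq > 0 else -1) * c)
    return [z.real for z in idft(out)]


n = 32
xs = [2.0 * math.pi * j / n for j in range(n)]
h_cos = hilbert_periodic([math.cos(x) for x in xs])   # should reproduce sin(x)
```

With this sign convention $H(\cos) = \sin$ and $H(\sin) = -\cos$, so `h_cos` agrees with $\sin x$ on the grid up to rounding.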


They proved finite-time blow-up for this model problem by explicitly obtaining the solution. There is another, 2D model problem of the 3D Euler equations, the quasigeostrophic equations,

$$\theta_t + (v\cdot\nabla)\theta = 0, \quad v = -\nabla^\perp(-\Delta)^{-1/2}\theta$$   [26]
$$\theta(x, 0) = \theta_0(x)$$

where $\nabla^\perp = (-\partial_2, \partial_1)$. Contrary to the above 1D model equation, this 2D model has real physical relevance in atmospheric science, and $\theta(x,t)$ represents the temperature of the air. The resemblance of this equation to the 3D Euler equations was first observed by P Constantin, A Majda, and E Tabak in 1994, and they derived a finite-time blow-up criterion for the equations. In spite of many interesting partial results, including the work by D Cordoba (1998), the blow-up problem of [26] is still open.


The 2D Euler Equations and the 
Weak Solutions 


The Case of $W^{1,p}$ Weak Solutions


In the 2D Euler equations, the problem of global well-posedness of the classical solutions is settled. This is an immediate consequence of the conservation of $\|\omega(t)\|_{L^\infty}$ as stated in [21] combined with the Beale-Kato-Majda criterion [23]. On the other hand, the notion of weak solutions is not well understood. A weak solution of the Euler equations is a singular (nondifferentiable) solution of the equations. More precisely, by a weak solution of [1]-[2] in $\Omega \times (0, T)$ we mean a vector field $v \in C([0, T); L^2_{\mathrm{loc}}(\Omega))$ satisfying the integral identities

$$-\int_{\mathbb{R}^n} v(x,0)\cdot\phi(x,0)\,dx - \int_0^T\!\!\int_{\mathbb{R}^n} v(x,t)\cdot\phi_t(x,t)\,dx\,dt - \int_0^T\!\!\int_{\mathbb{R}^n} v(x,t)\otimes v(x,t) : \nabla\phi(x,t)\,dx\,dt = 0$$   [27a]

$$\int_0^T\!\!\int_{\mathbb{R}^n} v(x,t)\cdot\nabla\psi(x,t)\,dx\,dt = 0$$   [27b]

for every vector test function $\phi = (\phi_1, \phi_2, \ldots, \phi_n) \in C_0^\infty(\Omega \times [0, T))^n$ satisfying $\mathrm{div}\,\phi = 0$, and for every scalar test function $\psi \in C_0^\infty(\Omega \times [0, T))$. Here we used the notation $(u \otimes v)_{ij} = u_i v_j$ and $A : B = \sum_{i,j=1}^n A_{ij}B_{ij}$ for $n \times n$ matrices $A$ and $B$. We observe that [27a] and [27b] are obtained by multiplying [1] and [2] by $\phi$ and $\psi$, respectively,
and integrating by parts. Thus, even locally square-integrable vector fields, which are not differentiable in the classical sense, could be solutions of the Euler equations. For the general 3D Euler equations, we do not yet have global existence theorems for weak solutions. Actually, it has even been suggested that we need a still weaker notion of solution (the so-called "measure-valued solutions") to describe generic global solutions of the 3D Euler equations. For the 2D Euler equations, however, we have global existence theorems for $\omega_0 \in L^1(\mathbb{R}^2) \cap L^p(\mathbb{R}^2)$ for $p \in [1, \infty]$. This better situation of the 2D Euler equations compared to the 3D case for weak solutions is mainly due to the conservation law of the $L^p$-norm of the vorticity described in [21]. Here we present briefly the existence proof of weak solutions for the 2D Euler equations in the simplest situation. We will prove the global existence of weak solutions for $\omega_0 \in L^p(\mathbb{R}^2)$, $1 < p < \infty$. Let $\rho_\varepsilon(x) = (1/\varepsilon^2)\rho(x/\varepsilon)$, where $\rho \in C_0^\infty(\mathbb{R}^2)$ is a standard mollifier satisfying $\rho \ge 0$, $\mathrm{supp}\,\rho \subset \{x \in \mathbb{R}^2 \mid |x| < 1\}$, and $\int_{\mathbb{R}^2}\rho\,dx = 1$. Let $v_0$ be the velocity associated with the initial vorticity $\omega_0$, given by the Biot-Savart law [12]. Consider the sequence of initial data $v_0^\varepsilon(x) = \rho_\varepsilon * v_0(x) = \int_{\mathbb{R}^2}\rho_\varepsilon(x - y)v_0(y)\,dy$. For each $\varepsilon$ we have a global-in-time smooth solution $v^\varepsilon(x, t)$. Moreover, thanks to [21], we have the following estimate of the vorticity that is uniform in $\varepsilon$:

$$\|\omega^\varepsilon(t)\|_{L^p} = \|\omega_0^\varepsilon\|_{L^p} \le \|\omega_0\|_{L^p}$$   [28]

where we used the property of the mollifier in the second inequality. If we take the (distributional) derivative of the Biot-Savart law [12], we find $\nabla v = K * \omega + C\omega$, where $K(x)$ is a kernel function defining a singular integral operator of the convolution type, and $C$ is a constant matrix. The well-known Calderon-Zygmund inequality implies that

$$\|\nabla v\|_{L^p} \le C_p\|\omega\|_{L^p}$$   [29]

Combining [28] and [29] we have

$$\sup_{0\le t\le T}\|\nabla v^\varepsilon(t)\|_{L^p} \le C(v_0), \quad \forall T > 0$$   [30]

namely the sequence $\{v^\varepsilon\}$ is uniformly bounded in $L^\infty(0, T; W^{1,p}(\mathbb{R}^2))$. Next, we claim that $\{v^\varepsilon\}$ satisfies the inequality


$$\|v^\varepsilon(t_1) - v^\varepsilon(t_2)\|_{H^{-3}} \le C\|v_0\|_{L^2}^2\,|t_1 - t_2|$$   [31]


for all $t_1, t_2$ with $0 < t_1 \le t_2 < T$, where $C$ is an absolute constant. Here the negative-order Sobolev space $H^{-m}(\Omega)$, $m > 0$, is defined as the dual of $H_0^m(\Omega)$, and can be identified with the completion of $C_0^\infty(\Omega)$ with respect to the metric of $H^{-m}(\Omega)$. Indeed, let $\phi \in C_0^\infty(\mathbb{R}^2)$. Taking the $L^2(\mathbb{R}^2)$ inner product of [1] with $\phi$ we have the estimates


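$$\left|\int_{\mathbb{R}^2}\frac{\partial v^\varepsilon(x,t)}{\partial t}\cdot\phi(x)\,dx\right| \le \left|\int_{\mathbb{R}^2}\nabla p^\varepsilon\cdot\phi\,dx\right| + \left|\int_{\mathbb{R}^2}(v^\varepsilon\cdot\nabla)\phi\cdot v^\varepsilon\,dx\right|$$
$$\le \|\nabla p^\varepsilon(t)\|_{H^{-3}}\|\phi\|_{H^3} + \|v^\varepsilon(t)\|_{L^2}^2\|\nabla\phi\|_{L^\infty}$$
$$\le C\left(\|\nabla p^\varepsilon(t)\|_{H^{-3}} + \|v_0\|_{L^2}^2\right)\|\phi\|_{H^3}$$   [32]


where we used the Sobolev inequality $\|\nabla\phi\|_{L^\infty} \le C\|\phi\|_{H^3}$ and the energy equality in the last step. Since [32] holds for all $\phi \in C_0^\infty(\mathbb{R}^2)$, by taking the closure of $C_0^\infty(\mathbb{R}^2)$ in $H^3(\mathbb{R}^2)$ we obtain

$$\left\|\frac{dv^\varepsilon(t)}{dt}\right\|_{H^{-3}} \le C\left(\|\nabla p^\varepsilon(t)\|_{H^{-3}} + \|v_0\|_{L^2}^2\right)$$   [33]
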

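We now estimate the pressure term. Taking the divergence of [1], we have the Poisson equation

$$\Delta p^\varepsilon = -\mathrm{div}\,((v^\varepsilon\cdot\nabla)v^\varepsilon)$$

Let $\eta \in C_0^\infty(\mathbb{R}^2)$; then

$$\left|\int_{\mathbb{R}^2}\Delta p^\varepsilon\,\eta\,dx\right| = \left|\int_{\mathbb{R}^2}\mathrm{div}\,((v^\varepsilon\cdot\nabla)v^\varepsilon)\,\eta\,dx\right| = \left|\int_{\mathbb{R}^2}(v^\varepsilon\cdot\nabla)v^\varepsilon\cdot\nabla\eta\,dx\right| = \left|\int_{\mathbb{R}^2}(v^\varepsilon\cdot\nabla)\nabla\eta\cdot v^\varepsilon\,dx\right|$$
$$\le \|v^\varepsilon(t)\|_{L^2}^2\|D^2\eta\|_{L^\infty} \le C\|v_0\|_{L^2}^2\|\eta\|_{H^4}$$   [34]

where we used the energy equality [14] and the Sobolev inequality in the last step. Since [34] holds for all $\eta \in C_0^\infty(\mathbb{R}^2)$, taking the closure of $C_0^\infty(\mathbb{R}^2)$ in $H^4(\mathbb{R}^2)$, we obtain

$$\left|\int_{\mathbb{R}^2}\Delta p^\varepsilon(x,t)\,\eta(x)\,dx\right| \le C\|v_0\|_{L^2}^2\|\eta\|_{H^4} \quad \forall\eta \in H^4(\mathbb{R}^2)$$   [35]

Thus,

$$\|\Delta p^\varepsilon(t)\|_{H^{-4}} \le C\|v_0\|_{L^2}^2 \quad \forall t \in [0, T)$$

This provides us with

$$\|\nabla p^\varepsilon(t)\|_{H^{-3}} \le \|D^2 p^\varepsilon(t)\|_{H^{-4}} \le C\|\Delta p^\varepsilon(t)\|_{H^{-4}} \le C\|v_0\|_{L^2}^2$$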
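Combining [33] with the bound on $\nabla p^\varepsilon$ above, we obtain

$$\sup_{0<t<T}\left\|\frac{dv^\varepsilon(t)}{dt}\right\|_{H^{-3}} \le C\|v_0\|_{L^2}^2$$

Thus, from

$$v^\varepsilon(t_1) - v^\varepsilon(t_2) = \int_{t_2}^{t_1}\frac{dv^\varepsilon(t)}{dt}\,dt$$

we have

$$\|v^\varepsilon(t_1) - v^\varepsilon(t_2)\|_{H^{-3}} \le \sup_{0<t<T}\left\|\frac{dv^\varepsilon(t)}{dt}\right\|_{H^{-3}}|t_1 - t_2| \le C\|v_0\|_{L^2}^2\,|t_1 - t_2|$$

Thus, [31] is proved as claimed. Thanks to the Aubin-Lions compactness lemma together with [30] and [31], we have a subsequence, denoted by the same notation $\{v^\varepsilon\}$, and $v$ in $L^\infty(0, T; W^{1,p}(\mathbb{R}^2))$ such that

$$v^\varepsilon \to v \ \text{weakly-}{*} \text{ in } L^\infty(0, T; W^{1,p}(\mathbb{R}^2))$$   [36]

and

$$v^\varepsilon \to v \ \text{ in } L^2_{\mathrm{loc}}(\mathbb{R}^2 \times (0, T))$$   [37]

as $\varepsilon \to 0$. We know that, as a classical solution, each $v^\varepsilon$ with initial data $v_0^\varepsilon$ satisfies

$$\int_{\mathbb{R}^2}\phi(x,0)\cdot v_0^\varepsilon(x)\,dx + \int_0^T\!\!\int_{\mathbb{R}^2}\left(\phi_t\cdot v^\varepsilon + \nabla\phi : v^\varepsilon\otimes v^\varepsilon\right)dx\,dt = 0$$   [38]

for all $\phi \in C_0^\infty(\mathbb{R}^2 \times [0, T))^2$ with $\mathrm{div}\,\phi = 0$, and

$$\int_0^T\!\!\int_{\mathbb{R}^2}\nabla\psi\cdot v^\varepsilon\,dx\,dt = 0$$   [39]

for all $\psi \in C_0^\infty(\mathbb{R}^2 \times [0, T))$. We can check easily that the convergences [36] and [37] are enough to pass to the limit $\varepsilon \to 0$ in [38] and [39] to obtain the corresponding equations with $v^\varepsilon$ and $v_0^\varepsilon$ replaced by $v$ and $v_0$. Thus, $v$ is a weak solution of the Euler equations with initial data $v_0$. This completes the outline of the proof of existence of weak solutions to the 2D Euler equations.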


Notes on Further Results 


The study of weak solutions of the 2D Euler equations was initiated by V Yudovich in 1963, where he proved the existence of weak solutions for initial data $\omega_0 \in L^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$. Subsequently, the theory of weak solutions was developed through studies of the vortex sheet problem by DiPerna and Majda in 1987. The existence of weak solutions for vortex sheet initial data, namely for initial vorticity $\omega_0 \in H^{-1}(\mathbb{R}^2) \cap \mathcal{M}(\mathbb{R}^2)$, where $\mathcal{M}(\mathbb{R}^2)$ is the space of Radon measures on $\mathbb{R}^2$, is still an outstanding open problem. The main physical motivation of this problem is to understand the dynamics of vortex sheets in 3D turbulence. For this problem JM Delort proved existence assuming single-signedness of the initial vortex sheet in 1991. The proof was simplified by A Majda in 1993, using the conservation of moment of impulse. The result was also reproved by L C Evans and S Müller in 1994, using the weak compactness of the Hardy space. Later in 2001, MC Lopes Filho, HJ Nussenzveig Lopes and Z Xin allowed a change of sign for the initial vortex sheet, but assumed special reflection symmetry to prove existence of global weak solutions. Related to this problem is the one of characterizing the precise borderline function space to which initial data belong, and above which there is no concentration phenomenon for weakly approximating sequences of solutions; a recent analysis of this problem was done by E Tadmor in 2001.

For the uniqueness problem of the weak solutions of the 2D Euler equations, there are remarkable works by V Scheffer (1993) and A Shnirelman (1997), where they constructed explicitly an $L^2(\mathbb{R}^2 \times \mathbb{R})$ weak solution starting from zero initial data. Also M Vishik (1999) extended the uniqueness class of the weak solutions of the 2D Euler equations, improving previous work by V Yudovich (1995). The class found by M Vishik, in particular, includes BMO. There is another problem closely related to the weak solutions of the 2D Euler equations, called the vortex patch problem. The main question was whether any singularity develops on the boundary of a patch $\Omega(t) = \{X(\alpha, t) \mid \alpha \in \Omega_0\}$, where $X(\alpha, t)$ is the particle trajectory mapping generated by a weak solution $v(x, t)$ evolving from the initial data $\omega_0(x) = \chi_{\Omega_0}(x)$, the characteristic function of a set $\Omega_0$ with smooth boundary. The problem itself is well defined, due to the work of V Yudovich (1963), and there exist unique particle trajectories associated with such weak solutions. The problem was settled by J-Y Chemin in 1991. He proved the global-in-time preservation of the $C^{1,\gamma}$ regularity of the boundary $\partial\Omega(t)$, contrary to the previous numerical experiments. The proof of this result was later simplified by A Bertozzi and P Constantin in 1993.

Another interesting problem related to the weak solutions of the Euler equations (2D or 3D) is whether or not the energy is preserved for the weak solutions, namely whether there is any "intrinsic dissipation" in the singular solutions of ideal fluids. In 1949, L Onsager conjectured that if the weak solution of the 3D Euler equations belongs to a certain Hölder space, then the energy is conserved. This conjecture, in the setting of a Besov space, was proved by P Constantin, W E and E S Titi in 1994. The question of possible dissipation of energy for weak solutions was further studied by J Duchon and R Robert in 2000. Later, in 2003, the present author considered the problem of helicity conservation for the weak solutions of 3D Euler flows, which is related to the question of crossing/reconnection of vortex tubes for weak solutions, and showed that for a large class of weak solutions in certain Besov spaces the helicity is preserved.


See also: Compressible Flows: Mathematical Theory;
Evolution Equations: Linear and Nonlinear; Fluid
Mechanics: Numerical Methods; Interfaces and
Multicomponent Fluids; Intermittency in Turbulence;
Inviscid Flows; Non-Newtonian Fluids; Partial Differential
Stochastic Hydrodynamics; Turbulence Theories;
Viscous Incompressible Fluids: Mathematical Theory;
Vortex Dynamics.


Introduction


If, in a problem of quantization, state spaces with indefinite inner product are used instead of Hilbert spaces, one speaks of quantization with indefinite metric. The main domain of application is the quantization of gauge fields, like the electromagnetic vector potential $A_\mu(x)$ or Yang-Mills fields in quantum chromodynamics (QCD) and the standard model.

The conceptual problem with the indefinite metric is the occurrence of senseless negative probabilities in the formalism. Such negative probabilities, however, only arise in expectation values of fields that are not gauge invariant and hence do not correspond to observable quantities. Equivalently, the inner product of vectors generated by application of such fields to the vacuum vector with itself can be negative or null. In order to extract the observable content of an indefinite-metric quantum theory, a subsidiary condition is needed to single out the physical subspace. Restricted to this subspace, the inner product is positive semidefinite. This subsidiary condition can be seen as the implementation of a gauge, as, for example, the Lorentz gauge $\partial_\mu A^\mu(x) = 0$ in quantum electrodynamics (QED). This procedure is also known under the name Gupta-Bleuler formalism.

The use of indefinite metric in the quantization of gauge theories like QED can be avoided entirely. This is called quantization in a physical gauge. The problem with such gauges is that they are not Lorentz invariant and that the vector potential $A_\mu(x)$ is not a local field. An example is the Coulomb gauge defined by $A_0(x) = 0$ and $\partial^i A_i(x) = 0$ in QED. Furthermore, Dirac spinor fields $\psi(x)$ in such gauges do not anticommute when localized in spacelike separated regions. The Dirac fields therefore are also nonlocal quantities. Although this is not in conflict with special relativity, as Dirac spinors and the vector potential are not gauge invariant and hence are unobservable, it leads to severe technical problems in the formulation of interacting theories. In particular, the theory of renormalization heavily uses both locality and invariance. Therefore, the Gupta-Bleuler formalism generally is the preferred quantization procedure for a gauge theory.

That a local and invariant quantization is not 
possible using a (positive-metric) Hilbert space has 
been proved by F Strocchi in a series of articles 
published between 1967 and 1970. If one wants to 
preserve locality and/or invariance of the quantized 
field theory, it is thus strictly necessary to give up 
the positivity of the state space. 

A short digression into the early history of the idea might be of interest. It dates back to 1941, when the use of indefinite metric in the quantization of relativistic equations was proposed by Paul Dirac in a lecture at the London Royal Society. The negative probabilities for the bosonic vector potential were thought to be connected with the problem of negative-energy solutions of relativistic equations as a type of surrogate of the "Dirac sea" in the quantization of fermions. Furthermore, Dirac proposed that negative-energy solutions and negative probabilities would jointly lead to the cancellation of divergences in QED. The latter idea was taken up by W Heisenberg in his lectures on the theory of elementary particles held in Munich in 1961, but the generally accepted solution to the problem of ultraviolet divergences was achieved without recourse to Dirac's original motivation. In 1950 the consistent quantization of the vector potential in the Lorentz gauge was formulated by SN Gupta and K Bleuler, eliminating the use of negative-energy solutions. Since then the indefinite metric has become a building block of the standard theory of quantized gauge fields.


No-Go Theorems 


The strict necessity of the Gupta-Bleuler procedure for the local or covariant quantization of gauge fields has been demonstrated by F Strocchi in the form of no-go theorems for positive metric. Here we review their content for the case of the electromagnetic field. Related statements can be obtained for nonabelian gauge theories. The main problem lies in the fact that standard assumptions on the quantization of relativistic fields are in conflict with Maxwell equations that should hold as operator identities in a positive-metric theory containing no unobservable states. Let

$$F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x)$$   [1]

be the quantized electromagnetic field strength tensor. Classically, the existence of $A_\mu(x)$ is guaranteed by the first set of Maxwell equations $\varepsilon^{\mu\nu\rho\sigma}\partial_\nu F_{\rho\sigma}(x) = 0$. Here (and henceforth) indices are raised and lowered with respect to the Minkowski metric $g_{\mu\nu}$, and $\varepsilon^{\mu\nu\rho\sigma}$ is the completely antisymmetric tensor on $\mathbb{R}^4$. Furthermore, we apply Einstein's convention on summation over repeated upper and lower indices. Standard assumptions from axiomatic quantum field theory are:


1. The field strength tensor $F_{\mu\nu}(x)$ is an operator-valued distribution acting on a (dense core of a) Hilbert space $\mathcal{H}$ with scalar product (.,.); in the indefinite-metric case, (.,.) only needs to be an inner product.

2. $F_{\mu\nu}(x)$ transforms covariantly, that is, there is a strongly continuous unitary (with respect to (.,.)) representation $U$ of the orthochronous, proper Poincaré group on $\mathcal{H}$ such that for a translation $a \in \mathbb{R}^4$ combined with a restricted Lorentz transformation $\Lambda$, one has

$$U(a, \Lambda)F_{\mu\nu}(x)U(a, \Lambda)^{-1} = (\Lambda^{-1})_\mu{}^\rho(\Lambda^{-1})_\nu{}^\sigma F_{\rho\sigma}(\Lambda x + a)$$   [2]

3. There exists a unique (up to multiplication with $\mathbb{C}$-numbers) translation-invariant vector $\Omega \in \mathcal{H}$ (the "vacuum"), that is, $U(a, 1)\Omega = \Omega$ $\forall a \in \mathbb{R}^4$.

4. The representation of the translations fulfills the spectral condition

$$\int_{\mathbb{R}^4}(\Phi, U(a, 1)\Psi)\,e^{ip\cdot a}\,da = 0$$   [3]

$\forall\Psi, \Phi \in \mathcal{H}$ if $p$ is not in the closed forward light cone $\bar{V}^+ = \{p \in \mathbb{R}^4 : p\cdot p \ge 0, p^0 \ge 0\}$. Here the dot is the Minkowski inner product.

So far the assumptions concerned only observable quantities. In the following, we also demand:

5. The vector potential $A_\mu(x)$ is realized as an operator-valued distribution on $\mathcal{H}$ and transforms covariantly under translations

$$U(a, 1)A_\mu(x)U(a, 1)^{-1} = A_\mu(x + a)$$   [4]
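
The support condition in assumption (4) refers to the closed forward light cone. A minimal sketch of the membership test is given below, assuming the metric signature $(+,-,-,-)$ and momenta stored as 4-tuples $(p^0, p^1, p^2, p^3)$; the function names are illustrative and not part of the text.

```python
def minkowski_dot(p, q):
    """Minkowski inner product with assumed signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def in_closed_forward_cone(p):
    """True if p.p >= 0 and p^0 >= 0, i.e. p lies in the closed forward light cone."""
    return minkowski_dot(p, p) >= 0 and p[0] >= 0


timelike_future = (1.0, 0.0, 0.0, 0.0)   # inside the cone
lightlike_future = (1.0, 1.0, 0.0, 0.0)  # on the boundary, still in the closed cone
spacelike = (1.0, 2.0, 0.0, 0.0)         # outside: p.p < 0
timelike_past = (-1.0, 0.0, 0.0, 0.0)    # outside: p^0 < 0
```

The spectral condition states that the Fourier transform in [3] vanishes at every momentum for which this test fails.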


The assumptions on the nature of the vector potential so far are rather weak. Strocchi's no-go theorems show that one cannot add further desirable properties such as Lorentz covariance and/or locality without getting into conflict with the Maxwell equations:


Theorem 1 Suppose that the above assumptions (1)-(3) and (5) hold. If Maxwell's equations in the absence of charges,

$$\varepsilon^{\mu\nu\rho\sigma}\partial_\nu F_{\rho\sigma}(x) = 0, \quad \partial^\mu F_{\mu\nu}(x) = 0$$   [5]

are valid as operator identities on $\mathcal{H}$ and the gauge potential transforms covariantly,

$$U(a, \Lambda)A_\mu(x)U(a, \Lambda)^{-1} = (\Lambda^{-1})_\mu{}^\nu A_\nu(\Lambda x + a)$$   [6]

then the two-point function of the electromagnetic field tensor vanishes identically:

$$(\Omega, F_{\mu\nu}(x)F_{\rho\sigma}(y)\Omega) = 0 \quad \forall x, y \in \mathbb{R}^4$$   [7]


To gain a better understanding of where the difficulties in the quantization of the Maxwell equations arise from, here is a rough sketch of the proof: Maxwell equations and covariance imply that $f_{\mu\nu\rho}(x - y) = (\Omega, A_\mu(x)F_{\nu\rho}(y)\Omega)$ fulfills $\partial^\alpha\partial_\alpha f_{\mu\nu\rho}(x) = 0$ and hence its Fourier transform has support in the union of the forward and backward light cone. The Fourier transform thus can be split into a positive- and a negative-frequency part, and $f_{\mu\nu\rho} = f^+_{\mu\nu\rho} + f^-_{\mu\nu\rho}$ accordingly. By the general analysis of axiomatic field theory (see Axiomatic Quantum Field Theory), the functions $f^\pm_{\mu\nu\rho}$ are boundary values of complex analytic functions on certain tubular domains $\mathcal{T}^\pm$ transforming covariantly under a certain representation of the complex Lorentz group. By a theorem of Araki and Hepp giving a general representation of such functions and using the antisymmetry of the field tensor, the following formula can be derived:

$$f^\pm_{\mu\nu\rho}(z) = (g_{\mu\rho}\partial_\nu - g_{\mu\nu}\partial_\rho)f^\pm(z) + \varepsilon_{\mu\nu\rho\sigma}\partial^\sigma h^\pm(z), \quad z \in \mathcal{T}^\pm$$   [8]

with $f^\pm, h^\pm$ invariant under complex Lorentz transformations. Taking boundary values in $\mathcal{T}^\pm$, one obtains $f_{\mu\nu\rho} = (g_{\mu\rho}\partial_\nu - g_{\mu\nu}\partial_\rho)f + \varepsilon_{\mu\nu\rho\sigma}\partial^\sigma h$, with $f = \bar{f}^+ + \bar{f}^-$ and $h = \bar{h}^+ + \bar{h}^-$, where the bar stands for the distributional boundary value. Maxwell's equations imply $\partial^\nu f_{\mu\nu\rho} = (\partial^\sigma\partial_\sigma g_{\mu\rho} - \partial_\mu\partial_\rho)f = 0$ and, up to a constant factor, $\varepsilon_\alpha{}^{\beta\nu\rho}\partial_\beta f_{\mu\nu\rho} = (\partial^\sigma\partial_\sigma g_{\alpha\mu} - \partial_\alpha\partial_\mu)h = 0$. The only Lorentz-invariant solutions to these equations are constant, which implies the statement of Theorem 1.

The second no-go theorem eliminates the assumption that the vector potential $A_\mu(x)$ is covariant; however, a local gauge is assumed. The result is the same as in Theorem 1:


Theorem 2 Suppose that the above assumptions (1)-(5) and Maxwell's equations hold as operator identities on $\mathcal{H}$. If, furthermore, the gauge is local, that is,

$$[A_\mu(x), A_\nu(y)] = 0 \quad \text{if } x - y \text{ is spacelike}$$   [9]

then the two-point function of the field strength tensor vanishes again as in Theorem 1.


Analyzing the interplay of the covariance properties of $F_{\mu\nu}(x)$ with the locality of $A_\mu(x)$, Strocchi was able to show that the function $f_{\mu\nu\rho}(x - y)$ must have the same covariance properties as in Theorem 1, which implies the assertion of Theorem 2.

The first two no-go theorems deal with the free electromagnetic field that is not coupled to charge-carrying fields. This is, of course, already a real obstruction also for an interacting theory, since, by the LSZ formalism, one expects the asymptotic incoming and outgoing fields $A_\mu^{\mathrm{in/out}}(x)$, $F_{\mu\nu}^{\mathrm{in/out}}(x)$ to be free. In fact, it has been proved by D Buchholz that, in the positive-metric case, such asymptotic fields can always be constructed. If one assumes a local and covariant gauge and positivity, the vanishing of the two-point function would also imply that the field $F_{\mu\nu}(x) = 0$ identically by the Reeh-Schlieder theorem.

The next no-go theorem shows that the problems connected to the quantization of the Maxwell equations are not restricted to the free electromagnetic field. Let us assume that the second set of Maxwell equations is given by

$$\partial^\mu F_{\mu\nu}(x) = j_\nu(x)$$   [10]

where $j_\nu$ is the leptonic current, that is, $j_\nu(x) = e\,{:}\bar\psi(x)\gamma_\nu\psi(x){:}$ in the case of QED, where $\psi$ is the quantized Dirac field associated with electrons and positrons. Here $:\cdot:$ stands for Wick ordering and $\gamma_\nu$ are the Dirac matrices, $\bar\psi = \psi^*\gamma^0$. The conservation of the current, $\partial^\nu j_\nu(x) = 0$, implies that the current charge

$$Q_c = \lim_{R\to\infty}\int_{\mathbb{R}^4}\alpha(x^0)\chi(\mathbf{x}/R)\,j_0(x^0, \mathbf{x})\,dx^0\,d\mathbf{x}$$   [11]

is a constant of motion, where $\alpha$ and $\chi$ are compactly supported infinitely differentiable functions with $\int_{\mathbb{R}}\alpha(x^0)\,dx^0 = 1$ and $\chi(\mathbf{x}) = 1$ for $|\mathbf{x}| < 1$. Now, an alternative definition of charge, called gauge charge (it generates the global U(1)-gauge transformation), is given by

$$Q_G\Omega = 0, \quad [Q_G, A_\mu(x)] = 0 \quad \text{and} \quad [Q_G, \psi(x)] = -e\psi(x)$$   [12]

A third formulation of charge, the Maxwell charge $Q_M$, can also be given by replacing $j_0(x)$ in [11] by $\partial^\mu F_{\mu 0}(x)$. Obviously, if Maxwell equations hold as operator identities, $Q_c = Q_M$. On observable states, all charges $Q_M$, $Q_c$, and $Q_G$ ought to coincide. Strocchi's third theorem shows that this cannot be achieved within a local gauge:


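Theorem 3 If the Maxwell equations [10] hold and the Dirac field $\psi(x)$ is local with respect to the electromagnetic field tensor $F_{\mu\nu}(x)$, that is,

$$[F_{\mu\nu}(x), \psi(y)] = 0 \quad \text{if } x - y \text{ is spacelike}$$   [13]

then $[Q_M, \psi(x)] = 0$, hence $Q_c = Q_M \ne Q_G$.

The proof is a simple consequence of the observation that $j_0(x) = \partial^\nu F_{\nu 0}(x) = \partial^i F_{i0}(x)$ is a three-divergence, as $F_{00}(x) = 0$ by antisymmetry of $F_{\mu\nu}(x)$. Hence,

$$[Q_c, \psi(y)] = \lim_{R\to\infty}\int_{\mathbb{R}^4}[j_0(x), \psi(y)]\,\alpha(x^0)\chi(\mathbf{x}/R)\,dx^0\,d\mathbf{x}$$
$$= -\lim_{R\to\infty}\int_{\mathbb{R}^4}[F_{i0}(x), \psi(y)]\,\alpha(x^0)\,\partial^i\chi(\mathbf{x}/R)\,dx^0\,d\mathbf{x} = 0$$   [14]

since, for $R$ sufficiently large, the support of $\alpha(x^0)\partial^i\chi(\mathbf{x}/R)$ becomes spacelike separated from $y$.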

It should be noted that none of the proofs of the above theorems relies on the definiteness of the inner product. The main point of the indefinite-metric formalism, therefore, is rather to give up Maxwell equations as operator identities. In the usual positive-metric formalism, where all states in $\mathcal{H}$ are physical states, this would not be legitimate. But in indefinite metrics, many states are unobservable, in particular those with negative "norm" $(\Psi, \Psi) < 0$. On such states we can neglect the Maxwell equations.


Axiomatic Framework 


The formalism of axiomatic quantum field theory (see Axiomatic Quantum Field Theory) requires a revision in order to cover the case of gauge fields. The necessary adaptations have been elaborated by G Morchio and F Strocchi, but also earlier work of E Scheibe and J Yngvason played a significant role in this development.

Let $\phi(x)$ be a $V'$-valued quantum field, where $V$ is a finite-dimensional $\mathbb{C}$-vector space with involution $*$. The prime stands for the (topological) dual. For the case of QED, $V$ is eight dimensional, containing four dimensions for the vector potential $A_\mu(x)$ and another four for the Dirac spinors $\psi(x)$, $\bar\psi(x)$.

Such a quantum field can be reconstructed from its vacuum expectation values (Wightman functions) as follows: let $S_1 = S(\mathbb{R}^4, V)$ be the space of rapidly decreasing functions $f : \mathbb{R}^4 \to V$ endowed with the Schwartz topology. Let the Borchers algebra $\underline{S}$ be the free, unital, involutive tensor algebra over $S_1$, that is, $\underline{S} = \bigoplus_{n\ge 0} S_1^{\otimes n}$ (with $S_1^{\otimes 0} = \mathbb{C}$), with the multiplication induced by the tensor product and involution $(f_1 \otimes\cdots\otimes f_n)^* = f_n^* \otimes\cdots\otimes f_1^*$. $\underline{S}$ is endowed with the direct-sum topology. One can show that any linear, normalized, continuous functional $W : \underline{S} \to \mathbb{C}$, $W(1) = 1$, is determined by its restrictions $W_n$ to $S_1^{\otimes n}$. By the Schwartz kernel theorem, $W_n \in S'(\mathbb{R}^{4n}, V^{\otimes n})$. Conversely, any such sequence of Wightman distributions $W_n$ determines a $W$.

Given a Hermitian Wightman functional $W$, that is, $W(f^*) = \overline{W(f)}$ $\forall f \in \underline{S}$, the set $L_W = \{f \in \underline{S} : W(h^* \otimes f) = 0\ \forall h \in \underline{S}\}$ forms a left ideal and the inner product $W(f^* \otimes h)$ induces a nondegenerate inner product (.,.) on $\mathcal{H}_0 = \underline{S}/L_W$. Furthermore, Borchers' algebra $\underline{S}$ acts from the left on $\mathcal{H}_0$. The quantum field $\phi(x)$ is defined as the restriction of this canonical representation to the space $S_1 \subset \underline{S}$ according to $\phi(f) = \int_{\mathbb{R}^4}\phi^a(x)f_a(x)\,dx$, where the index $a$ runs over a basis of $V$.

If the Wightman functional $W$ has further properties from axiomatic QFT (see Axiomatic Quantum Field Theory) like invariance with respect to a given representation of the Lorentz group on $V$, translation invariance, locality, and the spectral property, the quantum field $\phi(x)$ fulfills the related requirements in analogy with the items (1)-(5) listed in the previous section for the case of the vector potential $A_\mu(x)$. The Wightman distributions $W_n$, as in the positive-metric case, are related to the vacuum expectation values of the theory by

$$W_n(x_1, \ldots, x_n) = (\Omega, \phi(x_1)\cdots\phi(x_n)\Omega)$$   [15]

where $\Omega$ is the equivalence class of 1 in $\mathcal{H}_0$.

The state space $\mathcal{H}_0$ produced by the Gelfand-Naimark-Segal (GNS) construction for inner-product spaces might be too small to contain all states of physical interest. For example, in the QED case, it does not contain charged states (cf. Theorem 3). Depending on the physical problem, one might also be interested in constructing coherent or scattering states and translation-invariant states apart from the vacuum. Such states appear in problems related to symmetry breaking and confinement (the so-called $\theta$-vacua) or in some problems of conformal QFT (see Boundary Conformal Field Theory) in two dimensions. It has, therefore, become the standard point of view that one needs to make a suitable closure of $\mathcal{H}_0$ such that this closure includes the states of interest (for an alternative point of view, see the last paragraph of the following section).

Typically, larger closures are favorable, as they contain more states. One therefore focuses on maximal Hilbert closures of $\mathcal{H}_0$. A Hilbert topology $\tau$ is induced by an auxiliary scalar product $\langle.,.\rangle$ on $\mathcal{H}_0$ (written with angle brackets here to distinguish it from the indefinite inner product (.,.)). It is admissible if it dominates the indefinite inner product, $|(\Phi, \Psi)|^2 \le C\langle\Psi, \Psi\rangle\langle\Phi, \Phi\rangle$ $\forall\Psi, \Phi \in \mathcal{H}_0$ for some $C > 0$. This guarantees that the inner product extends to the Hilbert space closure $\mathcal{H}$ of $\mathcal{H}_0$ with respect to $\tau$. Furthermore, there exists a self-adjoint contraction $\eta$ on $\mathcal{H}$ such that $\langle\Psi, \eta\Phi\rangle = (\Psi, \Phi)$ $\forall\Psi, \Phi \in \mathcal{H}$. A Hilbert topology $\tau$ is maximal if there is no admissible Hilbert topology $\tau'$ that is strictly weaker than $\tau$. The classification of maximal admissible Hilbert topologies in terms of the metric operator $\eta$ is given by the following theorem:


Theorem 4 A Hilbert topology $\tau$ on $\mathcal{H}_0$ generated by a scalar product $\langle.,.\rangle$ is maximal if and only if the metric operator $\eta$ has a continuous inverse $\eta^{-1}$ on the Hilbert space closure $\mathcal{H}$ of $\mathcal{H}_0$. In that case, one can replace $\langle.,.\rangle$ by the scalar product $\langle\Psi, \Phi\rangle_1 = \langle\Psi, |\eta|\Phi\rangle$ without changing the topology $\tau$. The new metric operator $\eta_1$ then fulfills $\eta_1^2 = 1$.


For a proof of the first statement, see the original work of Morchio and Strocchi (1980). One can easily check that $\eta_1 = \eta|\eta^{-1}|$, which implies the second assertion of the theorem. A Hilbert space $(\mathcal{H}, (.,.))$ with an indefinite inner product induced by a metric operator $\eta$ with $\eta^2 = 1$ is called a Krein space. For an extensive study of Krein spaces, see the monograph by Azizov and Iokhvidov (1989).
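
The passage from a metric operator $\eta$ to the normalized one $\eta_1 = \eta|\eta^{-1}|$ can be illustrated in finite dimensions. The sketch below assumes a two-dimensional real space with the Euclidean scalar product as auxiliary product and an invertible symmetric matrix `eta` chosen only for the demo; all helper names are hypothetical. It uses the closed-form square root of a symmetric positive-definite $2\times 2$ matrix to obtain $|\eta| = (\eta^2)^{1/2}$.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sqrt_spd_2x2(a):
    """Square root of a symmetric positive-definite 2x2 matrix (closed form)."""
    s = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])  # sqrt(det a)
    t = math.sqrt(a[0][0] + a[1][1] + 2.0 * s)            # sqrt(tr a + 2 sqrt(det a))
    return [[(a[0][0] + s) / t, a[0][1] / t],
            [a[1][0] / t, (a[1][1] + s) / t]]


def inverse_2x2(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


eta = [[1.0, 2.0], [2.0, 1.0]]             # indefinite metric: eigenvalues 3 and -1
abs_eta = sqrt_spd_2x2(matmul(eta, eta))   # |eta| = sqrt(eta^2)
eta1 = matmul(eta, inverse_2x2(abs_eta))   # eta_1 = eta |eta|^{-1}
eta1_squared = matmul(eta1, eta1)          # identity matrix up to rounding
```

For this choice $|\eta|$ has rows $(2, 1)$ and $(1, 2)$, and $\eta_1$ is the matrix exchanging the two basis vectors, so the Krein relation $\eta_1^2 = 1$ holds.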

Furthermore, one can show that given a nonmax- 
imal admissible Hilbert space topology 7 induced by 
some (.,.), one obtains a maximal admissible Hilbert 
topology as follows: given the metric operator n, we 
define a scalar product (W,®), — (V, (1 — Po)®) on 
H with Po the null space projector of 7. Obviously, 
this scalar product is still admissible and it leads to a 
new metric operator 7; and a new closure H, of Ho. 
Furthermore, it is easy to show that the scalar 
product (Y, 9), — (V, |m |®); still induces an admis- 
sible Hilbert topology which is also maximal, as 
m=m\n,'| clearly fulfills the Krein relation 
"n -— lt. 

The question of the existence of a Krein space 
closure of Ho, therefore, reduces to the question of 
the existence of an admissible Hilbert topology on 
o. The following condition on the Wightman 
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functions W, replaces the positivity axiom in the 
case of indefinite-metric quantum fields: 


Theorem 5 Given a Wightman functional $W$, there
exists an admissible Hilbert space topology $\tau$ on
$\mathcal{H}_0 = \mathcal{S}/L_W$ if and only if there exists a family of
Hilbert seminorms $p_n$ on $\mathcal{S}_n$ such that
$|W_{n+m}(f \otimes h)| \le p_n(f)\,p_m(h)$, $\forall n, m \in \mathbb{N}_0$, $f \in \mathcal{S}_n$, $h \in \mathcal{S}_m$.


In some cases, covering also examples with 
nontrivial scattering in arbitrary dimension, the 
condition of Theorem 5 can be checked explicitly 
(see Non-trivial Models of Quantum Fields with 
Indefinite Metric). 

It should be mentioned that different choices of the
Hilbert seminorms $p_n$ lead to potentially different
maximal Hilbert space closures (Hoffmann
1998, Constantinescu and Gheondea 2001). In fact,
often the topology is not even Poincaré invariant and 
hence the states that can be approximated with local 
states depend on a chosen inertial frame. This fact, 
for the case of QED, has been interpreted in terms of 
physical gauges. 

Many results from axiomatic field theory (see
Axiomatic Quantum Field Theory) with positive
metric also hold in the case of QFT with indefinite
metric, like the PCT and the Reeh-Schlieder
theorems, the irreducibility of the field algebra (for
massive theories) and the Bisognano-Wichmann
theorem (see Algebraic Approach to Quantum Field
Theory). Other classical results, like the Haag-Ruelle
scattering theory and the spin and statistics
theorem, definitely do not hold, as has been proved
by counterexamples. This is, however, far from
being a disadvantage, as, for example, it permits the
introduction of various gauges in the scattering
theory of the vector potential $A_\mu(x)$ and fermionic
scalar "ghost" fields in the BRST quantization (see
BRST Quantization) formalism.


Gupta-Bleuler Gauge Procedure 


Here the Gupta-Bleuler gauge procedure is presented
in a slightly generalized form following
Steinmann's monograph. Classically, the equations
of motion for the vector potential $A_\mu(x)$,

$$-\partial^\nu\partial_\nu A_\mu(x) + \lambda\,\partial_\mu\partial^\nu A_\nu(x) = j_\mu(x) \qquad [16]$$

together with the Lorentz gauge condition
$B(x) = \partial_\mu A^\mu(x) = 0$, imply the Maxwell equations
[10]. Here, $\lambda \in \mathbb{R}$ plays the role of a gauge
parameter. As seen above, the so-called
pseudo-Maxwell equations [16] and the
Lorentz gauge condition $B(x) = 0$ cannot both hold
as operator identities. The idea for the quantization
of the theory is therefore to give up the Lorentz
gauge condition as an operator identity on the entire
state space $\mathcal{H}$.

Suppose one has constructed such a theory with
an indefinite inner product state space $\mathcal{H}$. Already for the
noninteracting theory, any invariant, spectral, local,
and covariant solution requires indefinite metric, cf.
the explicit formula [18] below. To complete the
Gupta-Bleuler program, one needs to find a subspace
of (equivalence classes of) physical states $\mathcal{H}'$ of
the inner-product space $\mathcal{H}$ such that the following
conditions hold:


1. the vacuum is a physical state, that is, $\Omega \in \mathcal{H}'$;

2. observable fields like $j_\mu(x)$ and $F_{\mu\nu}(x)$ map $\mathcal{H}'$ to
itself;

3. the inner product $(.,.)$ restricted to $\mathcal{H}'$ is positive
semidefinite;

4. observable fields map $\mathcal{H}''$, the set of null vectors
in $\mathcal{H}'$, to itself; and

5. the Maxwell equations hold on $\mathcal{H}'$ in the sense
$(\Psi, \partial^\nu F_{\mu\nu}(x)\Phi) = (\Psi, j_\mu(x)\Phi)$, $\forall \Psi, \Phi \in \mathcal{H}'$. [17]

Then one obtains $\mathcal{H}^{\mathrm{phys}}$ as the completion of the
quotient space $\mathcal{H}'/\mathcal{H}''$. The physical Hilbert space
$\mathcal{H}^{\mathrm{phys}}$ contains the vacuum $\Omega$ (1), observable fields
act on $\mathcal{H}^{\mathrm{phys}}$ (2) and (4), it is a Hilbert space (3),
and the Maxwell equations hold on it (5).


To see that such a construction is possible,
consider first the noninteracting case $j_\mu(x) = 0$, that is,
the limit case of vanishing electrical charge $e = 0$.
By taking the divergence of [16], one obtains
$(1 - \lambda)\,\partial^\nu\partial_\nu\,\partial^\mu A_\mu(x) = 0$. Excluding the Landau
gauge ($\lambda = 1$), this implies $(\partial^\nu\partial_\nu)^2 A_\mu(x) = 0$. The
most general solution for the two-point vacuum
expectation values that is in agreement with [16]
and the requirements of locality, translation invariance,
the spectral condition, uniqueness of the
vacuum, and the Lorentz covariance of $A_\mu(x)$ is then


$$(\Omega, A_\mu(x)A_\nu(y)\Omega) = (-g_{\mu\nu} + \rho\,\partial_\mu\partial_\nu)D^+(x - y) - \frac{\lambda}{1 - \lambda}\,\partial_\mu\partial_\nu E^+(x - y) \qquad [18]$$

where $D^+$ and $E^+$ are the inverse Fourier
transforms of $\theta(p^0)\delta(p^2)$ and $\theta(p^0)\delta'(p^2)$ respectively,
$p^2 = p \cdot p$, $\theta$ being the Heaviside function, $\delta$ the Dirac
measure on $\mathbb{R}$ of mass one in zero and $\delta'$ its
derivative. $\rho$ and $\lambda$ are gauge parameters; for
example, the Feynman gauge corresponds to
$\lambda = \rho = 0$. We have also omitted an overall factor
corresponding to a field strength normalization
(choice of the numerical value of $\hbar$; here $\hbar = 1$).


Using Wick's theorem and the GNS construction 
for inner-product spaces (cf. the preceding section), 
it is possible to realize a representation of the vector 
potential A,(x) as operator-valued distribution on 
some indefinite-metric state space H with Fock 
structure, for example, a Krein closure of the GNS 
space with $\Omega$ the GNS vacuum and $\mathcal{D} \subset \mathcal{H}$ the
canonical domain of definition. In the case of
Feynman gauge, the metric operator $\eta$ can be
obtained by a second quantization of the operator 
fa — 5 4 gufo on the one-particle space S4. 

In particular, the field $B(x)$ acts as an operator-valued
distribution on $\mathcal{H}$ and, by taking the
divergence of [16], it follows that $\partial^\nu\partial_\nu B(x) = 0$.
Thus, $B(x) = B^+(x) + B^-(x)$ can be decomposed into
a positive ("annihilation") and a negative ("creation")
frequency part $B^\pm(x)$. One obtains:


Theorem 6 The space $\mathcal{H}' = \{\Psi \in \mathcal{D} : B^+(x)\Psi = 0\}$
fulfills all requirements (1)-(5) of the Gupta-Bleuler
gauge procedure.


Condition (1) is obvious and (2) follows from the
fact that the fields $F_{\mu\nu}(x)$ and $B(x)$ commute, which
can be checked on the level of two-point functions
[18]. In the same spirit, one can also use [18] to
check (3) and (4) by explicit calculations on the
one-particle space and by showing that $\mathcal{H}'$ is the Fock space
over the one-particle states annihilated by $B^+(x)$.
Finally, by Hermiticity of $A_\mu(x)$, $B^+(x)^* = B^-(x)$ and
thus $(\Psi, B(x)\Phi) = (\Psi, B^+(x)\Phi) + (B^+(x)\Psi, \Phi) = 0$
for $\Psi, \Phi \in \mathcal{H}'$. As
the field $B(x)$ stands for the obstruction to the Maxwell
equations, this implies condition (5).

It should be noted that the physical state space
$\mathcal{H}^{\mathrm{phys}}$ does not depend on the gauge parameters $\lambda$, $\rho$
and that it is spanned by repeated application of the
field tensor $F_{\mu\nu}(x)$ to the vacuum.

By current conservation, the divergence of [16]
still yields $\partial^\nu\partial_\nu B(x) = 0$ in the interacting case
where $e \ne 0$. One can then choose the same gauge
condition as in Theorem 6 to define $\mathcal{H}'$, and
try to prove that this space fulfills all the
requirements of the Gupta-Bleuler procedure, for
example, in the sense of perturbation theory. Using
more advanced formulations such as BRST
quantization and Bogoliubov's local S-matrix formalism,
this program has been completed up to a
solution of the infrared problem (see Perturbative
Renormalization Theory and BRST).

A different procedure, motivated by the necessity of 
coincidence of all charges Oc, Oc, and Oy on the 
physical state space, has been elaborated by Steinmann. 
It deviates from the standard procedure in the sense that
the physical space $\mathcal{H}'$ is not included in $\mathcal{H}$; instead, $\mathcal{H}^{\mathrm{phys}}$ is
directly obtained from the GNS procedure after taking
certain limits of Wightman functions restricted to
certain gauge-invariant algebras constructed from the
Borchers algebra and a limiting procedure in a gauge
parameter. The Wightman functionals on these
gauge-invariant algebras are positive (in the sense of
perturbation theory). The limiting procedure, however, implies
that the physical states so obtained are singular (i.e.,
have diverging inner product) with respect to states in $\mathcal{H}$. Hence
the state spaces so defined, which correspond to going to
a physical gauge after solving the problem of a
perturbative construction of an indefinite-metric solution,
are not subspaces of $\mathcal{H}$.


See also: Algebraic Approach to Quantum Field Theory; 
Axiomatic Approach to Topological Quantum Field 
Theory; Axiomatic Quantum Field Theory; Boundary 
Conformal Field Theory; BRST Quantization; 
Perturbative Renormalization Theory and BRST; 
Quantum Fields with Indefinite Metric: Non-Trivial 
Models. 
Introduction


Let g be a Riemannian metric on a smooth compact
manifold M of dimension m. We assume for the
moment that the boundary of M is empty and
postpone until later a discussion of the more general
setting. If $x = (x_1, \ldots, x_m)$ is a local system of
coordinates on M, let

$$g_{ij} := g(\partial_{x_i}, \partial_{x_j})$$

give the components of the metric tensor. Let D be
an operator of Laplace type on a smooth vector
bundle V over M. Adopt the Einstein convention
and sum over repeated indices. Relative to a local
coordinate frame for V, D has the form

$$D = -\{g^{ij}\partial_{x_i}\partial_{x_j} + A^k\partial_{x_k} + B\}$$

where $A^k$ and $B$ are endomorphisms (i.e., matrices)
of V.

We assume that V is equipped with a positive-definite
inner product and that D is self-adjoint.
There is then a complete orthonormal basis $\{\phi_i\}$ for
$L^2(V)$, where $\phi_i \in C^\infty(V)$ and $D\phi_i = \lambda_i\phi_i$. The
collection $\{\phi_i, \lambda_i\}$ is called a discrete spectral resolution of
D. For example, if $D = -\partial_\theta^2$ on the circle, then the
discrete spectral resolution is

$$\{e^{\sqrt{-1}\,n\theta},\ n^2\}_{n \in \mathbb{Z}}$$

If we order the eigenvalues $\lambda_1 \le \lambda_2 \le \cdots$ and repeat
each eigenvalue according to multiplicity, then there
is the following estimate due to Weyl:

$$\lambda_n \sim n^{2/m} \quad \text{as } n \to \infty$$
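The Weyl estimate can be checked by hand on the circle, where $m = 1$ and the eigenvalues of $-\partial_\theta^2$ are $n^2$ for $n \in \mathbb{Z}$. The sketch below is only an illustration; the function name is an assumption introduced here. It lists the eigenvalues with multiplicity and shows that $\lambda_n / n^{2/m} = \lambda_n / n^2$ approaches a constant, so that $\lambda_n$ grows like $n^{2/m}$ up to a multiplicative constant.

```python
def circle_eigenvalues(count):
    """First `count` eigenvalues of -d^2/dtheta^2 on the circle, with multiplicity.

    The eigenvalue 0 is simple; every n^2 with n >= 1 appears twice (n and -n).
    """
    eigs = [0]
    n = 1
    while len(eigs) < count:
        eigs.extend([n * n, n * n])
        n += 1
    return eigs[:count]

eigs = circle_eigenvalues(2001)
for n in (11, 101, 2001):
    # lambda_n is eigs[n - 1]; the ratio lambda_n / n^2 tends to 1/4
    print(n, eigs[n - 1] / n**2)
```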
We now suppose given a pair of vector bundles $V_1$
and $V_2$ over M and a kth-order elliptic partial
differential operator

$$A : C^\infty(V_1) \to C^\infty(V_2)$$

Locally, we decompose

$$A = \sum_{|I| \le k} a_I\,\partial_x^I$$

where $I = (i_1, \ldots, i_m)$ is a multi-index and where

$$\partial_x^I = (\partial_{x_1})^{i_1} \cdots (\partial_{x_m})^{i_m}$$

The $a_I$ are linear maps from $V_1$ to $V_2$. The leading
symbol of A is then defined by setting

$$\sigma_L(A)(x, \xi) = (\sqrt{-1})^k \sum_{|I| = k} a_I(x)\,\xi^I$$

where $\xi^I = \xi_1^{i_1} \cdots \xi_m^{i_m}$ and $\xi = (\xi_1, \ldots, \xi_m)$
are local fiber coordinates on the cotangent bundle.
The leading symbol is an invariantly defined map

$$\sigma_L : T^*M \to \operatorname{End}(V_1, V_2)$$


For example, if $V_1 = V_2$ and if D is an operator of
Laplace type, then the leading symbol is given by the
metric tensor, that is,

$$\sigma_L(D) = g^{ij}\xi_i\xi_j\,\mathrm{Id} = |\xi|^2\,\mathrm{Id}$$

If d is exterior differentiation, then the leading
symbol is given by exterior multiplication, that is,

$$\sigma_L(d)(\xi)\omega = \sqrt{-1}\,\xi \wedge \omega$$


The operator A is said to be elliptic if $\sigma_L(A)$ is an
isomorphism from $V_1$ to $V_2$ for any $\xi \ne 0$. If A is an
elliptic partial differential operator, then

$$\operatorname{index}(A) := \dim\ker(A) - \dim\operatorname{coker}(A) = \dim\ker(A^*A) - \dim\ker(AA^*)$$

is well defined. As the index vanishes if m is odd, we
assume for the most part that m is even.

If $A_\varepsilon$ is a smooth one-parameter family of such
operators, then $\operatorname{index}(A_\varepsilon)$ is independent of $\varepsilon$. The
index depends only on the homotopy class of the
leading symbol of A within the class of invertible
symbols; it does not depend on the underlying
metric of the manifold and it does not depend on
the fiber metrics chosen for $V_1$ and $V_2$.

The Atiyah-Singer index theorem expresses the
index as the integral of suitably chosen polynomials
in the curvature tensor for the classical elliptic
complexes and, more generally, in terms of
cohomological information for general elliptic
complexes. Further details appear later in the article.

The primary focus here is on the complexes which
are of Dirac type, that is, complexes where A is a
first-order partial differential operator and where
the associated second-order operators $D_1 := A^*A$ and
$D_2 := AA^*$ are of Laplace type.

Here is a brief outline of this article. The classical 
elliptic complexes (de Rham, signature, spin, 
Dolbeault, Yang-Mills) are discussed first. Next 
the characteristic classes are introduced, followed by 
the relevant formula for the index of the classical 
elliptic complexes, manifolds with boundary, and 
the equivariant index. Index theory is an enormous 
topic and here only classical features are emphasized 
as a complete treatment is beyond the scope of a 
short expository note such as this one. As some 
guide to various applications in mathematical 
physics, the reader is referred to the Further Reading 
section. 


The Classical Elliptic Complexes 
The de Rham Complex 


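Let $\Lambda^pM$ be the bundle of smooth p forms over M
and let

$$d : C^\infty(\Lambda^pM) \to C^\infty(\Lambda^{p+1}M)$$
and
$$\delta : C^\infty(\Lambda^pM) \to C^\infty(\Lambda^{p-1}M)$$

be the exterior derivative and dually the interior
derivative, respectively. We set

$$\Delta := (d + \delta)^2 \quad \text{on } C^\infty(\Lambda M)$$

and then decompose $\Delta = \oplus_p \Delta^p$, where $\Delta^p$ is an
operator of Laplace type on $C^\infty(\Lambda^pM)$.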

We have $d^2 = 0$. The de Rham cohomology
groups are given by taking the quotient of the closed
forms by the exact forms:

$$H^p(M; \mathbb{R}) := \frac{\ker(d : C^\infty(\Lambda^pM) \to C^\infty(\Lambda^{p+1}M))}{\operatorname{image}(d : C^\infty(\Lambda^{p-1}M) \to C^\infty(\Lambda^pM))}$$

The Hodge-de Rham theorem identifies $H^p(M; \mathbb{R})$
with the kernel of the Laplacian

$$\ker(\Delta^p) = H^p(M; \mathbb{R})$$

and with the topological cohomology groups.

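If $\xi$ is a cotangent vector, let $e(\xi) : \omega \to \xi \wedge \omega$ be
exterior multiplication. Let $i(\xi)$ be the dual
operator, interior multiplication. If $\{e_i\}$ is a local
orthonormal frame for TM with dual coframe $\{e^i\}$, let
$e^I = e^{i_1} \wedge \cdots \wedge e^{i_p}$,
where $I = (1 \le i_1 < \cdots < i_p \le m)$. Then we have

$$e(e^1)e^I = \begin{cases} e^1 \wedge e^{i_1} \wedge \cdots \wedge e^{i_p} & \text{if } i_1 > 1 \\ 0 & \text{if } i_1 = 1 \end{cases}$$
$$i(e^1)e^I = \begin{cases} e^{i_2} \wedge \cdots \wedge e^{i_p} & \text{if } i_1 = 1 \\ 0 & \text{if } i_1 > 1 \end{cases}$$

Define a Clifford module structure on $\Lambda M$ by
$$\gamma(\xi) := e(\xi) - i(\xi)$$
If $\{e_i\}$ is a local orthonormal basis for TM, then
$$\gamma(e^i)\gamma(e^j) + \gamma(e^j)\gamma(e^i) = -2\delta_{ij}\,\mathrm{Id}$$

so the usual Clifford commutation rules are satisfied.
Let $\nabla$ be the Levi-Civita connection on M. We may
then expand

$$d = e(e^i)\nabla_{e_i}, \qquad \delta = -i(e^i)\nabla_{e_i}, \qquad d + \delta = \gamma(e^i)\nabla_{e_i}$$
The de Rham complex is then defined by taking
$$\Lambda^{\mathrm{even}}M := \oplus_k \Lambda^{2k}M, \qquad \Lambda^{\mathrm{odd}}M := \oplus_k \Lambda^{2k+1}M$$
$$(d + \delta) : C^\infty(\Lambda^{\mathrm{even}}M) \to C^\infty(\Lambda^{\mathrm{odd}}M)$$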

The Signature Complex 


The signature complex arises from a different
decomposition of the exterior algebra. Let Clif M be the
Clifford algebra of $T^*M$; this is the universal unital
algebra generated by $T^*M$ subject to the Clifford
commutation relations given above:

$$\xi_1 * \xi_2 + \xi_2 * \xi_1 = -2g(\xi_1, \xi_2) \cdot \mathrm{Id}$$
We suppose M is orientable and let
$$\mathrm{orn} = e^1 * \cdots * e^m \in \operatorname{Clif} M$$

be the orientation class. The map $\xi \to \gamma(\xi)$ extends
to a unital algebra homomorphism

$$\gamma : \operatorname{Clif} M \to \operatorname{End}(\Lambda M)$$

$\gamma(\mathrm{orn})$ defines an endomorphism of $\Lambda M$ which is,
modulo suitable sign conventions, the Hodge $*$
operator. If $m = 2k$ is even, then

$$(d + \delta)\gamma(\mathrm{orn}) = -\gamma(\mathrm{orn})(d + \delta)$$
Set
$$\Theta := (\sqrt{-1})^k\gamma(\mathrm{orn})$$
As $\Theta^2 = \mathrm{Id}$, we can decompose

$$\Lambda M \otimes \mathbb{C} = \Lambda^+M \oplus \Lambda^-M$$
where $\Lambda^\pm M$ are the $\pm 1$ eigenspaces of $\Theta$. The
signature complex is then given by

$$(d + \delta) : C^\infty(\Lambda^+M) \to C^\infty(\Lambda^-M)$$


Twisted Signature Complex 


Let V be an auxiliary complex vector bundle over
M which is equipped with a unitary connection $\nabla^V$.
We use the connection $\nabla^V$ on V and the Levi-Civita
connection on TM to covariantly differentiate
tensors of all types. The twisted signature complex
is defined by setting

$$(d + \delta)_V := (\gamma(e^i) \otimes \mathrm{Id})\nabla_{e_i} : C^\infty(\Lambda^+M \otimes V) \to C^\infty(\Lambda^-M \otimes V)$$


Yang-Mills complex 


This complex in dimension 4 arises from yet another
decomposition of the exterior algebra. We use the
discussion in the previous section to decompose

$$\Lambda^2M = \Lambda^{2,+}M \oplus \Lambda^{2,-}M$$
into the $\pm 1$ eigenspaces of $\Theta$. Let
$$\pi_- : \Lambda^2M \to \Lambda^{2,-}M$$

be orthogonal projection. The Yang-Mills complex
is the 3-term sequence

$$d : C^\infty(\Lambda^0M) \to C^\infty(\Lambda^1M)$$
and
$$\pi_-d : C^\infty(\Lambda^1M) \to C^\infty(\Lambda^{2,-}M)$$

We can wrap up this sequence to obtain an 
equivalent elliptic complex 


(d 4- 5) i CHASM) E C? (Aodd-* M) 


As with the signature complex, this complex can 
be twisted by taking coefficients in an auxiliary 
vector bundle V. It is crucial to the study of four- 
dimensional geometry using Yang-Mills theory. 


Dolbeault Complex 


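Let $z = (z_1, \ldots, z_{\bar m})$ be a local system of holomorphic
coordinates on a complex manifold M of real dimension
$m = 2\bar m$, where $z_j = x_j + \sqrt{-1}\,y_j$. We define

$$dz^j := dx^j + \sqrt{-1}\,dy^j, \qquad d\bar z^j := dx^j - \sqrt{-1}\,dy^j$$
$$\partial_{z_j} := \tfrac{1}{2}(\partial_{x_j} - \sqrt{-1}\,\partial_{y_j}), \qquad \partial_{\bar z_j} := \tfrac{1}{2}(\partial_{x_j} + \sqrt{-1}\,\partial_{y_j})$$

and decompose $d = \partial + \bar\partial$, where
$$\partial := e(dz^j)\partial_{z_j} \quad \text{and} \quad \bar\partial := e(d\bar z^j)\partial_{\bar z_j}$$

on the complexified exterior algebra. Let $\delta'$ be the
adjoint of $\partial$ and $\delta''$ be the adjoint of $\bar\partial$. Let

$$d\bar z^J := d\bar z^{j_1} \wedge \cdots \wedge d\bar z^{j_p}$$
$$\Lambda^{(0,\mathrm{even})} := \operatorname{Span}_{|J| \text{ is even}}\{d\bar z^J\}, \qquad \Lambda^{(0,\mathrm{odd})} := \operatorname{Span}_{|J| \text{ is odd}}\{d\bar z^J\}$$

The Dolbeault complex is then defined by
$$(\bar\partial + \delta'') : C^\infty(\Lambda^{(0,\mathrm{even})}M) \to C^\infty(\Lambda^{(0,\mathrm{odd})}M)$$

This complex can be twisted by taking coefficients
in a holomorphic bundle V over M.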


The Spin Complex 


Let M be orientable. Let $P_{SO}$ be the principal SO
bundle of orthonormal frames for the tangent
bundle. A spin structure s on M is a principal
Spin bundle $P_{Spin}$ together with a double cover
$\rho : P_{Spin} \to P_{SO}$ which respects the usual double
cover $\rho : \mathrm{Spin} \to SO$ of the structure groups.
Equivalently, a spin structure is a lifting of the
transition functions from SO to Spin which
preserves the cocycle condition. One says that M
is spin if it admits a spin structure.

A manifold is orientable if and only if the first
Stiefel-Whitney class of M vanishes; an orientable
manifold is spin if and only if the second
Stiefel-Whitney class of M vanishes as well; these are
$\mathbb{Z}_2$-valued cohomology classes. Inequivalent spin
structures are parametrized by the cohomology group
$H^1(M; \mathbb{Z}_2)$ or, equivalently, by real-line bundles on M.

The spin representation S of Spin defines an
associated spin bundle $SM = S(M, s)$. There is a
natural Clifford action c of $T^*M$ on SM. The
Levi-Civita connection lifts to define the spin
connection on S and the Dirac operator is defined by

$$A(s) := c(dx^i)\nabla_{\partial_{x_i}} \quad \text{on } C^\infty(SM)$$
Let $m = 2k$ and let $\Theta := (\sqrt{-1})^k c(\mathrm{orn})$. Since
$\Theta^2 = \mathrm{Id}$, one can decompose
$$SM = S^+M \oplus S^-M$$
as the direct sum of the half-spin bundles to obtain
the spin complex:
$$A(s) : C^\infty(S^+M) \to C^\infty(S^-M)$$

As with the signature complex, the spin complex can
be twisted by taking coefficients in an auxiliary vector
bundle V.


Relating the Classic Elliptic Complexes 


One has natural isomorphisms of virtual representations
of the spinor group:

$$\Lambda^+ - \Lambda^- = (S^+ - S^-) \otimes (S^+ + S^-)$$
$$\Lambda^{\mathrm{even}} - \Lambda^{\mathrm{odd}} = (-1)^{m/2}(S^+ - S^-) \otimes (S^+ - S^-)$$

which show that the signature complex and de Rham
complexes are the spin complexes with coefficients in
the virtual bundles

$$S^+M + S^-M \quad \text{and} \quad (-1)^{m/2}(S^+M - S^-M)$$

respectively. If M is complex and spin, then the
Dolbeault complex is the spin complex with coefficients
in the square root of the canonical bundle.
One can consider complex spinors to define the
group $\mathrm{Spin}^c(m)$. Any spin manifold admits a $\mathrm{Spin}^c$
structure with trivial associated complex line bundle.
Any complex manifold admits a $\mathrm{Spin}^c$ structure
with associated complex line bundle given by the
canonical bundle. Thus, a complex manifold admits
a spin structure if and only if it is possible to take a
square root of the canonical line bundle; inequivalent
spin structures are parametrized by inequivalent
square roots. If M is orientable, then M admits a
$\mathrm{Spin}^c$ structure if and only if the second
Stiefel-Whitney class of M lifts from $H^2(M; \mathbb{Z}_2)$ to
$H^2(M; \mathbb{Z})$; in the complex setting, this lifting is
performed by the first Chern class. Inequivalent
$\mathrm{Spin}^c$ structures are parametrized by $H^2(M; \mathbb{Z})$ or,
equivalently, by complex line bundles over M.


Characteristic Classes 

The Euler Form 

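Let $\nabla$ be the Levi-Civita connection on M. Let
$$R(x, y) := \nabla_x\nabla_y - \nabla_y\nabla_x - \nabla_{[x,y]}$$

be the curvature operator. Let $\{e_1, \ldots, e_m\}$ be a local
orthonormal frame for TM and let

$$R_{ijkl} := g(R(e_i, e_j)e_k, e_l)$$

give the components of the curvature relative to a
local orthonormal frame. Let

$$\varepsilon^I_J := g(e^{i_1} \wedge \cdots \wedge e^{i_m}, e^{j_1} \wedge \cdots \wedge e^{j_m})$$

be the totally antisymmetric tensor; this is the sign
of the permutation which sends $i_\nu \to j_\nu$. Let
$m = 2\bar m$. The Euler form is given by setting

$$E_m := \frac{1}{(8\pi)^{\bar m}\,\bar m!}\,\varepsilon^I_J\,R_{i_1i_2j_2j_1} \cdots R_{i_{m-1}i_mj_mj_{m-1}}$$

Let $\rho_{ij} := R_{ikkj}$ and $\tau := \rho_{ii}$ be the Ricci tensor and the
scalar curvature, respectively. Then,

$$E_2 = \frac{1}{4\pi}\tau \quad \text{and} \quad E_4 = \frac{1}{32\pi^2}(\tau^2 - 4|\rho|^2 + |R|^2)$$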


The Pontrjagin Forms 


Since $R(x, y) = -R(y, x)$, we can regard R as a
2-form-valued endomorphism of the tangent bundle.
We define the Pontrjagin forms $p_i \in C^\infty(\Lambda^{4i}M)$ by
expanding

$$\det\left(I + \frac{1}{2\pi}R\right) = 1 + p_1 + p_2 + \cdots$$

These differential forms are closed and the corresponding
cohomology classes

$$P_i = [p_i] \in H^{4i}(M; \mathbb{R})$$

in the de Rham cohomology are independent of
the particular Riemannian metric on M which was
chosen.

The $\hat A$ genus and the Hirzebruch L polynomial
are expressed in terms of these classes using the
splitting principle. Let A be a skew-symmetric
matrix. One sets
$$p(A) := \det(I + A) = 1 + p_1(A) + p_2(A) + \cdots$$

As A is skew symmetric, it decomposes as the direct
sum of $2 \times 2$ blocks of the form

$$\begin{pmatrix} 0 & \lambda_j \\ -\lambda_j & 0 \end{pmatrix}$$

We then have

$$p(A) = \prod_j \{1 + \lambda_j^2\}, \qquad p_i(A) = s_i(\lambda_1^2, \lambda_2^2, \ldots)$$
where $s_i$ is the ith elementary symmetric function;
$p_1 = \sum_j \lambda_j^2$,


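and so forth. Let

$$L(\lambda) := \prod_j \frac{\lambda_j}{\tanh\lambda_j} = 1 + L_1 + L_2 + \cdots, \qquad \hat A(\lambda) := \prod_j \frac{\lambda_j}{2\sinh(\lambda_j/2)} = 1 + \hat A_1 + \hat A_2 + \cdots$$

As $L_i$ and $\hat A_i$ are even symmetric functions of $\lambda$, one
can write $L_i = L_i(p_1(A), \ldots, p_i(A))$. For example,

$$\hat A = 1 - \tfrac{1}{24}p_1 + \tfrac{1}{5760}(7p_1^2 - 4p_2) + \cdots$$

Substituting $(1/2\pi)R$ for A then permits one to
define the Hirzebruch polynomial $L(R)$ and the $\hat A$
genus $\hat A(R)$.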


The Chern Forms 


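Let V be a k-dimensional complex vector bundle
over M. Let $\nabla$ be a Hermitian connection on V and
let $\Omega$ be the associated curvature endomorphism.
The Chern forms $c_i \in C^\infty(\Lambda^{2i}M)$ are defined by
expanding

$$\det\left(I + \frac{\sqrt{-1}}{2\pi}\Omega\right) = 1 + c_1 + c_2 + \cdots$$

As with the Hirzebruch polynomial and the $\hat A$ genus,
the Chern character and Todd genus are expressed
in terms of the generating functions:

$$\mathrm{Td}(\lambda) := \prod_j \frac{\lambda_j}{1 - e^{-\lambda_j}} \quad \text{and} \quad \mathrm{ch}(\lambda) := \sum_j e^{\lambda_j}$$
One has

$$\mathrm{Td} = 1 + \mathrm{Td}_1 + \mathrm{Td}_2 + \cdots = 1 + \tfrac{1}{2}c_1 + \tfrac{1}{12}(c_1^2 + c_2) + \cdots$$
$$\mathrm{ch} = \mathrm{ch}_0 + \mathrm{ch}_1 + \mathrm{ch}_2 + \cdots = k + c_1 + \tfrac{1}{2}(c_1^2 - 2c_2) + \cdots$$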

The Index Theorem 
The Gauss-Bonnet Theorem 


We return to the de Rham complex. Let

$$\chi(M) = \sum_p (-1)^p \dim H^p(M; \mathbb{R})$$

be the Euler-Poincaré characteristic; $\chi(M) = 0$ if m is
odd. Let M have a simplicial structure with $n(k)$ cells
of degree k; $n(0)$ is the number of vertices, $n(1)$ is the
number of edges, $n(2)$ is the number of triangles, etc.
Then

$$\chi(M) = \sum_k (-1)^k n(k)$$

so the Euler-Poincaré characteristic is a combinatorial
invariant. By the Hodge-de Rham theorem,

$$\operatorname{index}(d + \delta) = \dim\ker(\Delta^{\mathrm{even}}) - \dim\ker(\Delta^{\mathrm{odd}}) = \chi(M)$$

The Chern-Gauss-Bonnet theorem expresses this
invariant in terms of curvature

$$\chi(M) = \int_M E_m$$

where $E_m$ is the Euler form given above. If one twists
the de Rham complex to take coefficients in an
auxiliary vector bundle V, then no new information
results, since

$$\operatorname{index}\{(d + \delta)_V\} = \chi(M) \cdot \dim(V)$$
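The combinatorial description of the Euler-Poincaré characteristic as the alternating sum of the cell counts $n(k)$ is easy to evaluate. The sketch below uses two standard triangulations as input data: the boundary of a tetrahedron (a triangulated 2-sphere) and the 7-vertex triangulation of the torus. The function name is an assumption introduced for illustration.

```python
def euler_characteristic(cell_counts):
    """Alternating sum of n(k), where cell_counts[k] is the number of k-cells."""
    return sum((-1) ** k * n for k, n in enumerate(cell_counts))

# Boundary of a tetrahedron: 4 vertices, 6 edges, 4 triangles (a 2-sphere).
print(euler_characteristic([4, 6, 4]))
# 7-vertex triangulation of the torus: 7 vertices, 21 edges, 14 triangles.
print(euler_characteristic([7, 21, 14]))
```

The two results, 2 and 0, agree with the values obtained from the Betti numbers of the sphere and of the torus.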


The Hirzebruch Signature Theorem 


Let sign(M) be the index of the signature complex
on a manifold of dimension 4k; the index vanishes
in dimensions $m \equiv 2 \bmod 4$. Let $*$ be the Hodge
duality operator. As $*\Delta^p *^{-1} = \Delta^{m-p}$, $*$ preserves the
eigenspaces of the Laplacian. In particular, $*$ induces
an isomorphism

$$* : H^p(M; \mathbb{R}) = \ker(\Delta^p) \to H^{m-p}(M; \mathbb{R}) = \ker(\Delta^{m-p})$$

which implements Poincaré duality. In dimension
2k, $*^2 = \mathrm{Id}$. Decompose
$$H^{2k}(M; \mathbb{R}) = H^{2k,+}(M; \mathbb{R}) \oplus H^{2k,-}(M; \mathbb{R})$$
into the $\pm 1$ eigenspaces of $*$; these may be identified
with $\ker(\Delta^{2k,\pm})$ acting on $C^\infty(\Lambda^{2k,\pm}M)$. As the
contributions to the signature away from the middle
dimension cancel,
$$\operatorname{sign}(M) = \dim H^{2k,+}(M; \mathbb{R}) - \dim H^{2k,-}(M; \mathbb{R})$$
As with the de Rham complex, there is a
topological description of this invariant. If $\alpha$ and $\beta$
are closed 2k forms, one sets

$$([\alpha], [\beta]) \mapsto \int_M \alpha \wedge \beta$$

One can use Stokes' theorem to see that this
induces a symmetric bilinear form on the de
Rham cohomology groups $H^{2k}(M; \mathbb{R})$. Poincaré
duality then shows that this symmetric bilinear
form is nondegenerate, so this is a form of type 


(p,q); sign(M) is the signature of this quadratic 
form: 


sign(M) —q—p 


The Hirzebruch signature formula expresses sign(M)
in terms of curvature; if L is the Hirzebruch
polynomial described above and if $m = 4k$, then

$$\operatorname{sign}(M) = \int_M L_k$$

Let V be an auxiliary coefficient bundle. Taking
coefficients in V then yields the formula

$$\operatorname{sign}(M, V) = \sum_{4i+2j=m} \int_M 2^j L_i \wedge \mathrm{ch}_j(V)$$

The Index of the Yang-Mills Complex 


Let $YM_V$ be the Yang-Mills complex with coefficients
in an auxiliary vector bundle V; then the
index can be evaluated using the formulas given
above as

$$\operatorname{index}\{YM_V\} = \tfrac{1}{2}\{\dim(V)\chi(M) - \operatorname{sign}(M, V)\} = \tfrac{1}{2}\int_M \{\dim(V)E_4 - \dim(V)L_1 - 4\,\mathrm{ch}_2(V)\}$$


The Index of the Dolbeault Complex 


If V is a holomorphic bundle over a complex
manifold M, then

$$\operatorname{index}\{(\bar\partial + \delta'')_V\} = \sum_{2i+2j=m} \int_M \mathrm{Td}_i(M) \wedge \mathrm{ch}_j(V)$$

The index of the untwisted Dolbeault complex is
called the arithmetic genus and denoted by ag(M).


The Index of the Spin Complex 


If M is a spin manifold and if $A_V$ is the Dirac
operator with coefficients in an auxiliary coefficient
bundle V, then
$$\operatorname{index}\{A_V\} = \sum_{4i+2j=m} \int_M \hat A_i(M) \wedge \mathrm{ch}_j(V)$$

The index of the spin complex is called the $\hat A$ genus
and is denoted by $\hat A(M)$. If M is a $\mathrm{Spin}^c$ manifold,
the appropriate formula becomes

$$\operatorname{index}\{A^c_V\} = \sum_{4i+2j+2k=m} \int_M \hat A_i(M) \wedge \mathrm{ch}_j(V) \wedge \frac{\theta^k}{k!}$$

where $\theta = \tfrac{1}{2}c_1(L)$, L being the complex line bundle
associated with the $\mathrm{Spin}^c$ structure.


Properties 


The classic elliptic complexes defined above are
multiplicative with respect to Cartesian product.
Suppose that $M_1$ and $M_2$ are Riemannian manifolds
with the appropriate structures. For the signature
complex, suppose $M_1$ and $M_2$ are oriented; for the
Dolbeault complex, suppose $M_1$ and $M_2$ are holomorphic;
for the spin complex, suppose $M_1$ and $M_2$
are spin. By taking the twisting coefficient bundle to
be trivial in the interests of simplicity, one has

$$\chi(M_1 \times M_2) = \chi(M_1)\chi(M_2)$$
$$\operatorname{sign}(M_1 \times M_2) = \operatorname{sign}(M_1)\operatorname{sign}(M_2)$$
$$\operatorname{ag}(M_1 \times M_2) = \operatorname{ag}(M_1)\operatorname{ag}(M_2)$$
$$\hat A(M_1 \times M_2) = \hat A(M_1)\hat A(M_2)$$
These complexes behave well under finite coverings.
Let $F \to M_2 \to M_1$ be a finite covering projection
with $|F|$ sheets. Then

$$\chi(M_2) = |F|\chi(M_1)$$
$$\operatorname{sign}(M_2) = |F|\operatorname{sign}(M_1)$$
$$\operatorname{ag}(M_2) = |F|\operatorname{ag}(M_1)$$
$$\hat A(M_2) = |F|\hat A(M_1)$$


The connected sum $M_1 \# M_2$ is defined by punching
out small disks about points $P_i$ in $M_i$ and then
joining along the spherical boundaries that remain.
It is necessary, of course, to smooth out the resulting
corners. Note that if $M_1$ and $M_2$ are complex
manifolds, then $M_1 \# M_2$ is no longer a complex
manifold in general. Since

$$\chi(S^m) = 2, \quad \operatorname{sign}(S^m) = 0, \quad \text{and} \quad \hat A(S^m) = 0$$

the following additivity results follow from the
integral formulas given above:

$$\chi(M_1 \# M_2) = \chi(M_1) + \chi(M_2) - 2$$
$$\operatorname{sign}(M_1 \# M_2) = \operatorname{sign}(M_1) + \operatorname{sign}(M_2)$$
$$\hat A(M_1 \# M_2) = \hat A(M_1) + \hat A(M_2)$$


Examples and Applications 


Let $S^n$ be the standard sphere and let $\mathbb{CP}^2$ be the
complex projective plane. One then has

$$\chi(S^4) = 2, \qquad \operatorname{sign}(S^4) = 0$$
$$\chi(S^2 \times S^2) = 4, \qquad \operatorname{sign}(S^2 \times S^2) = 0$$
$$\chi(\mathbb{CP}^2) = 3, \qquad \operatorname{sign}(\mathbb{CP}^2) = 1$$

In dimension 4, the Riemann-Roch formula yields

$$\operatorname{ag}(M^4) = \tfrac{1}{4}\{\chi(M) + \operatorname{sign}(M)\}$$
This would yield $\operatorname{ag}(S^4) = \tfrac{1}{2}$; since $\tfrac{1}{2}$ is not an integer,
this shows that $S^4$ does not admit a complex
structure. A similar argument shows that $S^n$ does
not admit a complex structure for $n \ne 2, 6$. It is
not known whether $S^6$ admits a holomorphic
structure; it does admit an almost-complex
structure.
If we set $M = \mathbb{CP}^2 \# \mathbb{CP}^2$, then
$$\operatorname{ag}(M) = \tfrac{1}{4}(3 + 3 - 2 + 1 + 1) = \tfrac{3}{2}$$

and thus $\mathbb{CP}^2 \# \mathbb{CP}^2$ does not admit a complex
structure. These examples are typical of the use of
the index theorem to prove the nonexistence of
certain structures.
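The integrality argument can be reproduced with exact rational arithmetic. The sketch below takes the values of $\chi$ and sign quoted above as input, applies the connected-sum additivity formulas and the four-dimensional Riemann-Roch formula, and reports whether the would-be arithmetic genus is an integer. The function and variable names are assumptions introduced for illustration.

```python
from fractions import Fraction

def connected_sum(a, b):
    """(chi, sign) of M1 # M2 in dimension 4, from the additivity formulas."""
    return (a[0] + b[0] - 2, a[1] + b[1])

def arithmetic_genus_dim4(chi, sign):
    """Value of (chi + sign) / 4 as an exact fraction."""
    return Fraction(chi + sign, 4)

# (chi, sign) pairs quoted in the text
S4, S2xS2, CP2 = (2, 0), (4, 0), (3, 1)

for name, invariants in [("S^4", S4), ("S^2 x S^2", S2xS2), ("CP^2", CP2),
                         ("CP^2 # CP^2", connected_sum(CP2, CP2))]:
    ag = arithmetic_genus_dim4(*invariants)
    print(name, ag, "integer" if ag.denominator == 1 else "not an integer")
```

A non-integer value rules out a complex structure; an integer value, as for $S^2 \times S^2$ and $\mathbb{CP}^2$, is only a necessary condition.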


The General Index Theorem 


Let $S(T^*M)$ be the sphere bundle of unit cotangent
vectors and let $D(T^*M)$ be the disk bundle of
cotangent vectors of length at most 1. Let

$$P : C^\infty(V_1) \to C^\infty(V_2)$$

be an elliptic pseudodifferential operator. The
leading symbol $p := \sigma_L(P)$ induces a smooth map

$$p : S(T^*M) \to \operatorname{End}(V_1, V_2)$$

We form $\Sigma(M)$ by gluing two copies of $D(T^*M)$
together along their common boundary $S(T^*M)$ and
we define a bundle $\Sigma(p, V_1, V_2)$ over $\Sigma(M)$ by gluing
$V_1$ to $V_2$ over $S(T^*M)$ using the clutching function p.
The Atiyah-Singer index theorem expresses the
index of P in terms of cohomological data involving
the Chern class of the symbol bundle and the
characteristic classes of the tangent bundle of M. If
$\Sigma(M)$ is given a suitable orientation, then


index(P) = ». | ch;(X(p. Vi. V>)) ^ Td;(M) 
2344j=2m 7 X(M) 


It specializes to the results given above for 
the classical elliptic complexes. Conversely, by 
using K-theoretic methods, the index theorem in 
full generality can be derived from the special case 
of the twisted signature complex. 


Manifolds with Boundary 


If the boundary of M is nonempty, we must impose 
suitable boundary conditions. 


Local Boundary Conditions 


Choose local coordinates $x = (x^1, \ldots, x^m)$ near the
boundary of M so that $x^m$ is the geodesic distance to
the boundary. On the boundary, we can decompose
a differential form $\omega \in C^\infty(\Lambda M)$ in the form
$\omega = \omega_1 + dx^m \wedge \omega_2$, where $\omega_1$ and $\omega_2$ are tangential
differential forms. Absolute and relative boundary
conditions are defined by setting

$$B_a\omega := \omega_2|_{\partial M} \quad \text{and} \quad B_r\omega := \omega_1|_{\partial M}$$

Let $(d + \delta)_a$ and $(d + \delta)_r$ be the associated realizations.
These operators preserve the grading of the
exterior algebra $\Lambda M = \Lambda^{\mathrm{even}}M \oplus \Lambda^{\mathrm{odd}}M$ and define
elliptic complexes

$$(d + \delta)_a : C^\infty(\Lambda^{\mathrm{even}}M) \to C^\infty(\Lambda^{\mathrm{odd}}M)$$
$$(d + \delta)_r : C^\infty(\Lambda^{\mathrm{even}}M) \to C^\infty(\Lambda^{\mathrm{odd}}M)$$
We consider a collection
$$J = \{1 \le j_1 < \cdots < j_p < m\}$$
of tangential indices and let
$$dx^J = dx^{j_1} \wedge \cdots \wedge dx^{j_p}$$

The associated absolute boundary conditions for the
Laplacian are defined by

B, (ójdx! + pdx” ^ dx!) 
= (uy|o dx) ® ( 050 | ou) dx 


If $*$ is the Hodge operator, then one sets dually:

$$B_r(\omega) = B_a(*\omega)$$

Let $\Delta^p_a$ and $\Delta^p_r$ be the associated realizations of the
Laplacian with these boundary conditions. The
Hodge-de Rham theorem extends to this setting to
yield isomorphisms

$$\ker(\Delta^p_a) = H^p(M; \mathbb{R})$$
and
$$\ker(\Delta^p_r) = H^p(M, \partial M; \mathbb{R})$$

The Hodge $*$ operator intertwines $\Delta^p_a$ and $\Delta^{m-p}_r$
and implements the Poincaré duality isomorphism
$H^p(M; \mathbb{R}) \cong H^{m-p}(M, \partial M; \mathbb{R})$. This also shows that
$$\operatorname{index}(d + \delta)_a = \sum_p (-1)^p \dim H^p(M; \mathbb{R}) = \chi(M)$$
and
$$\operatorname{index}(d + \delta)_r = \sum_p (-1)^p \dim H^p(M, \partial M; \mathbb{R}) = \chi(M, \partial M) = \chi(M) - \chi(\partial M)$$

Let $E_m$ be the Euler form if m is even. We set
$E_m = 0$ if m is odd. Let L be the second fundamental
form. Let $A = (a_1, \ldots, a_{m-1})$ and $B = (b_1, \ldots, b_{m-1})$
be collections of distinct indices ranging from 1 to
$m - 1$. Set

] 
idi DBR — 1 — 2k)'vol(S"-1-24) 


A.B 
XE KR riasbib oL Ray, takbok bak 


X Dipati =e Lay TUE 


The Chern-Gauss-Bonnet theorem generalizes to
this setting to yield

$$\chi(M) = \operatorname{index}(d + \delta)_a = \int_M E_m\,dx + \sum_k \int_{\partial M} Q_{k,m}\,dy$$


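For example,

$$\chi(M^2) = \frac{1}{4\pi}\int_{M^2} \tau\,dx + \frac{1}{2\pi}\int_{\partial M^2} L_{aa}\,dy$$

$$\chi(M^3) = \frac{1}{8\pi}\int_{\partial M^3} \{R_{abba} + L_{aa}L_{bb} - L_{ab}L_{ab}\}\,dy$$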

1 ) 
tT a e 2_ Alol? + IR dx 
] 
—Ó 3 Lu n ona L ID 
UE - TLa d bl 


LE bR arpe bat T: 21 abb Lec 
= 6LpLapL.. RE 4 LabLbcLac}dy 


The interior integral vanishes if m is odd. The
boundary integral can be nonzero in any dimension.
Thus, in particular, the index of this elliptic
complex can be nonzero even if m is odd; $\chi(D^m) = 1$
for any m. The index of $(d + \delta)_r$ is computed
similarly.


Spectral Boundary Conditions 


In contrast to the de Rham complex, there do not
exist local boundary conditions for the signature,
spin, and Dolbeault complexes. To simplify the
discussion, we assume that the metric is the product
near the boundary; there are appropriate compensating
terms involving the second fundamental form
in the more general setting. Let $A : C^\infty(V_1) \to C^\infty(V_2)$
denote either the twisted signature or the
twisted spin complex; there are some additional
difficulties for the Dolbeault complex. Near the
boundary, we can express
$$A = \sigma(\partial_{x_m} + A_T)$$

where $A_T$ is a self-adjoint tangential operator of
Dirac type on $V_1|_{\partial M}$ and $\sigma$ is a unitary bundle
isomorphism from $V_1|_{\partial M}$ to $V_2|_{\partial M}$. Let $\{\phi_i, \lambda_i\}$ be the
discrete spectral resolution of $A_T$. One defines

$$\eta(A_T, s) := \sum_{\lambda_i \ne 0} \operatorname{sgn}(\lambda_i)|\lambda_i|^{-s}$$

as a measure of the spectral asymmetry of $A_T$. This
is well defined for $\operatorname{Re}(s) \gg 0$ and has a meromorphic
extension to the complex plane $\mathbb{C}$. It turns out that 0
is a regular value and one defines

$$\xi(A_T) := \tfrac{1}{2}\{\eta(A_T, s) + \dim\ker(A_T)\}|_{s=0}$$

The spectral boundary conditions can now be
imposed. Let $\Pi_+$ be orthogonal projection in
$L^2(V_1|_{\partial M})$ on the span of the eigensections of $A_T$
corresponding to non-negative eigenvalues and let
$A_S$ be the associated realization defined by this
boundary condition.

One can use the Atiyah-Patodi-Singer index 
theorem to generalize the relations given above to 
this setting. Let f4 be the local integral given above 
that involves the Hirzebruch L polynomial for the 
signature complex or the A genus for the spin 
complex. One then has 


\[ \mathrm{index}(A_S) = -\xi(A_T) + \int_M f_A \]


There are suitable correction formulas involving 
integrals of polynomials in the second fundamental 
form and in the curvature tensor if the structures are 
not product near the boundary. 


Equivariant Problems 
The Classical Lefschetz Formula 


Let M be a compact Riemannian manifold without
boundary. Let T be a smooth map from M to M. Then
pullback $T^*$ induces an action on $C^\infty(\Lambda^p M)$ which
commutes with the exterior derivative d and hence an
action on the de Rham cohomology groups $H^p(M; \mathbb{R})$.
The Lefschetz number of T is then given by

\[ \mathcal{L}(T) = \sum_p (-1)^p\, \mathrm{tr}(T^* \text{ on } H^p(M; \mathbb{R})) \]
p 


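To illustrate the Lefschetz number, let $M = \mathbb{T}^2$ be
the two-dimensional torus. Let $e^1 := dx^1$, let
$e^2 := dx^2$, and let $e^{12} := dx^1 \wedge dx^2$. Then,

\[ H^0(\mathbb{T}^2; \mathbb{R}) = 1\cdot\mathbb{R}, \qquad
   H^1(\mathbb{T}^2; \mathbb{R}) = e^1\cdot\mathbb{R} \oplus e^2\cdot\mathbb{R}, \qquad
   H^2(\mathbb{T}^2; \mathbb{R}) = e^{12}\cdot\mathbb{R} \]

Let $T(x_1, x_2) = (n_{11}x_1 + n_{12}x_2,\; n_{21}x_1 + n_{22}x_2)$. Then,

\[ T^*(1) = 1 \]
\[ T^*(e^1) = n_{11}e^1 + n_{12}e^2 \]
\[ T^*(e^2) = n_{21}e^1 + n_{22}e^2 \]
\[ T^*(e^{12}) = (n_{11}n_{22} - n_{12}n_{21})\,e^{12} \]

and, consequently, the Lefschetz number becomes

\[ \mathcal{L}(T) = \det(I - T^*) = 1 - (n_{11} + n_{22}) + (n_{11}n_{22} - n_{12}n_{21}) \]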


The classical Lefschetz fixed-point formula expresses
$\mathcal{L}(T)$ in terms of data for the fixed-point set F(T) and is an
example of the equivariant index theorem. One
assumes that the fixed-point set of T consists of smooth
submanifolds $N_1, \dots, N_k$ and that the induced map
$dT_\nu$ on the normal bundles of these manifolds is
nondegenerate. This means that $\det(I - dT_\nu) \neq 0$, that
is, that there are no infinitesimal normal directions
which are left fixed. One then has

\[ \mathcal{L}(T) = \sum_i \mathrm{sign}(\det(I - dT_\nu))\,\chi(N_i) \]
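
The torus example and the fixed-point formula can be compared numerically. The sketch below assumes the entries $n_{ij}$ are integers (so that T descends to the torus) and that the fixed points are isolated, in which case each $N_i$ is a point with $\chi(N_i) = 1$. The function names are illustrative only.

```python
def lefschetz_torus(n):
    """Cohomological side: 1 - tr(n) + det(n) for the torus map x -> n x (mod 1)."""
    (a, b), (c, d) = n
    return 1 - (a + d) + (a * d - b * c)


def fixed_point_sum(n):
    """Fixed-point side: sum of sign(det(I - n)) over the isolated fixed points."""
    (a, b), (c, d) = n
    det_i_minus_n = (1 - a) * (1 - d) - b * c
    if det_i_minus_n == 0:
        raise ValueError("fixed points are not isolated")
    size = abs(det_i_minus_n)
    # Every fixed point has coordinates (i/size, j/size); it is fixed exactly
    # when (n - I) maps it into the integer lattice.
    count = sum(
        1
        for i in range(size)
        for j in range(size)
        if ((a - 1) * i + b * j) % size == 0 and (c * i + (d - 1) * j) % size == 0
    )
    sign = 1 if det_i_minus_n > 0 else -1
    return sign * count


for matrix in ([[2, 1], [1, 1]], [[3, 0], [0, 3]], [[0, -1], [1, 0]]):
    print(matrix, lefschetz_torus(matrix), fixed_point_sum(matrix))
```

For a linear map the sign of $\det(I - dT)$ is the same at every fixed point, which is why a single sign multiplies the count.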


The Lefschetz Formula for the Other Classical 
Elliptic Complexes 


Let T be an orientation-preserving isometry of M.
When dealing with the spin complex, suppose that T
preserves the spin structure; when dealing with the
Dolbeault complex, suppose that T preserves the
holomorphic structure. If

\[ A: C^\infty(V_1) \to C^\infty(V_2) \]

is one of the classical elliptic complexes, then by
assumption $T^*$ commutes with A and hence preserves
the eigenspaces of the associated Laplacians.
The Lefschetz number is defined by setting

\[ \mathcal{L}_A(T) := \mathrm{tr}(T^* \text{ on } \ker(A^*A)) - \mathrm{tr}(T^* \text{ on } \ker(AA^*)) \]


Setting T = Id, one recovers the standard index.

To simplify the discussion, we assume henceforth
that T is an orientation-preserving isometry of M
with only isolated fixed points. Let $[\theta_1, \dots, \theta_{m/2}]$ be
the rotation angles of dT at a fixed point x of T. Set

\[ \lambda_j := \cos(\theta_j) + \sqrt{-1}\,\sin(\theta_j) \]

We take the sum over the isolated fixed points x and
then the product over the rotation angles $1 \le j \le m/2$
to express

\[ \mathcal{L}_{\mathrm{sign}}(T) = \sum_x \prod_j \left\{ -\sqrt{-1}\,\cot(\theta_j/2) \right\} \]
\[ \mathcal{L}_{\mathrm{spin}}(T) = \sum_x \prod_j \left\{ -\tfrac{1}{2}\sqrt{-1}\,\csc(\theta_j/2) \right\} \]
\[ \mathcal{L}_{\mathrm{Dol}}(T) = \sum_x \prod_j (1 - \bar{\lambda}_j)^{-1} \]


In considering the spin complex, we assume T
preserves the spin structure. This permits us to lift dT
from SO(m) to Spin(m) and defines liftings of the
rotation angles $\theta_j$ from $[0, 2\pi]$ to $[0, 4\pi]$ in such a way
that the formula given above for the spin complex is
well defined. In considering the Dolbeault complex,
we assume that T preserves a complex structure, so the
formula given above for the Dolbeault complex
involving the complex eigenvalues $\lambda_j$ is well defined.
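
The signature and Dolbeault formulas can be sanity-checked on a rotation of the 2-sphere, viewed as the Riemann sphere. This is a sketch under assumptions not stated above: the rotation by an angle theta has two fixed points (the poles) with rotation angles theta and -theta, the signature of the sphere is 0, and its Dolbeault (arithmetic genus) index is 1. Since the rotation is homotopic to the identity, the Lefschetz numbers should equal these indices.

```python
import cmath
import math


def signature_term(theta):
    """Fixed-point contribution -sqrt(-1) * cot(theta / 2) for one rotation angle."""
    return -1j / math.tan(theta / 2)


def dolbeault_term(theta):
    """Fixed-point contribution 1 / (1 - conj(lambda)), lambda = exp(sqrt(-1) * theta)."""
    lam = cmath.exp(1j * theta)
    return 1 / (1 - lam.conjugate())


def sphere_lefschetz_numbers(theta):
    """Sum the contributions of the two poles (angles theta and -theta)."""
    sign_number = signature_term(theta) + signature_term(-theta)
    dolbeault_number = dolbeault_term(theta) + dolbeault_term(-theta)
    return sign_number, dolbeault_number


sign_number, dolbeault_number = sphere_lefschetz_numbers(0.7)
print(abs(sign_number), dolbeault_number)
```

Because the two poles contribute complex-conjugate terms, the Dolbeault check does not depend on whether the eigenvalue or its conjugate is used in the local formula.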
Introduction


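Given $1 \le p < n$, it was shown by Sobolev that there
exists a constant K > 0 such that, for any
$u \in C_0^\infty(\mathbb{R}^n)$, the space of smooth functions with
compact support in $\mathbb{R}^n$,

\[ \left( \int_{\mathbb{R}^n} |u|^{p^*}\,dx \right)^{1/p^*}
   \le K \left( \int_{\mathbb{R}^n} |\nabla u|^p\,dx \right)^{1/p} \]   [1]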
where $\nabla u$ is the gradient of u and $p^* = np/(n - p)$. It
is easily seen that $p^*$ in [1] is critical in the following
sense. Let $\|\cdot\|_p$ stand for the $L^p$-norm. For
$u \in C_0^\infty(\mathbb{R}^n)$ and $\lambda > 0$, let also $u_\lambda$ be the function given
by $u_\lambda(x) = u(\lambda x)$. For p and q two real numbers,

\[ \|\nabla u_\lambda\|_p = \lambda^{1 - n/p}\,\|\nabla u\|_p, \qquad
   \|u_\lambda\|_q = \lambda^{-n/q}\,\|u\|_q \]

Letting $\lambda \to 0$ and $\lambda \to +\infty$, it follows that an
inequality like $\|u\|_q \le K\|\nabla u\|_p$ holds true for all u
(in particular for the $u_\lambda$'s) only when $q = p^*$. To
prove [1], the approach of Sobolev was based on the
straightforward representation formula

\[ u(x) = \frac{\Gamma(n/2)}{2\pi^{n/2}} \int_{\mathbb{R}^n}
   \frac{(x - y)\cdot\nabla u(y)}{|x - y|^n}\,dy \]

where $\Gamma$ is the Gamma function, and on an
n-dimensional version of a theorem of Hardy-Littlewood
concerning fractional integrals, applied to the
right-hand side of the above representation formula.
More direct arguments were later
discovered in independent works by Gagliardo and
Nirenberg. In particular, the explicit inequality

\[ \left( \int_{\mathbb{R}^n} |u|^{n/(n-1)}\,dx \right)^{(n-1)/n}
   \le \frac{1}{2} \prod_{i=1}^{n} \left( \int_{\mathbb{R}^n} |D_i u|\,dx \right)^{1/n}
   \le \frac{1}{2} \int_{\mathbb{R}^n} |\nabla u|\,dx \]   [2]

was proved to hold, where $D_i$ is the partial
derivative $D_i = \partial/\partial x_i$. Inequality [2] is of the form
[1] when p = 1, since $1^* = n/(n - 1)$. By geometric
measure theory, and the coarea formula, it can be
expressed as an isoperimetric type inequality.
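
The scaling argument for the critical exponent can be checked with exact arithmetic. The sketch below (function names are illustrative) computes $p^*$ and the power of $\lambda$ appearing in the ratio $\|u_\lambda\|_q / \|\nabla u_\lambda\|_p$; the inequality can only survive both limits $\lambda \to 0$ and $\lambda \to +\infty$ when that power is zero.

```python
from fractions import Fraction


def sobolev_conjugate(n, p):
    """Critical exponent p* = n p / (n - p) for 1 <= p < n."""
    n, p = Fraction(n), Fraction(p)
    if not 1 <= p < n:
        raise ValueError("need 1 <= p < n")
    return n * p / (n - p)


def scaling_gap(n, p, q):
    """Power of lambda in ||u_lambda||_q / ||grad u_lambda||_p, i.e. -n/q - (1 - n/p)."""
    n, p, q = Fraction(n), Fraction(p), Fraction(q)
    return -n / q - (1 - n / p)


print(sobolev_conjugate(3, 2))                     # critical exponent for n = 3, p = 2
print(scaling_gap(3, 2, sobolev_conjugate(3, 2)))  # vanishes at q = p*
print(scaling_gap(3, 2, 4))                        # nonzero away from p*
```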

There have been several symbols and several
definitions for Sobolev spaces. Before they became
generally associated with the name of Sobolev, they
were sometimes referred to by other names, for
instance, as "Beppo Levi spaces." We often find two
definitions and two notations in the literature. For $\Omega$
a domain in $\mathbb{R}^n$, $p \ge 1$ real, and u of class $C^m$ in $\Omega$,
we let

\[ \|u\|_{m,p} = \Big( \sum_{0 \le |\alpha| \le m} \|D^\alpha u\|_p^p \Big)^{1/p} \]   [3]

when the right-hand side makes sense, where $\|\cdot\|_p$ is
the $L^p$-norm, $\alpha = (\alpha_1, \dots, \alpha_n)$ is a multi-index,
$|\alpha| = \sum_i \alpha_i$, and $D^\alpha = D_1^{\alpha_1} \cdots D_n^{\alpha_n}$. We define

$H^{m,p}(\Omega)$ = the completion of
$\{u \in C^m(\Omega) \text{ s.t. } \|u\|_{m,p} < +\infty\}$
with respect to the norm $\|\cdot\|_{m,p}$

and

$W^{m,p}(\Omega) = \{u \in L^p(\Omega) \text{ s.t. } D^\alpha u \in L^p(\Omega)$
for all $0 \le |\alpha| \le m\}$

where $D^\alpha u$ is the weak (or distributional) partial
derivative of u with respect to the multi-index $\alpha$. Both
$H^{m,p}(\Omega)$ and $W^{m,p}(\Omega)$ are Banach spaces (and even
Hilbert when p = 2). It is easily seen that $H^{m,p}(\Omega) \subset W^{m,p}(\Omega)$,
but we had to wait for the work of Meyers
and Serrin to realize that $H^{m,p}(\Omega) = W^{m,p}(\Omega)$. The
spaces $H^{m,p}(\Omega)$, also denoted $W^{m,p}(\Omega)$, are referred to
as Sobolev spaces. The spaces $H_0^{m,p}(\Omega)$, also denoted
$W_0^{m,p}(\Omega)$, are defined as the closure of $C_0^\infty(\Omega)$ in
$H^{m,p}(\Omega)$, where $C_0^\infty(\Omega)$ is the space of smooth
functions with compact support in $\Omega$.

Inequality [1] states that the Sobolev space
$H_0^{1,p}(\mathbb{R}^n)$ is naturally embedded in the Lebesgue
space $L^{p^*}(\mathbb{R}^n)$, a particular case of what we now
refer to as Sobolev embeddings.


Sobolev Inequalities and the Sobolev 
Embedding Theorem in Its First Part 


Let m be an integer and let $p \ge 1$ be real. The
Sobolev space $H^{m,p}(\mathbb{R}^n)$, also denoted by $W^{m,p}(\mathbb{R}^n)$,
is defined in one of the two equivalent ways:

$H^{m,p}(\mathbb{R}^n)$ = the completion of
$\{u \in C^m(\mathbb{R}^n) \text{ s.t. } \|u\|_{m,p} < +\infty\}$
with respect to the norm $\|\cdot\|_{m,p}$

or

$H^{m,p}(\mathbb{R}^n) = \{u \in L^p(\mathbb{R}^n) \text{ s.t. } D^\alpha u \in L^p(\mathbb{R}^n)$
for all $0 \le |\alpha| \le m\}$

where $D^\alpha u$ is the weak (or distributional) partial
derivative of u with respect to the multi-index $\alpha$, and
$\|\cdot\|_{m,p}$ is as in [3]. The Sobolev space $(H^{m,p}(\mathbb{R}^n), \|\cdot\|_{m,p})$
is a Banach space, and even a Hilbert space
when p = 2. The space is reflexive when p > 1, and
we also have that $H^{m,p}(\mathbb{R}^n) = H_0^{m,p}(\mathbb{R}^n)$, where
$H_0^{m,p}(\mathbb{R}^n)$ is defined as the closure of $C_0^\infty(\mathbb{R}^n)$ in
$H^{m,p}(\mathbb{R}^n)$. What we usually refer to as the first part
of Sobolev inequalities can be expressed as follows.


Sobolev embeddings (Part I). For p, q two real
numbers with $1 \le q < p$, and k, m two integers with
$0 \le m < k$, if $1/p = 1/q - (k - m)/n$, then
$H^{k,q} \subset H^{m,p}$, and there exists K > 0 such that
$\|u\|_{m,p} \le K\|u\|_{k,q}$ for all $u \in H^{k,q}$.


The Sobolev theorem in its first part states that
the above Sobolev embeddings (resp. inequalities)
hold true for the Euclidean space. A particular case
of interest is when k = 1. In this case, we get, as in
the introduction, that for any $1 \le p < n$,
$H^{1,p}(\mathbb{R}^n) \subset L^{p^*}(\mathbb{R}^n)$ where $p^* = np/(n - p)$. The embedding for
the Euclidean space reduces to the Sobolev inequality
[1]. An important remark is that there is a
hierarchy for Sobolev embeddings. In particular,
if $H^{1,1} \subset L^{n/(n-1)}$, where $1^* = n/(n - 1)$, then all the
other embeddings $H^{k,q} \subset H^{m,p}$ hold true. Thanks to
this remark, the Sobolev embedding theorem for
Euclidean space easily follows from an inequality
like [2]. The hierarchy for Sobolev embeddings is an
easy consequence of Hölder's inequalities when
k = 1, and of Hölder's inequalities together with
Kato's inequality when k > 1.

There are several extensions of Sobolev inequalities
in the literature. Famous extensions were
discovered by Gagliardo and Nirenberg. The Nash
inequality, which reads as

\[ \left( \int_{\mathbb{R}^n} |u|^2\,dx \right)^{(n+2)/n}
   \le K \left( \int_{\mathbb{R}^n} |u|\,dx \right)^{4/n}
   \int_{\mathbb{R}^n} |\nabla u|^2\,dx \]   [4]

for all $u \in H^{1,2}(\mathbb{R}^n)$, is one of the
Gagliardo-Nirenberg inequalities. The Nash inequality easily
follows from [1] when p = 2 and Hölder's inequality.
There are also extensions of Sobolev spaces, for
instance, spaces of BV-functions or Orlicz-Sobolev
spaces.


The Sobolev Embedding Theorem in Its 
Second Part 


For m integer, let $C_B^m(\mathbb{R}^n)$ be the space of functions
of class $C^m$ in $\mathbb{R}^n$ for which the norm

\[ \|u\|_{C^m} = \sum_{0 \le |\alpha| \le m} \sup_{x \in \mathbb{R}^n} |D^\alpha u(x)| \]

is finite. What we usually refer to as the second part
of Sobolev inequalities can be expressed as follows.


Sobolev embeddings (Part II). For $q \ge 1$ a real
number, and k, m two integers with $0 \le m < k$, if
$1/q - (k - m)/n < 0$, then $H^{k,q} \subset C_B^m$, and there
exists K > 0 such that $\|u\|_{C^m} \le K\|u\|_{k,q}$ for all
$u \in H^{k,q}$.


The Sobolev theorem in its second part states that
the above Sobolev embeddings (resp. inequalities)
hold true for the Euclidean space. Refinements were
then obtained by Morrey with embeddings in
Hölder spaces. Let, for instance, $C^{0,\alpha}(\mathbb{R}^n)$ be the
Hölder space of continuous functions in $\mathbb{R}^n$ for
which the norm

\[ \|u\|_{C^{0,\alpha}} = \sup_{x \in \mathbb{R}^n} |u(x)|
   + \sup_{x \neq y} \frac{|u(y) - u(x)|}{|y - x|^\alpha} \]

is finite. For k = 1, m = 0, and $q \ge 1$ such that
$1/q - 1/n < 0$, the embedding $H^{1,q}(\mathbb{R}^n) \subset C_B^0(\mathbb{R}^n)$ can be
refined into an embedding like $H^{1,q}(\mathbb{R}^n) \subset C^{0,\alpha}(\mathbb{R}^n)$,
where $\alpha \in (0, 1)$ is such that $1/q - (1 - \alpha)/n < 0$.


The Case of Domains and the Kondrakov 
Theorem 


The Sobolev embeddings in their first and second
parts extend to regular domains $\Omega$. A typical
condition is that $\Omega$ satisfies a cone property. When
$\Omega$ is bounded, and thus of finite volume, an
embedding like $H^{1,p}(\Omega) \subset L^{p^*}(\Omega)$ implies that we
also have that $H^{1,p}(\Omega) \subset L^q(\Omega)$ for all $1 \le q \le p^*$.
The Kondrakov theorem states that such embeddings
are all compact, unless $q = p^*$, in the sense that
bounded sequences of functions in $H^{1,p}$ possess
converging subsequences in $L^q$.

For $p \ge 1$ real, the Sobolev embedding theorem in
its first part provides embeddings of $H^{1,p}$ into
Lebesgue spaces when p < n, while the Sobolev
embedding theorem in its second part provides
embeddings of $H^{1,p}$ into Hölder spaces when p > n.
For p = n, it is false that $H^{1,n}$ can be embedded
into $L^\infty$. However, when $\Omega$ is bounded, we can
prove that $\exp(u) \in L^1(\Omega)$ when $u \in H_0^{1,n}(\Omega)$, and
that

\[ \int_\Omega \exp(u)\,dx \le K \exp\left( \mu \|\nabla u\|_n^n \right) \]

where $\mu$, K > 0 are independent of u. We also have
that

\[ \int_\Omega \exp\left( \mu |u|^{n/(n-1)} \right) dx \le K \]

for all $u \in H_0^{1,n}(\Omega)$ such that $\|\nabla u\|_n \le 1$, where $\mu$,
K > 0 are independent of u. Such inequalities are often
referred to as Moser-Trudinger type inequalities.


The Case of Riemannian Manifolds 


Riemannian manifolds are natural extensions of
Euclidean space. For (M, g) a Riemannian manifold,
m integer, and $p \ge 1$ real, we define the Sobolev
space $H^{m,p}(M)$ by

$H^{m,p}(M)$ = the completion of
$\{u \in C^m(M) \text{ s.t. } \|u\|_{m,p} < +\infty\}$
with respect to the norm $\|\cdot\|_{m,p}$

where $\|u\|_{m,p} = \sum_{i=0}^{m} \|\nabla^i u\|_p$, $\nabla^i u$ is the ith covariant
derivative of u, and $\|\cdot\|_p$ is the $L^p$-norm
in (M, g). A notation like $\|\nabla^i u\|_p$ stands for the
$L^p$-norm of the pointwise norm $|\nabla^i u|$ of $\nabla^i u$. Sobolev
spaces on manifolds are Banach spaces, even Hilbert
when p = 2, and they are reflexive when p > 1. They
do not depend on the metric when M is compact.
For compact Riemannian manifolds, everything
works as for bounded domains. The Sobolev
embeddings in their first and second parts remain
valid. The Kondrakov theorem also remains valid.
However, since constant functions are in Sobolev
spaces when the manifold is compact, the $L^p$-norm
of u appearing in the $H^{1,p}$-norm of u should be added to the
right-hand side in inequalities like [1]. More
precisely, if (M, g) is a compact Riemannian
manifold of dimension n, and $1 \le p < n$, then the
inequality for the embedding $H^{1,p}(M) \subset L^{p^*}(M)$
reads as: there exists K > 0 such that for any
$u \in H^{1,p}(M)$,

\[ \left( \int_M |u|^{p^*}\,dv_g \right)^{p/p^*}
   \le K \left( \int_M |\nabla u|^p\,dv_g + \int_M |u|^p\,dv_g \right) \]   [5]

where $dv_g$ is the Riemannian volume element with
respect to g. When (M, g) is no longer compact, the
Sobolev embedding theorem might become false. A
nontrivial key observation is that a Sobolev inequality
like [5] on a complete manifold (M, g) implies the
existence of a uniform (with respect to the center)
lower bound for the volume of balls of radius 1. It
follows that for any $n \ge 2$, there exist complete
Riemannian n-manifolds (M, g) for which, for any
$p \in [1, n)$, $H^{1,p}(M) \not\subset L^{p^*}(M)$. Possible examples are
warped products of the real line $\mathbb{R}$ and the
(n - 1)-sphere $S^{n-1}$. When the Ricci curvature is
bounded from below, the condition that there is a
uniform (with respect to the center) lower bound for
the volume of balls of radius 1 is necessary and
sufficient in order to get that the Sobolev embeddings
are valid.


Isoperimetric and Euclidean 
Type Inequalities 


Let (M, g) be a complete Riemannian n-manifold.
Euclidean type inequalities are said to hold on (M, g)
if there exists K > 0 such that for any $1 \le p < n$,
and any $u \in H^{1,p}(M)$,

\[ \left( \int_M |u|^{p^*}\,dv_g \right)^{1/p^*}
   \le K \left( \int_M |\nabla u|^p\,dv_g \right)^{1/p} \]   [6]

where $p^* = np/(n - p)$. As for the Euclidean space, if
the above inequality holds for some $p_0$, then it
holds, with distinct K, for all $p_0 \le p < n$. In
particular, if the inequality holds for p = 1, it holds
for all p's. The inequality when p = 1 was shown to
be true by Hoffman and Spruck when the manifold
is simply connected of nonpositive sectional curvature.
Such manifolds are referred to as Cartan-Hadamard
manifolds. The inequality when p = 2 is
related to the nonparabolicity of the manifold,
namely the existence of a minimal Green's function,
and to the behavior of the minimal Green's function.

By geometric measure theory and the coarea
formula, [6] when p = 1 is equivalent to the
isoperimetric inequality

\[ \mathrm{Area}_g(\partial\Omega) \ge \frac{1}{C}\,\mathrm{Vol}_g(\Omega)^{(n-1)/n} \]   [7]

where C > 0, $\Omega$ is a smooth bounded domain in
M, $\mathrm{Area}_g(\partial\Omega)$ is the volume of $\partial\Omega$ for the metric
induced by g, and $\mathrm{Vol}_g(\Omega)$ is the volume of $\Omega$ with
respect to g. Moreover, the constants C and K
(for p = 1) are the same in the sense that if [6] for
p = 1 holds with K, then [7] holds with C = K, and
if [7] holds with C, then [6] for p = 1 holds with
K = C.

The sharp constant for the isoperimetric inequality
[7] in Euclidean space is known. When n = 2 its
value is $C(2) = 1/(2\sqrt{\pi})$ and the sharp isoperimetric
inequality is the well-known inequality $L^2 \ge 4\pi A$,
where A is the volume of a smooth bounded domain
in $\mathbb{R}^2$, and L is the length of its boundary. For
arbitrary n, the sharp constant C(n) for the isoperimetric
inequality is given by

\[ C(n) = \frac{1}{n} \left( \frac{n}{\omega_{n-1}} \right)^{1/n} \]   [8]

where $\omega_{n-1}$ is the volume of the unit (n - 1)-sphere.
Moreover, still for the Euclidean space, equality
holds in the sharp isoperimetric inequality if and
only if $\Omega$ is a ball. A famous conjecture concerning
sharp isoperimetric inequalities, often referred to as
the Cartan-Hadamard conjecture, is that the sharp
isoperimetric inequality holds on Cartan-Hadamard
manifolds. Thanks to works by Croke, Kleiner, and
Weil, the conjecture is known to be true in
dimensions 2, 3, and 4. From the Bishop-Gromov
comparison theorem, we also get that the only
complete manifold of non-negative Ricci curvature
for which the sharp isoperimetric inequality holds is
the Euclidean space itself.

The sharp constants K = K(n, p) for [6] when p > 1
have been computed in Euclidean space by Aubin,
Rodemich, and Talenti. The extremal functions were
also computed, where, by definition, an extremal
function is a function which realizes the case of
equality in the inequality. We get that

\[ K(n, p) = \frac{1}{n} \left( \frac{n(p - 1)}{n - p} \right)^{(p-1)/p}
   \left( \frac{\Gamma(n + 1)}{\Gamma(n/p)\,\Gamma(n + 1 - n/p)\,\omega_{n-1}} \right)^{1/n} \]   [9]

where, as above, $\Gamma$ is the gamma function. Moreover,
u is an extremal function for the sharp
inequality in Euclidean space if and only if, up to a
scale factor,

\[ u(x) = \left( \frac{1}{\mu + |x - a|^{p/(p-1)}} \right)^{(n-p)/p} \]   [10]

for some $\mu > 0$ and $a \in \mathbb{R}^n$. When p = 2, the
functions u in [10] are both the only extremal
functions for the sharp Sobolev inequality in Euclidean
space, and the only positive solutions of the equation
$\Delta u = u^{2^* - 1}$ in $\mathbb{R}^n$, where $\Delta = -\sum_i D_i^2$ is the
Laplace-Beltrami operator (the usual Laplacian with a minus
sign in front of it). Sharp constants are also known for
several of the Gagliardo-Nirenberg inequalities in
Euclidean space. The sharp constant for the Nash
inequality in Euclidean space was computed by Carlen
and Loss. If the sharp isoperimetric inequality holds on
a complete Riemannian n-manifold, then the sharp
inequalities [6] hold for all $1 \le p < n$.
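
The sharp constants can be evaluated numerically. The sketch below assumes the standard closed forms $C(n) = \frac{1}{n}(n/\omega_{n-1})^{1/n}$ and the Aubin-Talenti expression for K(n, p), with $\omega_{d} = 2\pi^{(d+1)/2}/\Gamma((d+1)/2)$ the volume of the unit d-sphere; the function names are illustrative. It checks that C(2) reproduces $L^2 \ge 4\pi A$ and that K(n, p) approaches C(n) as p tends to 1.

```python
import math


def sphere_volume(d):
    """Volume of the unit d-sphere in R^(d+1): 2 pi^((d+1)/2) / Gamma((d+1)/2)."""
    return 2 * math.pi ** ((d + 1) / 2) / math.gamma((d + 1) / 2)


def isoperimetric_constant(n):
    """C(n) = (1/n) (n / omega_{n-1})^(1/n)."""
    return (n / sphere_volume(n - 1)) ** (1 / n) / n


def sobolev_constant(n, p):
    """Sharp Euclidean constant K(n, p) for 1 < p < n."""
    if not 1 < p < n:
        raise ValueError("need 1 < p < n")
    gammas = math.gamma(n + 1) / (math.gamma(n / p) * math.gamma(n + 1 - n / p))
    return (
        (n * (p - 1) / (n - p)) ** ((p - 1) / p)
        * (gammas / sphere_volume(n - 1)) ** (1 / n)
        / n
    )


print(1 / isoperimetric_constant(2) ** 2, 4 * math.pi)           # both equal 4 pi
print(sobolev_constant(3, 1 + 1e-9), isoperimetric_constant(3))  # nearly equal
```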


Sharp Inequalities on Compact 
Riemannian Manifolds 


The study of sharp Sobolev inequalities on compact
manifolds is often referred to as the AB program for
Sobolev inequalities. For (M, g) a compact Riemannian
n-manifold, and $1 \le p < n$, [5] can be rewritten
in two different forms:

\[ \left( \int_M |u|^{p^*}\,dv_g \right)^{1/p^*}
   \le A \left( \int_M |\nabla u|^p\,dv_g \right)^{1/p}
   + B \left( \int_M |u|^p\,dv_g \right)^{1/p} \]   [11]

and

\[ \left( \int_M |u|^{p^*}\,dv_g \right)^{p/p^*}
   \le A' \int_M |\nabla u|^p\,dv_g + B' \int_M |u|^p\,dv_g \]   [12]

where A, B, A', B' are positive constants independent
of u. An easy remark is that if [12] holds with
constants A' and B', then [11] holds with $A = (A')^{1/p}$
and $B = (B')^{1/p}$. The sharp first (resp. second)
constants in [11] and [12] are defined as the lowest
possible values for A and A' (resp. for B and B') in
[11] and [12]. The sharp first constants are
independent of the manifold and are given by
$A' = A^p = K(n, p)^p$, where K(n, p) is as in [9]. The
sharp second constants depend on the manifold
and are given by $B' = B^p = V_g^{-p/n}$, where $V_g$ is the
volume of (M, g). A typical question in the AB
program is to know whether or not we can take A
or B to be the sharp constants in [11] and, similarly,
whether or not we can take A' or B' to be the sharp
constants in [12]. Another typical question in the AB
program is whether or not there are nonzero
extremal functions for the saturated form of the
sharp inequalities when they are valid. Concerning
the B-part of the program, the sharp inequality [11]
with $B = V_g^{-1/n}$ is true on any manifold, and constant
functions are extremal functions. On the other hand,
it can be proved that the stronger [12] with
$B' = V_g^{-p/n}$ is always false when p > 2, whatever
the manifold. Concerning the A-part of the
AB program, Hebey and Vaugon proved that the
sharp inequality [12] with $A' = K(n, 2)^2$ is true on
any manifold. In other words, for any compact
Riemannian manifold (M, g) of dimension $n \ge 3$,
there exists B' > 0 such that, for any $u \in H^{1,2}(M)$,

\[ \left( \int_M |u|^{2^*}\,dv_g \right)^{2/2^*}
   \le K(n, 2)^2 \int_M |\nabla u|^2\,dv_g + B' \int_M u^2\,dv_g \]   [13]

We then get the saturated form of [13] by taking
B' = B'(g) to be the lowest possible B' in [13]. In
general, when $p \neq 2$, we can prove that the sharp
inequality [11] with A = K(n, p) is true on any
manifold, and that there are nonzero extremal
functions for the saturated form of the sharp inequality.
On the other hand, the stronger [12] with
$A' = K(n, p)^p$ when p > 2 is false when the curvature
is positive, but true when the curvature is negative.
The p = 2 case in the A-part of the AB program is of
importance for its connection with the Yamabe
problem. The p = 1 case in the A-part of the AB
program is of importance for its connection with the
isoperimetric inequality. The AB program has also
been considered for Gagliardo-Nirenberg inequalities,
including the Nash inequality, and Sobolev-Poincaré
inequalities on compact manifolds.
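
The value of the sharp second constants can be recovered by testing [11] and [12] on constant functions, which is presumably the reason they depend only on the volume. The short derivation below is a sketch: it takes $u \equiv 1$, so that the gradient terms vanish, and uses $1/p^* = 1/p - 1/n$.

```latex
\begin{align*}
\text{[11] with } u \equiv 1:&\quad V_g^{1/p^*} \le B\,V_g^{1/p}
  \;\Longrightarrow\; B \ge V_g^{1/p^* - 1/p} = V_g^{-1/n},\\
\text{[12] with } u \equiv 1:&\quad V_g^{p/p^*} \le B'\,V_g
  \;\Longrightarrow\; B' \ge V_g^{p/p^* - 1} = V_g^{-p/n}.
\end{align*}
```

These lower bounds show that no smaller B or B' is possible; whether the inequalities actually hold with these values is the content of the B-part of the program.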
Introduction


Infinite-dimensional Hamiltonian systems arise in 
many areas in pure and applied mathematics and in 
mathematical physics. These are partial differential 
equations (PDEs) which can be written as evolution 
equations (dynamical systems) in the form 


\[ \dot{F} = \{F, H\} \]


where H is the Hamiltonian ("energy") and $\{.,.\}$ is a
Poisson bracket on an infinite-dimensional phase space,
called a Poisson manifold. Finite-dimensional
Hamiltonian systems are ordinary differential
evolution equations on finite-dimensional phase spaces,
for which general existence and uniqueness theorems
for solutions exist. This is not the case for PDEs: there
are no general existence and uniqueness theorems for
solutions of infinite-dimensional Hamiltonian systems.
These have to be established case by case. This article
gives only a broad mathematical framework of infinite- 
dimensional Hamiltonian systems. Precise definitions 
are presented and the concept is illustrated through 
physical examples. 


Hamilton’s Equations on Poisson 
Manifolds 


A Poisson manifold is a manifold P (in general
infinite dimensional) equipped with a bilinear
operation $\{.,.\}$, called Poisson bracket, on the
space $C^\infty(P)$ of smooth functions on P such that:


1. $(C^\infty(P), \{.,.\})$ is a Lie algebra, that is,
$\{.,.\}: C^\infty(P) \times C^\infty(P) \to C^\infty(P)$ is bilinear, skew-symmetric,
and satisfies the Jacobi identity
$\{\{F, G\}, H\} + \{\{H, F\}, G\} + \{\{G, H\}, F\} = 0$ for all
$F, G, H \in C^\infty(P)$; and

2. $\{.,.\}$ satisfies the Leibniz rule, that is, $\{.,.\}$
is a derivation in each factor:
$\{F \cdot G, H\} = F \cdot \{G, H\} + G \cdot \{F, H\}$ for all $F, G, H \in C^\infty(P)$.


The notion of Poisson manifolds was rediscovered 
many times under different names, starting with Lie, 
Dirac, Pauli, and others. The name Poisson manifold 
was coined by Lichnerowicz. 

For any $H \in C^\infty(P)$, the Hamiltonian vector field
$X_H$ is defined by

\[ X_H(F) = \{F, H\}, \qquad F \in C^\infty(P) \]

It follows from (2) that, indeed, $X_H$ defines a
derivation on $C^\infty(P)$, hence a vector field on P.
Hamilton's equations of motion for a function
$F \in C^\infty(P)$ with Hamiltonian H (energy function) are
then defined by the flow (integral curves) of the
vector field $X_H$, that is,

\[ \dot{F} = X_H(F) = \{F, H\} \]   [1]

where the overdot denotes differentiation with
respect to time. F is then called a Hamiltonian
system on P with energy (Hamiltonian function) H.


Examples of Poisson Manifolds and 
Hamilton’s Equations 


Finite-Dimensional Classical Mechanics 


For finite-dimensional classical mechanics, we take
$P = \mathbb{R}^{2n}$ with coordinates $(q^1, \dots, q^n, p_1, \dots, p_n)$
and the standard Poisson bracket for any two
functions $F(q^i, p_i)$, $H(q^i, p_i)$ given by

\[ \{F, H\} = \sum_{i=1}^{n} \left( \frac{\partial F}{\partial q^i}\frac{\partial H}{\partial p_i}
   - \frac{\partial H}{\partial q^i}\frac{\partial F}{\partial p_i} \right) \]   [2]

Then the classical Hamilton's equations are

\[ \dot{q}^i = \{q^i, H\} = \frac{\partial H}{\partial p_i}, \qquad
   \dot{p}_i = \{p_i, H\} = -\frac{\partial H}{\partial q^i} \]   [3]

for $i = 1, \dots, n$. This finite-dimensional Hamiltonian
system is a system of ordinary differential equations
for which there are well-known existence and
uniqueness theorems, that is, it has locally unique
smooth solutions, depending smoothly on the initial
conditions.


Example: harmonic oscillator. As a concrete example,
consider the harmonic oscillator: here $P = \mathbb{R}^2$ and
the Hamiltonian (energy) is $H(q, p) = \tfrac{1}{2}(q^2 + p^2)$.
Then Hamilton's equations are

\[ \dot{q} = p, \qquad \dot{p} = -q \]   [4]
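
The harmonic oscillator can be used to check the bracket and the equations of motion numerically. The following is a minimal sketch for n = 1, not a general-purpose integrator: the bracket is approximated with central differences, and the flow is integrated with the Störmer-Verlet (leapfrog) scheme, a choice made here for illustration because it respects the Hamiltonian structure. All function names are hypothetical.

```python
import math


def poisson_bracket(f, h, q, p, eps=1e-6):
    """Canonical bracket {f, h} at (q, p) for n = 1, using central differences."""
    df_dq = (f(q + eps, p) - f(q - eps, p)) / (2 * eps)
    df_dp = (f(q, p + eps) - f(q, p - eps)) / (2 * eps)
    dh_dq = (h(q + eps, p) - h(q - eps, p)) / (2 * eps)
    dh_dp = (h(q, p + eps) - h(q, p - eps)) / (2 * eps)
    return df_dq * dh_dp - df_dp * dh_dq


def hamiltonian(q, p):
    """Harmonic oscillator energy H = (q^2 + p^2) / 2."""
    return 0.5 * (q * q + p * p)


def flow(q, p, t, steps=10_000):
    """Integrate q' = p, p' = -q up to time t with the leapfrog scheme."""
    dt = t / steps
    for _ in range(steps):
        p -= 0.5 * dt * q  # half step of p' = -dH/dq
        q += dt * p        # full step of q' = dH/dp
        p -= 0.5 * dt * q  # second half step of p
    return q, p


# {q, H} should equal p and {p, H} should equal -q.
print(poisson_bracket(lambda q, p: q, hamiltonian, 0.3, 0.8))
print(poisson_bracket(lambda q, p: p, hamiltonian, 0.3, 0.8))
# Starting from (1, 0), the exact solution is (cos t, -sin t).
print(flow(1.0, 0.0, 1.0), (math.cos(1.0), -math.sin(1.0)))
```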


Infinite-Dimensional Classical Field Theory 


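Let V be a Banach space and $V^*$ its dual space
with respect to a pairing $\langle .,. \rangle: V \times V^* \to \mathbb{R}$ (i.e.,
$\langle .,. \rangle$ is a symmetric, bilinear, and nondegenerate
function). On $P = V \times V^*$, the canonical Poisson
bracket for $F, H \in C^\infty(P)$, $\varphi \in V$, and $\pi \in V^*$ is
given by

\[ \{F, H\} = \left\langle \frac{\delta F}{\delta\varphi}, \frac{\delta H}{\delta\pi} \right\rangle
   - \left\langle \frac{\delta H}{\delta\varphi}, \frac{\delta F}{\delta\pi} \right\rangle \]   [5]

where the functional derivatives $\delta F/\delta\pi \in V$, $\delta F/\delta\varphi \in V^*$
are the "duals" under the pairing $\langle .,. \rangle$ of the partial
gradients $D_\varphi F(\varphi) \in V^*$, $D_\pi F(\pi) \in V^{**} \cong V$. The
corresponding Hamilton's equations are

\[ \dot{\varphi} = \{\varphi, H\} = \frac{\delta H}{\delta\pi}, \qquad
   \dot{\pi} = \{\pi, H\} = -\frac{\delta H}{\delta\varphi} \]   [6]

As a special case in finite dimensions, if $V \cong \mathbb{R}^n$ so
$V^* \cong \mathbb{R}^n$ and $P = V \times V^* \cong \mathbb{R}^{2n}$, and the pairing is
the standard inner product in $\mathbb{R}^n$, then the Poisson
bracket [5] and Hamilton's equations [6] are
identical with [2] and [3], respectively.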


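Example: wave equations. As a concrete example,
consider the wave equations. Let $V = C^\infty(\mathbb{R}^3)$ and
$V^* = \mathrm{Den}(\mathbb{R}^3)$ (densities), with the $L^2$ pairing
$\langle \varphi, \pi \rangle = \int \varphi(x)\pi(x)\,dx$. Take the Hamiltonian to be

\[ H(\varphi, \pi) = \int_{\mathbb{R}^3} \left( \tfrac{1}{2}\pi^2 + \tfrac{1}{2}|\nabla\varphi|^2 + F(\varphi) \right) dx \]

where F is some function on V. Then Hamilton's
equations [6] become

\[ \dot{\varphi} = \pi, \qquad \dot{\pi} = \nabla^2\varphi - F'(\varphi) \]   [7]

where the prime denotes differentiation with respect
to $\varphi$, which imply the wave equation

\[ \frac{\partial^2\varphi}{\partial t^2} = \nabla^2\varphi - F'(\varphi) \]   [8]

Different choices of F give different wave equations;
for example, for F = 0 we get the linear wave equation

\[ \frac{\partial^2\varphi}{\partial t^2} = \nabla^2\varphi \]

and for $F = \tfrac{1}{2}m^2\varphi^2$, we get the Klein-Gordon equation

\[ \frac{\partial^2\varphi}{\partial t^2} = \nabla^2\varphi - m^2\varphi \]

So, these wave equations and the Klein-Gordon
equation are infinite-dimensional Hamiltonian systems
on $P = C^\infty(\mathbb{R}^3) \times \mathrm{Den}(\mathbb{R}^3)$.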
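
To see the Hamiltonian structure of the Klein-Gordon equation at work, one can discretize it. The sketch below makes several simplifying assumptions that are not part of the text: one space dimension instead of three, a periodic grid, a second-order finite-difference Laplacian, and leapfrog time stepping. With $F(\varphi) = \tfrac{1}{2}m^2\varphi^2$ the discrete energy should stay nearly constant along the discrete flow.

```python
import math


def energy(phi, pi, dx, mass):
    """Discrete H: sum of (pi^2 + |grad phi|^2 + m^2 phi^2) / 2 times dx (periodic grid)."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        grad = (phi[(i + 1) % n] - phi[i]) / dx
        total += 0.5 * pi[i] ** 2 + 0.5 * grad ** 2 + 0.5 * (mass * phi[i]) ** 2
    return total * dx


def force(phi, dx, mass):
    """Right-hand side of pi' = laplacian(phi) - m^2 phi on the periodic grid."""
    n = len(phi)
    return [
        (phi[(i + 1) % n] - 2 * phi[i] + phi[i - 1]) / dx ** 2 - mass ** 2 * phi[i]
        for i in range(n)
    ]


def evolve(phi, pi, dx, dt, steps, mass):
    """Leapfrog integration of phi' = pi, pi' = laplacian(phi) - m^2 phi."""
    phi, pi = list(phi), list(pi)
    for _ in range(steps):
        f = force(phi, dx, mass)
        pi = [p + 0.5 * dt * fi for p, fi in zip(pi, f)]
        phi = [x + dt * p for x, p in zip(phi, pi)]
        f = force(phi, dx, mass)
        pi = [p + 0.5 * dt * fi for p, fi in zip(pi, f)]
    return phi, pi


points = 64
dx = 2 * math.pi / points
phi0 = [math.sin(i * dx) for i in range(points)]
pi0 = [0.0] * points
phi1, pi1 = evolve(phi0, pi0, dx, dt=0.01, steps=500, mass=1.0)
print(energy(phi0, pi0, dx, 1.0), energy(phi1, pi1, dx, 1.0))
```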


Cotangent Bundles 


The finite-dimensional examples of Poisson brackets
[2] and Hamilton's equations [3] and the
infinite-dimensional examples [5] and [6] are the local versions
of the general case where $P = T^*Q$ is the cotangent
bundle (phase space) of a manifold Q (configuration
space). If Q is an n-dimensional manifold, then $T^*Q$ is
a 2n-dimensional Poisson manifold locally isomorphic to $\mathbb{R}^{2n}$
whose Poisson bracket is locally given by [2] and
Hamilton's equations are locally given by [3]. If Q is
an infinite-dimensional Banach manifold, then $T^*Q$ is
a Poisson manifold locally isomorphic to $V \times V^*$
whose Poisson bracket is given by [5] and Hamilton's
equations are locally given by [6].


Symplectic Manifolds 


All the examples above are special cases of symplectic
manifolds $(P, \omega)$. This means that P is equipped with
a symplectic structure $\omega$ which is a closed ($d\omega = 0$),
(weakly) nondegenerate 2-form on the manifold P.
Then, for any $H \in C^\infty(P)$, the corresponding
Hamiltonian vector field $X_H$ is defined by $dH = \omega(X_H, \cdot)$
and the canonical Poisson bracket is given by

\[ \{F, H\} = \omega(X_F, X_H), \qquad F, H \in C^\infty(P) \]   [9]

For example, on $\mathbb{R}^{2n}$ the canonical symplectic
structure $\omega$ is given by $\omega = \sum_{i=1}^{n} dp_i \wedge dq^i = d\theta$,
where $\theta = \sum_{i=1}^{n} p_i\,dq^i$. The same formula for $\omega$
holds locally in $T^*Q$ for any finite-dimensional Q
(Darboux's lemma). For the infinite-dimensional
example $P = V \times V^*$, the symplectic form $\omega$ is given
by $\omega((\varphi_1, \pi_1), (\varphi_2, \pi_2)) = \langle \varphi_1, \pi_2 \rangle - \langle \varphi_2, \pi_1 \rangle$. Again,
these two formulas for $\omega$ are identical if $V = \mathbb{R}^n$.


Remarks 


(i) If P is a finite-dimensional symplectic manifold,
then P is even dimensional.

(ii) If the Poisson bracket $\{.,.\}$ is nondegenerate,
then $\{.,.\}$ comes from a symplectic form $\omega$, that
is, $\{.,.\}$ is given by [9].


The Lie-Poisson Bracket 


Not all Poisson brackets are of the form given in the
above examples [2], [5], and [9], that is, not all
Poisson manifolds are symplectic manifolds. An
important class of Poisson brackets is the so-called
Lie-Poisson bracket. It is defined on the dual of any
Lie algebra. Let G be a Lie group with Lie algebra
$\mathfrak{g} = T_eG \cong$ {left-invariant vector fields on G} and let
$[.,.]$ denote the Lie bracket (commutator) on $\mathfrak{g}$. Let
$\mathfrak{g}^*$ be the dual of $\mathfrak{g}$ with respect to a pairing
$\langle .,. \rangle: \mathfrak{g}^* \times \mathfrak{g} \to \mathbb{R}$. Then, for any $F, H \in C^\infty(\mathfrak{g}^*)$ and
$\mu \in \mathfrak{g}^*$, the Lie-Poisson bracket is defined by

\[ \{F, H\}(\mu) = -\left\langle \mu, \left[ \frac{\delta F}{\delta\mu}, \frac{\delta H}{\delta\mu} \right] \right\rangle \]   [10]

where $\delta F/\delta\mu, \delta H/\delta\mu \in \mathfrak{g}$ are the "duals" of the
gradients $DF(\mu), DH(\mu) \in \mathfrak{g}^{**} \cong \mathfrak{g}$ under the pairing
$\langle .,. \rangle$. Note that the Lie-Poisson bracket is degenerate
in general; for example, for G = SO(3) the
vector space $\mathfrak{g}^*$ is three dimensional, so the Poisson
bracket [10] cannot come from a symplectic
structure. This Lie-Poisson bracket can also be
obtained in a different way by taking the canonical
Poisson bracket on $T^*G$ (locally given by [2] and [5])
and then restricting it to the fiber at the identity
$T_e^*G = \mathfrak{g}^*$. In this sense, the Lie-Poisson bracket [10]
is induced from the canonical Poisson bracket
on $T^*G$. It is induced by the symmetry of
left-multiplication, as discussed in the next section.


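Example: rigid body. A concrete example of the
Lie-Poisson bracket is given by the rigid body. Here
G = SO(3) is the configuration space of a free rigid
body. Identifying the Lie algebra $(\mathfrak{so}(3), [.,.])$ with
$(\mathbb{R}^3, \times)$, where $\times$ is the vector product on $\mathbb{R}^3$, and
$\mathfrak{g}^* = \mathfrak{so}(3)^* \cong \mathbb{R}^3$, the Lie-Poisson bracket translates
into

\[ \{F, H\}(m) = -m \cdot (\nabla F \times \nabla H) \]   [11]

For any $F \in C^\infty(\mathfrak{so}(3)^*)$, we have

\[ \frac{dF}{dt}(m) = \nabla F \cdot \dot{m} = \{F, H\}(m)
   = -m \cdot (\nabla F \times \nabla H) = \nabla F \cdot (m \times \nabla H) \]

hence $\dot{m} = m \times \nabla H$. With the Hamiltonian

\[ H = \frac{1}{2} \left( \frac{m_1^2}{I_1} + \frac{m_2^2}{I_2} + \frac{m_3^2}{I_3} \right) \]

we obtain

\[ \dot{m}_1 = \frac{I_2 - I_3}{I_2 I_3}\,m_2 m_3, \qquad
   \dot{m}_2 = \frac{I_3 - I_1}{I_3 I_1}\,m_3 m_1, \qquad
   \dot{m}_3 = \frac{I_1 - I_2}{I_1 I_2}\,m_1 m_2 \]

These are Euler's equations for the free rigid body.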
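
Euler's equations can be integrated directly from the form $\dot{m} = m \times \nabla H$. The sketch below uses a classical fourth-order Runge-Kutta step (an illustrative choice, not a structure-preserving one) with hypothetical values for the inertia parameters. Besides the energy H, the quantity $|m|^2$ is conserved: it Poisson-commutes with every function, which reflects the degeneracy of the Lie-Poisson bracket on a three-dimensional space.

```python
def cross(a, b):
    """Vector product on R^3."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def euler_rhs(m, inertia):
    """Right-hand side m x grad H for H = sum(m_i^2 / I_i) / 2."""
    grad_h = tuple(mi / ii for mi, ii in zip(m, inertia))
    return cross(m, grad_h)


def rk4_step(m, inertia, dt):
    """One classical Runge-Kutta step for m' = m x grad H."""
    def shifted(base, direction, scale):
        return tuple(x + scale * y for x, y in zip(base, direction))

    k1 = euler_rhs(m, inertia)
    k2 = euler_rhs(shifted(m, k1, dt / 2), inertia)
    k3 = euler_rhs(shifted(m, k2, dt / 2), inertia)
    k4 = euler_rhs(shifted(m, k3, dt), inertia)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(m, k1, k2, k3, k4)
    )


def energy(m, inertia):
    return 0.5 * sum(mi * mi / ii for mi, ii in zip(m, inertia))


def casimir(m):
    return sum(mi * mi for mi in m)


inertia = (1.0, 2.0, 3.0)  # hypothetical inertia parameters
m = (1.0, 2.0, 3.0)
for _ in range(2000):
    m = rk4_step(m, inertia, 0.001)
print(energy(m, inertia), casimir(m))
```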


Reduction by Symmetries 


The examples discussed so far are all canonical
examples of Poisson brackets, defined either on a
symplectic manifold $(P, \omega)$ or $T^*Q$, or on the dual of
a Lie algebra $\mathfrak{g}^*$. Different, noncanonical Poisson
brackets can arise from symmetries. Assume that a
Lie group G is acting in a Hamiltonian way on the
Poisson manifold $(P, \{.,.\})$. This means that we have
a smooth map $\varphi: G \times P \to P$, $\varphi(g, p) = g \cdot p$, such
that the induced maps $\varphi_g = \varphi(g, \cdot): P \to P$ are
canonical transformations, for each $g \in G$. In terms
of Poisson manifolds, a canonical transformation is
a smooth map that preserves the Poisson bracket.
So, the action of G on P is a Hamiltonian action if
$\varphi_g^*\{F, H\} = \{\varphi_g^*F, \varphi_g^*H\}$ for all $F, H \in C^\infty(P)$, $g \in G$.
For any $\xi \in \mathfrak{g}$, the canonical transformations $\varphi_{\exp(t\xi)}$
generate a Hamiltonian vector field $\xi_P$ on P and a
momentum map $J: P \to \mathfrak{g}^*$ given by $J(x)(\xi) = F(x)$,
which is $\mathrm{Ad}^*$ equivariant.

If a Hamiltonian system $X_H$ is invariant under a
Lie group action, that is, $H(\varphi_g(x)) = H(x)$, then we
obtain a reduced Hamiltonian system on a reduced
phase space (reduced Poisson manifold). We recall
the Marsden-Weinstein reduction theorem:


Reduction Theorem. For a Hamiltonian action of
a Lie group G on a Poisson manifold $(P, \{.,.\})$,
there is an equivariant momentum map $J: P \to \mathfrak{g}^*$,
and for every regular $\mu \in \mathfrak{g}^*$ the reduced phase
space $P_\mu = J^{-1}(\mu)/G_\mu$ carries an induced Poisson
structure $\{.,.\}_\mu$ ($G_\mu$ being the isotropy group). Any
G-invariant Hamiltonian H on P defines a
Hamiltonian $H_\mu$ on the reduced phase space $P_\mu$,
and the integral curves of the vector field $X_H$
project onto integral curves of the induced vector
field $X_{H_\mu}$ on the reduced space $P_\mu$.


Example: rigid body. The rigid body discussed
above can be viewed as an example of this
reduction theorem. If $P = T^*G$ and G is acting on
$T^*G$ by the cotangent lift of the left-translation
$l_g: G \to G$, $l_g(h) = gh$, then the momentum map
$J: T^*G \to \mathfrak{g}^*$ is given by $J(\alpha_g) = T_e^*R_g(\alpha_g)$ and the
reduced phase space $(T^*G)_\mu = J^{-1}(\mu)/G_\mu$ is
isomorphic to the coadjoint orbit $\mathcal{O}_\mu$ through $\mu \in \mathfrak{g}^*$.
Each coadjoint orbit $\mathcal{O}_\mu$ carries a natural symplectic
structure $\omega_\mu$ and in this case, the reduced
Lie-Poisson bracket $\{.,.\}_\mu$ on the coadjoint orbit $\mathcal{O}_\mu$ is
induced by the symplectic form $\omega_\mu$ on $\mathcal{O}_\mu$ as in [9].
Furthermore, $T^*G/G \cong \mathfrak{g}^*$, and the induced Poisson
bracket $\{.,.\}_\mu$ on $\mathcal{O}_\mu$ is identical with the
Lie-Poisson bracket restricted to the coadjoint orbit
$\mathcal{O}_\mu \subset \mathfrak{g}^*$. For the rigid body this construction is
applied to G = SO(3).


We now discuss some infinite-dimensional exam- 
ples of reduced Hamiltonian systems. 


Infinite-Dimensional Lie Groups 


A general theory of infinite-dimensional Lie groups 
is hardly developed. Even Bourbaki only develops a 
theory of infinite-dimensional manifolds, but all of 
the important theorems about Lie groups are stated 
for finite-dimensional ones. 
An infinite-dimensional Lie group $G$ is a group
and an infinite-dimensional manifold with smooth
group operations


$m: G \times G \to G, \quad m(g,h) = g \cdot h, \quad C^\infty$ [12]


$i: G \to G, \quad i(g) = g^{-1}, \quad C^\infty$ [13]


Such a Lie group $G$ is locally diffeomorphic to an
infinite-dimensional vector space. This can be a
Banach space whose topology is given by a norm
$\|\cdot\|$, a Hilbert space whose topology is given by an
inner product $(\cdot,\cdot)$, or a Fréchet space whose
topology is given by a metric but not by a norm.
Depending on the choice of the topology on $G$, the
Banach, Hilbert, or Fréchet Lie groups, respectively,
can be treated.

The Lie algebra $\mathfrak{g}$ of $G$ is defined as
$\mathfrak{g} = \{\text{left-invariant vector fields on } G\} \cong T_eG$, where
the isomorphism is given (as in finite dimensions) by


$\xi \in T_eG \mapsto X^\xi(g) = T_eL_g(\xi)$ [14]


and the Lie bracket on $\mathfrak{g}$ is induced by the Lie bracket
of left-invariant vector fields $[\xi,\eta] = [X^\xi, X^\eta](e)$,
$\xi, \eta \in \mathfrak{g}$.

These definitions in infinite dimensions are identical
with the definitions in finite dimensions. The
big difference, however, is that infinite-dimensional
manifolds, hence Lie groups, are not locally compact.
For Fréchet Lie groups, one has the additional
nontrivial difficulty of defining the differentiability
of functions defined on a Fréchet space. Hence, the
very definition of a Fréchet manifold is not
canonical. This problem does not arise for Banach
and Hilbert Lie groups; the differential calculus
extends in a straightforward manner from $\mathbb{R}^n$ to
Banach and Hilbert spaces, but not to Fréchet
spaces.


Finite- versus Infinite-Dimensional 
Lie Groups 


The lack of local compactness of infinite-dimensional 
Lie groups causes some deficiencies of the Lie theory 
in infinite dimensions. Some classical results in finite 
dimensions are summarized below, which are not 
true in general in infinite dimensions: 


1. The exponential map $\exp: \mathfrak{g} \to G$ is defined as
follows: To each $\xi \in \mathfrak{g}$ we assign the corresponding
left-invariant vector field $X^\xi$ defined by [14].
We take the flow $\varphi^\xi(t)$ of $X^\xi$ and define
$\exp(\xi) = \varphi^\xi(1)$. The exponential map is a local
diffeomorphism from a neighborhood of zero in $\mathfrak{g}$
onto a neighborhood of the identity in $G$; hence,
exp defines canonical coordinates on the Lie
group $G$. This is not true in infinite dimensions.

2. If $f_1, f_2: G_1 \to G_2$ are smooth Lie group
homomorphisms (i.e., $f_i(g \cdot h) = f_i(g) \cdot f_i(h)$, $i = 1, 2$)
with $T_ef_1 = T_ef_2$, then locally $f_1 = f_2$. This is not
true in infinite dimensions.

3. If $H$ is a closed subgroup of $G$, then $H$ is a Lie
subgroup of $G$. This is not true in infinite
dimensions.

4. For any finite-dimensional Lie algebra $\mathfrak{g}$, there
exists a connected Lie group $G$ whose Lie algebra
is $\mathfrak{g}$, that is, such that $\mathfrak{g} \cong T_eG$. This is not true in
infinite dimensions.


Some classical finite-dimensional examples of Lie
groups are the matrix groups $GL(n)$, $SL(n)$, $O(n)$,
$SO(n)$, $U(n)$, $SU(n)$, $Sp(n)$ with smooth group
operations given by matrix multiplication and
matrix inversion.
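
For a matrix group the exponential map of item 1 is the ordinary matrix exponential. The following is a minimal sketch for $G = SO(3)$, assuming a truncated power series is accurate enough for small arguments; the helper names (`hat`, `expm`, `det3`) are illustrative and not taken from the text. It maps an element of the Lie algebra $\mathfrak{so}(3)$ (a skew-symmetric matrix) to the group and lets one check that the result is a rotation.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def hat(w):
    """Identify w in R^3 with a skew-symmetric matrix, an element of so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def expm(a, terms=30):
    """Matrix exponential via the truncated series sum_k a^k / k!."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# exp of an element of so(3) should be orthogonal with determinant 1.
R = expm(hat((0.3, -0.2, 0.5)))
```

Here `R` satisfies $R^TR = 1$ and $\det R = 1$ up to rounding, and `expm(hat((0, 0, t)))` reproduces the rotation by the angle $t$ about the third axis.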


Examples of Infinite-Dimensional 
Lie Groups 


Abelian Gauge Group $G = (C^\infty(M), +)$


Let $M$ be a finite-dimensional manifold and let
$G = C^\infty(M)$, with group operation being addition,
that is, $m(f,g) = f + g$, $i(f) = -f$, $e = 0$. $G$ is an
abelian $C^\infty$ Fréchet Lie group with Lie algebra
$\mathfrak{g} = T_eC^\infty(M) \cong C^\infty(M)$, with trivial bracket
$[\xi,\eta] = 0$, and $\exp = \mathrm{id}$. If one completes these spaces
in the $C^k$-norm, $k < \infty$, then $G^k$ is a Banach Lie
group, and if the $H^s$-Sobolev norm is used with $s >$
$(1/2) \dim M$ then $G^s$ is a Hilbert Lie group.


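Application of $G = (C^\infty(M), +)$ to Maxwell's equations
Let $E$, $B$ be the electric and magnetic fields
on $\mathbb{R}^3$; then Maxwell's equations for a charge
density $\rho$ are:


$\dot{E} = \operatorname{curl} B, \quad \dot{B} = -\operatorname{curl} E$ [15]


$\operatorname{div} B = 0, \quad \operatorname{div} E = \rho$ [16]


Let $A$ be the magnetic potential such that $B = -\operatorname{curl} A$.
As configuration space, we take $V = \mathrm{Vec}(\mathbb{R}^3)$,
vector fields (potentials) on $\mathbb{R}^3$, so $A \in V$, and as
phase space, we have $P = T^*V \cong V \times V^* \ni (A, E)$,
with the standard $L^2$ pairing $\langle A, E\rangle = \int A(x)E(x)\,dx$,
and canonical Poisson bracket given by [5], which
becomes


$\{F, H\}(A, E) = \int \left( \frac{\delta F}{\delta A}\frac{\delta H}{\delta E} - \frac{\delta H}{\delta A}\frac{\delta F}{\delta E} \right) dx$ [17]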

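As Hamiltonian, we take the total electromagnetic
energy


$H(A, E) = \frac{1}{2}\int \left( |\operatorname{curl} A|^2 + |E|^2 \right) dx$


Then Hamilton's equations in the canonical variables
$A$ and $E$ are


$\dot{A} = \frac{\delta H}{\delta E} = E \;\Leftrightarrow\; \dot{B} = -\operatorname{curl} E$
and
$\dot{E} = -\frac{\delta H}{\delta A} = -\operatorname{curl}\operatorname{curl} A = \operatorname{curl} B$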

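So the first two equations of Maxwell's equations [15]
are Hamilton's equations, the third one is obtained
automatically from the potential, $\operatorname{div} B = -\operatorname{div}\operatorname{curl} A
= 0$, and the fourth equation, $\operatorname{div} E = \rho$, is obtained
through the following symmetry (gauge invariance):
the Lie group $G = (C^\infty(\mathbb{R}^3), +)$ acts on $V$
by $\varphi \cdot A = A + \nabla\varphi$, $\varphi \in G$, $A \in V$. The lifted action
to $V \times V^*$ becomes $\varphi \cdot (A, E) = (A + \nabla\varphi, E)$, and
has the momentum map $J: V \times V^* \to \mathfrak{g}^* \cong \{\text{charge
densities}\}$


$J(A, E) = \operatorname{div} E$ [18]


With $\mathfrak{g} = C^\infty(\mathbb{R}^3)$ and $\mathfrak{g}^* = \mathrm{Den}(\mathbb{R}^3)$, we identify
the elements of $\mathfrak{g}^*$ with charge densities. The
Hamiltonian $H$ is $G$ invariant, that is,
$H(\varphi \cdot (A, E)) = H(A + \nabla\varphi, E) = H(A, E)$. Then the reduced
phase space for $\rho \in \mathfrak{g}^*$ is


$(V \times V^*)_\rho = J^{-1}(\rho)/G = \{(E, B) \mid \operatorname{div} E = \rho, \ \operatorname{div} B = 0\}$


and the reduced Hamiltonian is

$H_\rho(E, B) = \frac{1}{2}\int \left( |E|^2 + |B|^2 \right) dx$ [19]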


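The reduced Poisson bracket becomes, for any
functions $F$, $H$ on $(V \times V^*)_\rho$,


$\{F, H\}_\rho(E, B) = \int \left( \frac{\delta F}{\delta E}\cdot\operatorname{curl}\frac{\delta H}{\delta B} - \frac{\delta H}{\delta E}\cdot\operatorname{curl}\frac{\delta F}{\delta B} \right) dx$ [20]


and a straightforward computation shows that


$\dot{F} = \{F, H\}_\rho \;\Leftrightarrow\; \dot{E} = \operatorname{curl} B, \ \dot{B} = -\operatorname{curl} E, \ \operatorname{div} B = 0, \ \operatorname{div} E = \rho$ [21]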


So, Maxwell’s equations [15], [16] form an infinite- 
dimensional Hamiltonian system on this reduced 
phase space with respect to the reduced Poisson 
bracket. 
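
The gauge invariance $H(A + \nabla\varphi, E) = H(A, E)$ used in this reduction rests on the identity $\operatorname{curl}\nabla\varphi = 0$. The following sketch checks it on a small periodic grid with central differences. The grid size, the sample fields, and all function names are assumptions made for illustration; a discrete check of this kind only mirrors the continuum statement and does not prove it.

```python
import itertools
import math

N = 6                    # grid points per axis on a periodic box of side 2*pi
H = 2 * math.pi / N      # grid spacing
POINTS = list(itertools.product(range(N), repeat=3))


def shifted(p, axis, step):
    """Neighbour of grid point p along one axis, with periodic wrap-around."""
    q = list(p)
    q[axis] = (q[axis] + step) % N
    return tuple(q)


def partial(f, axis):
    """Central difference of a scalar grid function along one axis."""
    return {p: (f[shifted(p, axis, 1)] - f[shifted(p, axis, -1)]) / (2 * H)
            for p in POINTS}


def grad(phi):
    return [partial(phi, a) for a in range(3)]


def curl(A):
    d = [[partial(A[j], i) for j in range(3)] for i in range(3)]  # d[i][j] = d_i A_j
    return [{p: d[1][2][p] - d[2][1][p] for p in POINTS},
            {p: d[2][0][p] - d[0][2][p] for p in POINTS},
            {p: d[0][1][p] - d[1][0][p] for p in POINTS}]


def sample(fn):
    """Evaluate fn(x, y, z) at every grid point."""
    return {p: fn(H * p[0], H * p[1], H * p[2]) for p in POINTS}


def energy(A, E):
    """Discrete version of (1/2) * integral(|curl A|^2 + |E|^2) dx."""
    B = curl(A)  # the sign convention B = -curl A does not affect |B|^2
    total = sum(B[a][p] ** 2 + E[a][p] ** 2 for a in range(3) for p in POINTS)
    return 0.5 * H ** 3 * total


def gauge(A, phi):
    """Gauge action A -> A + grad(phi)."""
    g = grad(phi)
    return [{p: A[a][p] + g[a][p] for p in POINTS} for a in range(3)]


A = [sample(lambda x, y, z: math.sin(y)),
     sample(lambda x, y, z: math.cos(z) * math.sin(x)),
     sample(lambda x, y, z: math.cos(x + y))]
E = [sample(lambda x, y, z: math.cos(z)),
     sample(lambda x, y, z: math.sin(x)),
     sample(lambda x, y, z: math.sin(y))]
phi = sample(lambda x, y, z: math.sin(x) * math.cos(2 * y) + math.cos(z))
```

Because the discrete partial derivatives commute, `curl(grad(phi))` vanishes up to rounding, so `energy(gauge(A, phi), E)` and `energy(A, E)` agree.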
Abelian Gauge Group $G = (C^k(M, \mathbb{R} - \{0\}), \cdot)$


Let $M$ be a finite-dimensional manifold and let
$G = C^k(M, \mathbb{R} - \{0\})$, the group operation being the
multiplication, that is, $m(f, g) = f \cdot g$, $i(f) = f^{-1}$, $e = 1$.
For $k < \infty$, $C^k(M, \mathbb{R} - \{0\})$ is open in $C^k(M, \mathbb{R})$, and
if $M$ is compact then $C^k(M, \mathbb{R} - \{0\})$ is a Banach Lie
group. If $s > (1/2)\dim M$ then $H^s(M, \mathbb{R} - \{0\})$ is
closed under multiplication, and if $M$ is compact
then $H^s(M, \mathbb{R} - \{0\})$ is a Hilbert Lie group.


Nonabelian Gauge Groups $\mathcal{G} = (C^k(M, G), \cdot)$


The abelian example can be generalized by replacing
$\mathbb{R} - \{0\}$ with any finite-dimensional (nonabelian) Lie
group $G$. Let $\mathcal{G} = C^k(M, G)$ with pointwise group
operations $m(f, g)(x) = f(x) \cdot g(x)$, $x \in M$, and $i(f)(x) =
(f(x))^{-1}$, where "$\cdot$" and "$(\cdot)^{-1}$" are the operations
in $G$. If $k < \infty$ then $C^k(M, G)$ is a Banach Lie
group. Let $\mathfrak{g}$ denote the Lie algebra of $G$; then the
Lie algebra of $\mathcal{G} = C^k(M, G)$ is $C^k(M, \mathfrak{g})$, with
pointwise Lie bracket $[\xi, \eta](x) = [\xi(x), \eta(x)]$, $x \in M$,
the latter bracket being the Lie bracket in $\mathfrak{g}$.
The exponential map $\exp: \mathfrak{g} \to G$ defines the
exponential map $\mathrm{EXP}: C^k(M, \mathfrak{g}) \to \mathcal{G} = C^k(M, G)$,
$\mathrm{EXP}(\xi) = \exp \circ\, \xi$, which is a local diffeomorphism.
The same holds for $H^s(M, G)$ if $s > (1/2) \dim M$.

Applications of these infinite-dimensional Lie 
groups are in gauge theories and quantum field 
theory, where they appear as groups of gauge 
transformations. 


Loop Groups $\mathcal{G} = C^k(S^1, G)$


As a special case of the example above, we take
$M = S^1$, the circle. Then $\mathcal{G} = C^k(S^1, G) = L^k(G)$ is
called a loop group and $C^k(S^1, \mathfrak{g}) = l^k(\mathfrak{g})$ its loop
algebra. They find applications in the theory of
affine Lie algebras, Kac-Moody Lie algebras (central
extensions), completely integrable systems, soliton
equations (Toda, Korteweg-de Vries (KdV),
Kadomtsev-Petviashvili (KP)), and quantum field theory.
Central extensions of loop algebras are examples of
infinite-dimensional Lie algebras which need not
have a corresponding Lie group.


Diffeomorphism Groups 


Among the most important "classical" infinite-dimensional
Lie groups are the diffeomorphism
groups of manifolds. Their differential structure is
not that of a Banach Lie group as defined above.
Nevertheless, they have important applications.

Let $M$ be a compact manifold (the noncompact
case is technically much more complicated but
similar results are true) and let $G = \mathrm{Diff}^\infty(M)$ be
the group of all smooth diffeomorphisms on $M$,
with group operation being the composition, that is,
$m(f, g) = f \circ g$, $i(f) = f^{-1}$, $e = \mathrm{id}_M$. For $C^\infty$
diffeomorphisms, $\mathrm{Diff}^\infty(M)$ is a Fréchet manifold and
there are nontrivial problems with the notion
of smooth maps between Fréchet spaces. There is
no canonical extension of the differential calculus
from Banach spaces (same as for $\mathbb{R}^n$) to Fréchet
spaces. One possibility is to generalize the notion
of differentiability. For example, if we use the
so-called $C^\infty_\Gamma$ differentiability, then $G = \mathrm{Diff}^\infty(M)$
becomes a $C^\infty_\Gamma$ Lie group with $C^\infty_\Gamma$ differentiable
group operations. These notions of differentiability
are difficult to apply to concrete examples.
Another possibility is to complete $\mathrm{Diff}^\infty(M)$ in
the Banach $C^k$-norm, $0 \le k < \infty$, or in the Sobolev
$H^s$-norm, $s > (1/2) \dim M$; $\mathrm{Diff}^k(M)$ and $\mathrm{Diff}^s(M)$
become, in this case, Banach and Hilbert manifolds,
respectively. Then we consider the inverse
limits of these Banach and Hilbert Lie groups,
respectively:


$\mathrm{Diff}^\infty(M) = \varprojlim \mathrm{Diff}^k(M)$ [22]


becomes the so-called inverse limit of Banach (ILB)
Lie group, or with the Sobolev topologies


$\mathrm{Diff}^\infty(M) = \varprojlim \mathrm{Diff}^s(M)$ [23]


becomes the so-called inverse limit of Hilbert (ILH)
Lie group. Nevertheless, the group operations are
not smooth, but have the following differentiability
properties. If the diffeomorphism group is equipped
with the Sobolev $H^s$-topology, then $\mathrm{Diff}^s(M)$
becomes a $C^\infty$ Hilbert manifold if $s > (1/2) \dim M$
and the group multiplication


$m: \mathrm{Diff}^{s+k}(M) \times \mathrm{Diff}^s(M) \to \mathrm{Diff}^s(M)$ [24]


is $C^k$ differentiable; hence, for $k = 0$, $m$ is only
continuous on $\mathrm{Diff}^s(M)$. The inversion


$i: \mathrm{Diff}^{s+k}(M) \to \mathrm{Diff}^s(M)$ [25]


is $C^k$ differentiable; hence, for $k = 0$, $i$ is only
continuous on $\mathrm{Diff}^s(M)$. The same differentiability
properties of $m$ and $i$ hold in the $C^k$ topology. This
situation leads to the notion of nested Lie groups.

The Lie algebra of $\mathrm{Diff}^\infty(M)$ is given by
$\mathfrak{g} = T_e\mathrm{Diff}^\infty(M) \cong \mathrm{Vec}^\infty(M)$, the space of smooth
vector fields on $M$. Note that the space $\mathrm{Vec}(M)$
of all vector fields is a Lie algebra only for $C^\infty$
vector fields, but not for $C^k$ or $H^s$ vector fields if
$k < \infty$, $s < \infty$, because one loses derivatives by
taking brackets.

The exponential map on the diffeomorphism
group is given as follows: for any vector field $X \in
\mathrm{Vec}^\infty(M)$ take its flow $\varphi_t \in \mathrm{Diff}^\infty(M)$, then define
$\mathrm{EXP}: \mathrm{Vec}^\infty(M) \to \mathrm{Diff}^\infty(M)$, $X \mapsto \varphi_1$, the flow at
time $t = 1$. The exponential map EXP is not a local
diffeomorphism; it is not even locally surjective.

Applications of $\mathrm{Diff}^\infty(M)$ occur in general relativity,
where the diffeomorphism group plays the
role of a symmetry group of coordinate transformations.
Let $(M, g)$ be a Lorentz 4-manifold. Then the
vacuum Einstein field equations are


$\mathrm{Ric}(g) = 0$


These are invariant under coordinate transformations,
that is, under the action of $\mathrm{Diff}^\infty(M)$.
Moreover, Einstein's field equations form a
Hamiltonian system on the space $P = \{\text{metrics
on } M\}/\mathrm{Diff}^\infty(M)$.


Subgroups of $\mathrm{Diff}^\infty(M)$


Several subgroups of Diff^(M) have important 
applications. 


Group of volume-preserving diffeomorphisms Let
$\mu$ be a volume on $M$ and $G = \mathrm{Diff}^\infty_\mu(M) = \{f \in
\mathrm{Diff}^\infty(M) \mid f^*\mu = \mu\}$ the group of volume-preserving
diffeomorphisms. $\mathrm{Diff}^\infty_\mu(M)$ is a closed subgroup of
$\mathrm{Diff}^\infty(M)$ with Lie algebra $\mathfrak{g} = \mathrm{Vec}^\infty_\mu(M) = \{X \in
\mathrm{Vec}^\infty(M) \mid \operatorname{div}_\mu X = 0\}$, the space of divergence-free
vector fields on $M$. $\mathrm{Vec}^\infty_\mu(M)$ is a Lie subalgebra of
$\mathrm{Vec}^\infty(M)$.

Remark: We can apply neither the finite-dimensional
theorem that if $\mathrm{Vec}^\infty_\mu(M)$ is a Lie algebra
then there exists a Lie group whose Lie algebra it is,
nor the theorem that if $\mathrm{Diff}^\infty_\mu(M) \subset \mathrm{Diff}^\infty(M)$ is a closed subgroup
then it is a Lie subgroup.

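Applications of $\mathrm{Diff}^\infty_\mu(M)$ occur, for example, in
fluid dynamics. Euler's equations for an incompressible fluid,


$\frac{\partial u}{\partial t} + u \cdot \nabla u = -\nabla p, \quad \operatorname{div} u = 0$ [26]

are equivalent to the equations of geodesics on
$\mathrm{Diff}^\infty_\mu(M)$.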


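Symplectomorphism group Let $\omega$ be a symplectic
2-form on $M$ and $G = \mathrm{Diff}^\infty_\omega(M) = \{f \in \mathrm{Diff}^\infty(M) \mid
f^*\omega = \omega\}$ the group of canonical transformations (or
symplectomorphisms). $\mathrm{Diff}^\infty_\omega(M)$ is a closed subgroup
of $\mathrm{Diff}^\infty(M)$ with Lie algebra $\mathfrak{g} = \mathrm{Vec}^\infty_\omega(M) =
\{X \in \mathrm{Vec}^\infty(M) \mid L_X\omega = 0\}$, the space of locally
Hamiltonian vector fields on $M$. $\mathrm{Vec}^\infty_\omega(M)$ is a Lie
subalgebra of $\mathrm{Vec}^\infty(M)$.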

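Applications of symplectomorphism groups occur,
for example, in plasma physics. The Maxwell-Vlasov
equations for a plasma density $f(x, v, t)$ generating
the electric and magnetic fields $E$ and $B$ are


$\frac{\partial f}{\partial t} + v \cdot \frac{\partial f}{\partial x} + (E + v \times B) \cdot \frac{\partial f}{\partial v} = 0$

$\frac{\partial B}{\partial t} = -\operatorname{curl} E, \quad \frac{\partial E}{\partial t} = \operatorname{curl} B - J_f$ [27]

$\operatorname{div} E = \rho_f, \quad \operatorname{div} B = 0$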


where $J_f$ and $\rho_f$ are the current and charge densities,
respectively. This coupled nonlinear system of
evolution equations is an infinite-dimensional
Hamiltonian system of the form $\dot{F} = \{F, H\}_{\rho_f}$ on the
reduced phase space


$MV = (T^*\mathrm{Diff}^\infty_\omega(\mathbb{R}^6) \times T^*V)/C^\infty(\mathbb{R}^3)$ [28]


($V$ is the same space as in the example of Maxwell's
equations) with respect to the following reduced
Poisson bracket, which is induced via gauge symmetry
from the canonical Poisson bracket on
$T^*\mathrm{Diff}^\infty_\omega(\mathbb{R}^6) \times T^*V$:


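$\{F, G\}_{\rho_f}(f, E, B) = \int f \left\{ \frac{\delta F}{\delta f}, \frac{\delta G}{\delta f} \right\} dx\,dv$

$\quad + \int \left( \frac{\delta F}{\delta E} \cdot \operatorname{curl}\frac{\delta G}{\delta B} - \frac{\delta G}{\delta E} \cdot \operatorname{curl}\frac{\delta F}{\delta B} \right) dx$

$\quad + \int \left( \frac{\delta F}{\delta E} \cdot \frac{\partial f}{\partial v}\frac{\delta G}{\delta f} - \frac{\delta G}{\delta E} \cdot \frac{\partial f}{\partial v}\frac{\delta F}{\delta f} \right) dx\,dv$

$\quad + \int f B \cdot \left( \frac{\partial}{\partial v}\frac{\delta F}{\delta f} \times \frac{\partial}{\partial v}\frac{\delta G}{\delta f} \right) dx\,dv$ [29]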


and with Hamiltonian


$H(f, E, B) = \frac{1}{2}\int |v|^2 f(x, v, t)\,dx\,dv + \frac{1}{2}\int \left( |E|^2 + |B|^2 \right) dx$ [30]


More complicated plasma models are formulated
as Hamiltonian systems. For example, for the
two-fluid model the phase space is constituted by
coadjoint orbits of the semidirect product ($\ltimes$)
group $G = \mathrm{Diff}^\infty(\mathbb{R}^3) \ltimes (C^\infty(\mathbb{R}^3) \times C^\infty(\mathbb{R}^3))$. For the
MHD model: $G = \mathrm{Diff}^\infty(\mathbb{R}^3) \ltimes (C^\infty(\mathbb{R}^3) \times \Omega^2(\mathbb{R}^3))$.


The KdV Equation and Fourier Integral 
Operators 


There are many known examples of PDEs which are
infinite-dimensional Hamiltonian systems, such as the
Benjamin-Ono, Boussinesq, Harry Dym, KdV, and KP
equations and others. In many cases, the Poisson
structures and Hamiltonians are given ad hoc on a
formal level. This is illustrated here with the KdV
equation, where at least one of the three known
Hamiltonian structures is well understood.
The KdV equation


$u_t + 6uu_x + u_{xxx} = 0$ [31]


is an infinite-dimensional Hamiltonian system with
the Lie group of invertible Fourier integral operators
being a symmetry group. Gardner found that with the
bracket


$\{F, G\} = \int_0^{2\pi} \frac{\delta F}{\delta u}\,\frac{\partial}{\partial x}\,\frac{\delta G}{\delta u}\,dx$ [32]


and Hamiltonian


$H(u) = \int_0^{2\pi} \left( u^3 + \tfrac{1}{2}u_x^2 \right) dx$ [33]

$u$ satisfies the KdV equation [31] if and only if
$u_t = \{u, H\}$
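
As a concrete instance of equation [31], the following sketch checks numerically that the classical one-soliton profile $u(x, t) = \frac{c}{2}\operatorname{sech}^2\!\left(\frac{\sqrt{c}}{2}(x - ct)\right)$ solves $u_t + 6uu_x + u_{xxx} = 0$. It assumes the equation is posed on the whole real line rather than on the circle used in the text, and it estimates the derivatives by central differences; the function names and the step size `h` are illustrative choices.

```python
import math


def soliton(x, t, c=1.0):
    """One-soliton profile u = (c/2) sech^2( sqrt(c)/2 * (x - c t) )."""
    s = 0.5 * math.sqrt(c) * (x - c * t)
    return 0.5 * c / math.cosh(s) ** 2


def kdv_residual(u, x, t, h=1e-2):
    """Central-difference estimate of u_t + 6 u u_x + u_xxx at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t)
             + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return u_t + 6 * u(x, t) * u_x + u_xxx


def wrong_speed(x, t):
    """Same profile transported at twice the correct speed (not a solution)."""
    return soliton(x - t, t)


# Residual of the soliton at a few sample points; small up to O(h^2) error.
residuals = [kdv_residual(soliton, x, 0.3) for x in (-2.0, -0.5, 0.0, 1.0, 3.0)]
```

The residual of `soliton` is of the order of the discretization error, whereas `wrong_speed` leaves a residual of order one, which shows that the speed of the wave is tied to its amplitude.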


An important question concerns the origin of the
Poisson bracket [32] and Hamiltonian [33]. It was
shown earlier that this bracket is the Lie-Poisson
bracket on a coadjoint orbit of the Lie group $G = \mathrm{FIO}$, the
group of invertible Fourier integral operators on the
circle $S^1$. The latter is discussed briefly in the following.

A Fourier integral operator on a compact manifold
$M$ is an operator


$A: C^\infty(M) \to C^\infty(M)$ [34]

locally given by


$A(u)(x) = (2\pi)^{-n} \iint e^{i\varphi(x, y, \xi)}\, a(x, \xi)\, u(y)\,dy\,d\xi$ [35]


where $\varphi(x, y, \xi)$ is a phase function with certain
properties and the symbol $a(x, \xi)$ belongs to a certain
symbol class. A pseudodifferential operator is a
special kind of Fourier integral operator, locally of
the form


$P(u)(x) = (2\pi)^{-n} \iint e^{i(x - y)\cdot\xi}\, p(x, \xi)\, u(y)\,dy\,d\xi$ [36]


Denote by FIO and $\Psi$DO the groups under composition
(operator product) of invertible Fourier integral
operators and invertible pseudodifferential operators
on $M$, respectively. Then we have the following results.

Both groups $\Psi$DO and FIO are smooth infinite-dimensional
ILH Lie groups. The smoothness
properties of the group operations (operator multiplication
and inversion) are similar to the case of
diffeomorphism groups [24] and [25]. The Lie
algebra of both ILH Lie groups $\Psi$DO and FIO is
the Lie algebra of all pseudodifferential operators
under the commutator bracket. Moreover, FIO is a
smooth infinite-dimensional principal fiber bundle
over the diffeomorphism group of canonical transformations
$\mathrm{Diff}^\infty_\omega(T^*M - \{0\})$ with structure group
(gauge group) $\Psi$DO.

For the KdV equation, we take the special case
where $M = S^1$. Then the Gardner bracket [32] is the
Lie-Poisson bracket on the coadjoint orbit of FIO
through the Schrödinger operator $P \in \Psi$DO. Complete
integrability of the KdV equation follows from
the infinite system of conserved integrals in involution
given by $H_k = \operatorname{tr}(P^k)$; in particular, the Hamiltonian
[33] equals $H = H_2$.


See also: Bi-Hamiltonian Methods in Soliton Theory; 
Functional Integration in Quantum Physics; Hamiltonian 
Fluid Dynamics; Hamiltonian Systems: Obstructions to 
Integrability; Korteweg-de Vries Equation and Other 
Modulation Equations; Symmetries and Conservation Laws. 
Introduction


Let $X$ be a closed (connected, compact without
boundary) smooth manifold of dimension 4, provided
with a Riemannian metric denoted by $g$. Let
$\Omega^p_X$ denote the space of smooth $p$-forms on $X$, that is,
the sections of $\Lambda^pT^*X$. The Hodge operator acting on
$p$-forms,


$*: \Omega^p_X \to \Omega^{4-p}_X$

satisfies $*^2 = (-1)^p$. In particular, $*$ splits $\Omega^2_X$ into
two subspaces $\Omega^{2,\pm}_X$ with eigenvalues $\pm 1$:


$\Omega^2_X = \Omega^{2,+}_X \oplus \Omega^{2,-}_X$ [1]


Note also that this decomposition is an orthogonal
one, with respect to the inner product:


$\langle \omega_1, \omega_2 \rangle = \int_X \omega_1 \wedge *\omega_2$


A 2-form $\omega$ is said to be self-dual if $*\omega = \omega$ and it
is said to be anti-self-dual if $*\omega = -\omega$. Any 2-form $\omega$
can be written as the sum


$\omega = \omega^+ + \omega^-$


of its self-dual $\omega^+$ and anti-self-dual $\omega^-$ components.
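
The splitting of 2-forms into self-dual and anti-self-dual parts can be made explicit at a single point. The following sketch assumes flat $\mathbb{R}^4$ with the standard orientation $dx^1 \wedge dx^2 \wedge dx^3 \wedge dx^4$ and builds the $6 \times 6$ matrix of $*$ in the basis $dx^i \wedge dx^j$, $i < j$; indices are counted from 0 in the code, and the function names are illustrative.

```python
import itertools


def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
                     if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


# Index pairs (i, j), i < j, labelling the basis 2-forms dx^i ^ dx^j.
BASIS = list(itertools.combinations(range(4), 2))


def hodge_star_matrix():
    """Matrix of * on 2-forms of Euclidean R^4 in the basis BASIS."""
    n = len(BASIS)
    star = [[0] * n for _ in range(n)]
    for col, (i, j) in enumerate(BASIS):
        k, l = [m for m in range(4) if m not in (i, j)]
        # *(dx^i ^ dx^j) = sign(i j k l) dx^k ^ dx^l
        star[BASIS.index((k, l))][col] = perm_sign((i, j, k, l))
    return star


def apply(matrix, vector):
    return [sum(row[c] * vector[c] for c in range(len(vector))) for row in matrix]


STAR = hodge_star_matrix()
trace = sum(STAR[d][d] for d in range(6))

# dx^0 ^ dx^1 + dx^2 ^ dx^3 is self-dual; with a minus sign it is anti-self-dual.
self_dual = [1, 0, 0, 0, 0, 1]
anti_self_dual = [1, 0, 0, 0, 0, -1]
```

Since `STAR` squares to the identity and has trace zero, its eigenvalues $+1$ and $-1$ each occur with multiplicity 3, in agreement with the decomposition [1].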
Now let $E$ be a complex vector bundle over $X$ as
above, provided with a connection $\nabla$, regarded as a
$\mathbb{C}$-linear operator


$\nabla: \Gamma(E) \to \Gamma(E) \otimes \Omega^1_X$

satisfying the Leibniz rule:

$\nabla(f\sigma) = f\nabla\sigma + \sigma \otimes df$


for all $f \in C^\infty(X)$ and $\sigma \in \Gamma(E)$. Its curvature
$F_\nabla = \nabla \circ \nabla$ is a 2-form with values in $\mathrm{End}(E)$, that
is, $F_\nabla \in \Gamma(\mathrm{End}(E)) \otimes \Omega^2_X$, satisfying the Bianchi
identity $\nabla F_\nabla = 0$.

The Yang-Mills equation is


$\nabla{*}F_\nabla = 0$ [2]


It is a second-order nonlinear equation on the
connection $\nabla$. It amounts to a nonabelian generalization
of Maxwell's equations, to which it reduces
when $E$ is a line bundle; the four components of $\nabla$
are interpreted as the electric and magnetic
potentials.

An instanton on $E$ is a smooth connection $\nabla$
whose curvature $F_\nabla$ is anti-self-dual as a 2-form,
that is, it satisfies:


$F^+_\nabla = 0$, that is, $*F_\nabla = -F_\nabla$ [3]


The instanton equation is still nonlinear (it is linear
only if $E$ is a line bundle), but it is only first-order
on the connection.


Note that if $F_\nabla$ is either self-dual or anti-self-dual
as a 2-form, then the Yang-Mills equation is
automatically satisfied:


$*F_\nabla = \pm F_\nabla \;\Rightarrow\; \nabla{*}F_\nabla = \pm\nabla F_\nabla = 0$


by the Bianchi identity. In other words, instantons
are particular solutions of the Yang-Mills equation.
Furthermore, while the Yang-Mills equation [2]
makes sense over any Riemannian manifold, the
instanton equation [3] is well defined only in
dimension 4.

A gauge transformation is a bundle automorphism
$g: E \to E$ covering the identity. The set of all gauge
transformations of a given bundle $E \to X$ forms a
group through composition, called the gauge group
and denoted by $\mathcal{G}(E)$. The gauge group acts on the
set of all smooth connections on $E$ by conjugation:


$g \cdot \nabla = g^{-1}\nabla g$


It is then easy to see that [3] is a gauge-invariant
condition, since $F_{g\cdot\nabla} = g^{-1}F_\nabla g$. The anti-self-duality
equation [3] is also conformally invariant: a conformal
change in the metric does not change the
decomposition [1], so it preserves self-dual and
anti-self-dual 2-forms.

The topological charge $k$ of the instanton $\nabla$ is
defined by the integral


$k = -\frac{1}{8\pi^2}\int_X \operatorname{tr}(F_\nabla \wedge F_\nabla) = c_2(E) - \frac{1}{2}c_1(E)^2$ [4]

where the second equality follows from Chern-Weil
theory.

If X is a smooth, noncompact, complete Rieman- 
nian manifold, an instanton on X is an anti-self-dual 
connection for which the integral [4] converges. 
Note that, in this case, k as above need not be an 
integer; however, it is always expected to be 
quantized, that is, always a multiple of some fixed 
(rational) number which depends only on the base 
manifold X. 


Summary This note is organized as follows.
After revisiting the variational approach to the
anti-self-duality equation [3], we study instantons
over the simplest possible Riemannian 4-manifold,
$\mathbb{R}^4$ with the flat Euclidean metric. In the subsequent
sections, we present 't Hooft's explicit
solutions, the ADHM construction, and its dimensional
reductions to $\mathbb{R}^3$, $\mathbb{R}^2$ and $\mathbb{R}$. We conclude by
explaining the construction of the central object of
study in gauge theory, the instanton moduli
spaces.
Variational Aspects of Yang-Mills
Equation


Given a fixed smooth vector bundle $E \to X$, let $\mathcal{A}(E)$
be the set of all (smooth) connections on $E$. The
Yang-Mills functional is defined by


$\mathrm{YM}: \mathcal{A}(E) \to \mathbb{R}$


$\mathrm{YM}(\nabla) = \|F_\nabla\|^2_{L^2} = \int_X \operatorname{tr}(F_\nabla \wedge *F_\nabla)$ [5]

The Euler-Lagrange equation for this functional is
exactly the Yang-Mills equation [2]. In particular,
self-dual and anti-self-dual connections yield critical
points of the Yang-Mills functional.

Splitting the curvature into its self-dual and
anti-self-dual parts, we have


$\mathrm{YM}(\nabla) = \|F^+_\nabla\|^2_{L^2} + \|F^-_\nabla\|^2_{L^2}$


It is then easy to see that every anti-self-dual
connection $\nabla$ is an absolute minimum for the
Yang-Mills functional, and that $\mathrm{YM}(\nabla)$ coincides
with the topological charge [4] of the instanton $\nabla$
times $8\pi^2$.

One can construct, for various 4-manifolds but
most interestingly for $X = S^4$, solutions of the
Yang-Mills equations which are neither self-dual
nor anti-self-dual. Such solutions do not minimize
[5]. Indeed, at least for gauge group $SU(2)$ or
$SU(3)$, it can be shown that there are no other
local minima: any critical point which is neither
self-dual nor anti-self-dual is unstable and must be
a "saddle point" (Bourguignon and Lawson
Jr. 1981).


Instantons on Euclidean Space 


Let $X = \mathbb{R}^4$ with the flat Euclidean metric, and
consider a Hermitian vector bundle $E \to \mathbb{R}^4$. Any
connection $\nabla$ on $E$ is of the form $d + A$, where $d$
denotes the usual de Rham operator and $A \in
\Gamma(\mathrm{End}(E)) \otimes \Omega^1_{\mathbb{R}^4}$ is a 1-form with values in the
endomorphisms of $E$; this can be written as follows:


$A = \sum_{k=1}^{4} A_k\,dx^k, \quad A_k: \mathbb{R}^4 \to \mathfrak{u}(r)$


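In the Euclidean coordinates $x_1, x_2, x_3, x_4$, the
anti-self-duality equation [3] is given by


$F_{12} = F_{34}, \quad F_{13} = -F_{24}, \quad F_{14} = F_{23}$

where

$F_{ij} = \frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j} + [A_i, A_j]$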
The simplest explicit solution is the charge-1 $SU(2)$
instanton on $\mathbb{R}^4$. The connection 1-form is given by


$A_0 = \frac{1}{1 + |x|^2} \cdot \operatorname{Im}(q\,d\bar{q})$ [6]

where $q$ is the quaternion $q = x_1 + x_2i + x_3j + x_4k$,
while Im denotes the imaginary part of the product
quaternion; we are regarding $i, j, k$ as a basis of the
Lie algebra $\mathfrak{su}(2)$; from this, one can compute the
curvature:


$F_{A_0} = \left( \frac{1}{1 + |x|^2} \right)^2 \cdot \operatorname{Im}(dq \wedge d\bar{q})$ [7]


We see that the action density function


$|F_A|^2 = \left( \frac{1}{1 + |x|^2} \right)^2$


has a bell-shaped profile centered at the origin and
decays like $r^{-4}$.

Let $t_{\lambda,y}: \mathbb{R}^4 \to \mathbb{R}^4$ be the conformal map given by the
composition of a translation by $y \in \mathbb{R}^4$ with a
homothety by $\lambda \in \mathbb{R}^+$. The pullback connection
$t^*_{\lambda,y}A_0$ is still anti-self-dual; more explicitly,


$A_{\lambda,y} = t^*_{\lambda,y}A_0 = \frac{\lambda^2}{\lambda^2 + |x - y|^2} \cdot \operatorname{Im}(q\,d\bar{q})$


$F_{A_{\lambda,y}} = \left( \frac{\lambda}{\lambda^2 + |x - y|^2} \right)^2 \cdot \operatorname{Im}(dq \wedge d\bar{q})$


Note that the action density function $|F_A|^2$ has again
a bell-shaped profile centered at $y$ and decays like
$r^{-4}$; the parameter $\lambda$ measures the concentration of
the energy density function, and can be interpreted
as the "size" of the instanton $A_{\lambda,y}$.

Instantons of topological charge $k$ can be obtained
by "superimposing" $k$ basic instantons, via the so-called
't Hooft ansatz. Consider the function
$\rho: \mathbb{R}^4 \to \mathbb{R}$ given by

$\rho(x) = 1 + \sum_{j=1}^{k} \frac{\lambda_j^2}{|x - y_j|^2}$


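where $\lambda_j \in \mathbb{R}$ and $y_j \in \mathbb{R}^4$. Then the connection
1-form $A = A_\mu dx_\mu$, with coefficients


$A_\mu = i \sum_{\nu} \bar{\sigma}_{\mu\nu}\,\frac{\partial}{\partial x_\nu} \ln(\rho(x))$ [8]


is anti-self-dual; here, $\bar{\sigma}_{\mu\nu}$ are the matrices given by
($\mu, \nu = 1, 2, 3$):


$\bar{\sigma}_{\mu\nu} = \frac{1}{4i}[\sigma_\mu, \sigma_\nu], \quad \bar{\sigma}_{\mu 4} = \frac{1}{2}\sigma_\mu$


where $\sigma_\mu$ are the Pauli matrices.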


The connection [8] corresponds to $k$ instantons
centered at the points $y_j$ with sizes $\lambda_j$. The basic
instanton [6] is exactly (modulo gauge transformation)
what one obtains from [8] for the case $k = 1$.
The 't Hooft instantons form a $5k$-parameter family
of anti-self-dual connections.

$SU(2)$ instantons are also the building blocks for
instantons with general structure group (Bernard
et al. 1977). Let $G$ be a compact semisimple Lie group,
with Lie algebra $\mathfrak{g}$. Let $\phi: \mathfrak{su}(2) \to \mathfrak{g}$ be any injective
Lie algebra homomorphism. If $A$ is an anti-self-dual
$SU(2)$ connection 1-form, then it is easy to see that
$\phi(A)$ is an anti-self-dual $G$-connection 1-form. Using
[8] as an example, we have that


$A = i \sum_{\mu,\nu} \phi(\bar{\sigma}_{\mu\nu})\,\frac{\partial}{\partial x_\nu} \ln(\rho(x))\,dx_\mu$ [9]


is a $G$-instanton on $\mathbb{R}^4$.

While this guarantees the existence of $G$-instantons
on $\mathbb{R}^4$, note that the instanton [9] might be
reducible (e.g., $\phi$ can simply be the obvious
inclusion of $\mathfrak{su}(2)$ into $\mathfrak{su}(n)$ for any $n$) and that
its charge depends on the choice of representation $\phi$.
Furthermore, it is not clear whether every
$G$-instanton can be obtained in this way, as the
inclusion of an $SU(2)$ instanton through some
representation $\phi: \mathfrak{su}(2) \to \mathfrak{g}$.


The ADHM Construction 


All $SU(r)$ instantons on $\mathbb{R}^4$ can be obtained through
a remarkable construction due to Atiyah, Drinfeld,
Hitchin, and Manin. It starts by considering
Hermitian vector spaces $V$ and $W$ of dimension $c$
and $r$, respectively, and the following data (the so-called
ADHM data):


$B_1, B_2 \in \mathrm{End}(V), \quad i \in \mathrm{Hom}(W, V), \quad j \in \mathrm{Hom}(V, W)$


Assume, moreover, that $(B_1, B_2, i, j)$ satisfy the
ADHM equations:


$[B_1, B_2] + ij = 0$ [10]


$[B_1, B_1^\dagger] + [B_2, B_2^\dagger] + ii^\dagger - j^\dagger j = 0$ [11]
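Now consider the following maps


$\alpha: V \times \mathbb{R}^4 \to (V \oplus V \oplus W) \times \mathbb{R}^4$
$\beta: (V \oplus V \oplus W) \times \mathbb{R}^4 \to V \times \mathbb{R}^4$


given as follows ($\mathbf{1}$ denotes the appropriate identity
matrix):

$\alpha(z_1, z_2) = \begin{pmatrix} B_1 + z_1\mathbf{1} \\ B_2 + z_2\mathbf{1} \\ j \end{pmatrix}$ [12]


$\beta(z_1, z_2) = \begin{pmatrix} -B_2 - z_2\mathbf{1} & B_1 + z_1\mathbf{1} & i \end{pmatrix}$ [13]


where $z_1 = x_1 + ix_2$ and $z_2 = x_3 + ix_4$ are complex
coordinates on $\mathbb{R}^4$. The maps [12] and [13] should
be understood as a family of linear maps parametrized
by points in $\mathbb{R}^4$.

A straightforward calculation shows that the ADHM
equation [10] implies that $\beta\alpha = 0$ for every $(z_1, z_2) \in
\mathbb{R}^4$. Therefore, the quotient $E = \ker\beta/\operatorname{im}\alpha = \ker\beta \cap
\ker\alpha^\dagger$ forms a complex vector bundle over $\mathbb{R}^4$ of rank $r$
whenever $(B_1, B_2, i, j)$ is such that $\alpha$ is injective and $\beta$ is
surjective for every $(z_1, z_2) \in \mathbb{R}^4$.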

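To define a connection on $E$, note that $E$ can be
regarded as a sub-bundle of the trivial bundle $(V \oplus
V \oplus W) \times \mathbb{R}^4$. So let $\iota: E \to (V \oplus V \oplus W) \times \mathbb{R}^4$ be the
inclusion, and let $P: (V \oplus V \oplus W) \times \mathbb{R}^4 \to E$ be the
orthogonal projection onto $E$. We can then define a
connection $\nabla$ on $E$ through the projection formula


$\nabla s = P\,d\,\iota(s)$

where $d$ denotes the trivial connection on the trivial
bundle $(V \oplus V \oplus W) \times \mathbb{R}^4$.
To see that this connection is anti-self-dual, note
that the projection $P$ can be written as follows:


$P = \mathbf{1} - D^\dagger\Xi^{-1}D$

where


$D: (V \oplus V \oplus W) \times \mathbb{R}^4 \to (V \oplus V) \times \mathbb{R}^4$


$D = \begin{pmatrix} \beta \\ \alpha^\dagger \end{pmatrix}$


and $\Xi = DD^\dagger$. Note that $D$ is surjective, so that $\Xi$ is
indeed invertible. Moreover, it also follows from
[11] that $\beta\beta^\dagger = \alpha^\dagger\alpha$, so that $\Xi^{-1} = (\beta\beta^\dagger)^{-1}\mathbf{1}$.

The curvature $F_\nabla$ is given by


$F_\nabla = P\left( d(\mathbf{1} - D^\dagger\Xi^{-1}D)d \right) = P\left( dD^\dagger\Xi^{-1}(dD) \right)$

$\quad = P\left( (dD^\dagger)\Xi^{-1}(dD) + D^\dagger d(\Xi^{-1}(dD)) \right)$

$\quad = P\left( (dD^\dagger)\Xi^{-1}(dD) \right)$


for $P\left( D^\dagger d(\Xi^{-1}(dD)) \right) = 0$ on $E = \ker D$. Since $\Xi^{-1}$ is
diagonal, we conclude that $F_\nabla$ is proportional to
$dD^\dagger \wedge dD$, as a 2-form.

It is then a straightforward calculation to show
that each entry of $dD^\dagger \wedge dD$ belongs to $\Omega^{2,-}$.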

The extraordinary accomplishment of Atiyah, Drinfeld,
Hitchin, and Manin was to show that every
instanton, up to gauge equivalence, can be obtained in
this way (see, e.g., Donaldson and Kronheimer 1990).
For instance, the basic $SU(2)$ instanton [6] is associated
with the following data ($c = 1$, $r = 2$):


$B_1 = B_2 = 0, \quad i = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad j = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$
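
The ADHM equations and the complex condition $\beta\alpha = 0$ are easy to check for small data. The sketch below does so for $c = 1$, $r = 2$ with $B_1 = B_2 = 0$, $i = (1\ 0)$ and $j = (0\ 1)^T$; this particular reading of the data for the basic instanton is an assumption, chosen so that the dimensions of $i \in \mathrm{Hom}(W, V)$ and $j \in \mathrm{Hom}(V, W)$ match. Matrices are plain lists of rows, and all helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]


def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), sign=-1)


def plus_scalar(b, z):
    """Return b + z * identity."""
    return [[b[r][c] + (z if r == c else 0) for c in range(len(b))]
            for r in range(len(b))]


def adhm_residuals(B1, B2, i_map, j_map):
    """Left-hand sides of the ADHM equations [10] and [11]."""
    eq10 = add(commutator(B1, B2), matmul(i_map, j_map))
    eq11 = add(add(commutator(B1, dagger(B1)), commutator(B2, dagger(B2))),
               add(matmul(i_map, dagger(i_map)), matmul(dagger(j_map), j_map), sign=-1))
    return eq10, eq11


def alpha(B1, B2, j_map, z1, z2):
    """Stack B1 + z1, B2 + z2 and j vertically, as in [12]."""
    return plus_scalar(B1, z1) + plus_scalar(B2, z2) + [list(row) for row in j_map]


def beta(B1, B2, i_map, z1, z2):
    """Concatenate -B2 - z2, B1 + z1 and i horizontally, as in [13]."""
    minus_b2 = [[-v for v in row] for row in plus_scalar(B2, z2)]
    return [m + p + list(q) for m, p, q in zip(minus_b2, plus_scalar(B1, z1), i_map)]


# Assumed data for the basic instanton: c = dim V = 1, r = dim W = 2.
B1 = [[0j]]
B2 = [[0j]]
i_map = [[1 + 0j, 0j]]        # 1 x 2 matrix, a map W -> V
j_map = [[0j], [1 + 0j]]      # 2 x 1 matrix, a map V -> W

eq10, eq11 = adhm_residuals(B1, B2, i_map, j_map)
z1, z2 = 0.3 + 1.2j, -0.7 + 0.4j
beta_alpha = matmul(beta(B1, B2, i_map, z1, z2), alpha(B1, B2, j_map, z1, z2))
```

Both residuals vanish for this data and `beta_alpha` is the zero $1 \times 1$ matrix at the sampled point, as expected from $\beta\alpha = [B_1, B_2] + ij$.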


Remark The ADHM data $(B_1, B_2, i, j)$ are said to
be stable if $\beta$ is surjective for every $(z_1, z_2) \in \mathbb{R}^4$, and
costable if $\alpha$ is injective for every
$(z_1, z_2) \in \mathbb{R}^4$. $(B_1, B_2, i, j)$ is regular if it is both stable
and costable. The quotient:


$\{\text{regular solutions of [10] and [11]}\}/U(V)$


coincides with the moduli space of instantons
of rank $r = \dim W$ and charge $c = \dim V$ on $\mathbb{R}^4$ (see
below). It is also an example of a quiver variety (see
Finite Dimensional Algebras and Quivers), associated
to the quiver consisting of two vertices $V$ and
$W$, two loop-edges on the vertex $V$ and two edges
linking $V$ to $W$, one in each direction.


Dimensional Reductions of the 
Anti-Self-Dual Yang-Mills Equation 


As pointed out above, a connection on a Hermitian
vector bundle $E \to \mathbb{R}^4$ of rank $r$ can be regarded as a
1-form


$A = \sum_{k=1}^{4} A_k(x_1, \ldots, x_4)\,dx^k, \quad A_k: \mathbb{R}^4 \to \mathfrak{u}(r)$


Assuming that the connection components $A_k$ are
invariant under translation in one direction, say $x_4$,
we can think of


$A = \sum_{k=1}^{3} A_k(x_1, x_2, x_3)\,dx^k$


as a connection on a Hermitian vector bundle over
$\mathbb{R}^3$, with the fourth component $\phi = A_4$ being
regarded as a bundle endomorphism $\phi: E \to E$,
called a Higgs field. In this way, the anti-self-duality
equation [3] reduces to the so-called Bogomolny (or
monopole) equation:


$F_A = *\,d_A\phi$ [14]


where $*$ is the Euclidean Hodge star in dimension 3 and
$d_A\phi$ is the covariant derivative of $\phi$.
Now assume that the connection components $A_k$
are invariant under translation in two directions, say
$x_3$ and $x_4$. Consider


$A = \sum_{k=1}^{2} A_k(x_1, x_2)\,dx^k$

as a connection on a Hermitian vector bundle over
$\mathbb{R}^2$, with the third and fourth components combined
into a complex bundle endomorphism:


$\Phi = (A_3 + i \cdot A_4)(dx_1 - i \cdot dx_2)$


taking values on 1-forms. The anti-self-duality
equation [3] is then reduced to the so-called
Hitchin's equations:


$F_A = [\Phi, \Phi^*], \quad \bar{\partial}_A\Phi = 0$ [15]


Conformal invariance of the anti-self-duality equation
means that Hitchin's equations are well defined
over any Riemann surface.

Finally, assume that the connection components $A_k$
are invariant under translation in three directions, say
$x_2$, $x_3$, and $x_4$. After gauging away the first component
$A_1$, the anti-self-duality equations [3] reduce to
the so-called Nahm's equations:


$\frac{dT_i}{dt} + \frac{1}{2}\sum_{j,k} \epsilon_{ijk}[T_j, T_k] = 0, \quad i, j, k \in \{2, 3, 4\}$ [16]


where each $T_k$ is regarded as a map $\mathbb{R} \to \mathfrak{u}(r)$.

Readers who are interested in monopoles and 
Nahm's equations are referred to the survey 
by Murray (2002) and references therein. The best
sources for Hitchin's equations are still Hitchin's
(1987a, b) original papers. A beautiful duality,
known as Nahm transform, relates the various 
reductions of the anti-self-duality equation to periodic 
instantons; see the survey article by Jardim (2004). 

It is also worth mentioning the book by Mason 
and Woodhouse (1996), where other interesting 
dimensional reductions of the anti-self-duality equa- 
tion are discussed, providing a deep relation 
between instantons and the general theory of 
integrable systems. 


The Instanton Moduli Space 


Now fix a rank-$r$ complex vector bundle $E$ over a
four-dimensional Riemannian manifold $X$. Observe
that the difference between any two connections is a
linear operator:


$(\nabla - \nabla')(f\sigma) = f\nabla\sigma + \sigma \cdot df - f\nabla'\sigma - \sigma \cdot df = f(\nabla - \nabla')\sigma$


In other words, any two connections on $E$ differ by
an endomorphism-valued 1-form. Therefore, the set
of all smooth connections on $E$, denoted by $\mathcal{A}(E)$,
has the structure of an affine space over
$\Gamma(\mathrm{End}(E)) \otimes \Omega^1_X$.


The gauge group $\mathcal{G}(E)$ acts on $\mathcal{A}(E)$ via
conjugation:

$g \cdot \nabla := g^{-1}\nabla g$

We can form the quotient set $\mathcal{B}(E) = \mathcal{A}(E)/\mathcal{G}(E)$,
which is the set of gauge equivalence classes of
connections on $E$.

The set of gauge equivalence classes of anti-self-dual
connections on $E$ is a subset of $\mathcal{B}(E)$, and it is
called the moduli space of instantons on $E \to X$, denoted
$\mathcal{M}_X(E)$. The subset of $\mathcal{M}_X(E)$ consisting of irreducible
anti-self-dual connections is denoted $\mathcal{M}^*_X(E)$.

Since the choice of a particular vector bundle
within its topological class is immaterial, these sets
are usually labeled by the topological invariants
(Chern or Pontryagin classes) of the bundle $E$. For
instance, $\mathcal{M}(r, k)$ denotes the moduli space of
instantons on a rank-$r$ complex vector bundle
$E \to X$ with $c_1(E) = 0$ and $c_2(E) = k > 0$. It turns
out that $\mathcal{M}_X(E)$ can be given the structure of a
Hausdorff topological space. In general, $\mathcal{M}_X(E)$ will
be singular as a differentiable manifold, but $\mathcal{M}^*_X(E)$
can always be given the structure of a smooth
Riemannian manifold.

We start by explaining the notion of an $L^2_p$ vector
bundle. Recall that $L^2_p(\mathbb{R}^n)$ denotes the completion
of the space of smooth functions $f : \mathbb{R}^n \to \mathbb{C}$ with
respect to the norm:

$$ \|f\|^2_{L^2_p} = \int_{\mathbb{R}^n} \left( |f|^2 + |df|^2 + \cdots + |d^{(p)}f|^2 \right) $$

In dimension n = 4 and for p > 2, by virtue of the
Sobolev embedding theorem, $L^2_p$ consists of continuous
functions, i.e., $L^2_p(\mathbb{R}^n) \subset C^0(\mathbb{R}^n)$. So we define
the notion of an $L^2_p$ vector bundle as a topological
vector bundle whose transition functions are in $L^2_p$,
where p > 2.

Now for a fixed $L^2_p$ vector bundle E over X, we can
consider the metric space $\mathcal{A}_p(E)$ of all connections on
E which can be represented locally on an open subset
$U \subset X$ as an $L^2_p(U)$ 1-form. In this topology, the subset
of irreducible connections $\mathcal{A}^*_p(E)$ becomes an open
dense subset of $\mathcal{A}_p(E)$. Since any topological vector
bundle admits a compatible smooth structure, we may
regard $L^2_p$ connections as those that differ from a
smooth connection by an $L^2_p$ 1-form. In other words,
$\mathcal{A}_p(E)$ becomes an affine space modeled over the
Hilbert space of $L^2_p$ 1-forms with values in the
endomorphisms of E. The curvature of a connection
in $\mathcal{A}_p(E)$ then becomes an $L^2_{p-1}$ 2-form with values in
the endomorphism bundle End(E).

Moreover, let $\mathcal{G}_{p+1}(E)$ be defined as the topological
group of all $L^2_{p+1}$ bundle automorphisms. By
virtue of the Sobolev multiplication theorem,
$\mathcal{G}_{p+1}(E)$ has the structure of an infinite-dimensional
Lie group modeled on a Hilbert space; its Lie
algebra is the space of $L^2_{p+1}$ sections of End(E).

The Sobolev multiplication theorem is once again
invoked to guarantee that the action
$\mathcal{G}_{p+1}(E) \times \mathcal{A}_p(E) \to \mathcal{A}_p(E)$ is a smooth map of Hilbert
manifolds. The quotient space $\mathcal{B}_p(E) = \mathcal{A}_p(E)/\mathcal{G}_{p+1}(E)$
inherits a topological structure; it is a metric (hence
Hausdorff) topological space. Therefore, the subspace
$\mathcal{M}_X(E)$ of $\mathcal{B}_p(E)$ is also a Hausdorff topological
space; moreover, one can show that the
topology of $\mathcal{M}_X(E)$ does not depend on p.

The quotient space $\mathcal{B}_p(E)$ fails to be a Hilbert
manifold because in general the action of $\mathcal{G}_{p+1}(E)$ on
$\mathcal{A}_p(E)$ is not free. Indeed, let A be a connection on a
rank-r complex vector bundle E over a connected
base manifold X, which is associated with a
principal G-bundle. Then the isotropy group of A
within the gauge group

$$ \Gamma_A = \{ g \in \mathcal{G}_{p+1}(E) \mid g(A) = A \} $$

is isomorphic to the centralizer of the holonomy
group of A within G.

This means that the subspace of irreducible connections
$\mathcal{A}^*_p(E)$ can be equivalently defined as the open
dense subset of $\mathcal{A}_p(E)$ consisting of those connections
whose isotropy group is minimal, that is,

$$ \mathcal{A}^*_p(E) = \{ A \in \mathcal{A}_p(E) \mid \Gamma_A = \mathrm{center}(G) \} $$

Now $\mathcal{G}_{p+1}(E)$ acts with constant isotropy on $\mathcal{A}^*_p(E)$;
hence, the quotient $\mathcal{B}^*_p(E) = \mathcal{A}^*_p(E)/\mathcal{G}_{p+1}(E)$ acquires
the structure of a smooth Hilbert manifold.


Remark The analysis of neighborhoods of points
in $\mathcal{B}_p(E) \setminus \mathcal{B}^*_p(E)$ is very relevant for applications of
the instanton moduli spaces to differential topology.
The simplest situation occurs when A is an SU(2)
connection on a rank-2 complex vector bundle E
which reduces to a pair of U(1) connections and such
that [A] occurs as an isolated point in $\mathcal{B}_p(E) \setminus \mathcal{B}^*_p(E)$.
Then a neighborhood of [A] in $\mathcal{B}_p(E)$ looks like a cone on
an infinite-dimensional complex projective space.


Alternatively, the instanton moduli space $\mathcal{M}_X(E)$
can also be described by first taking the subset of all
anti-self-dual connections and then taking the
quotient under the action of the gauge group.
More precisely, consider the map

$$ \rho : \mathcal{A}_p(E) \to L^2_{p-1}(\mathrm{End}(E) \otimes \Omega^{2,+}_X), \qquad \rho(A) = F_A^+ $$   [17]


Thus, $\rho^{-1}(0)$ is exactly the set of all anti-self-dual
connections. It is $\mathcal{G}_{p+1}(E)$-invariant, so we can take
the quotient to get

$$ \mathcal{M}_X(E) = \rho^{-1}(0)/\mathcal{G}_{p+1}(E) $$

It follows that the subspace $\mathcal{M}^*_X(E) = \mathcal{B}^*_p(E) \cap \mathcal{M}_X(E)$
has the structure of a smooth Hilbert
manifold. Index theory comes into play to show
that $\mathcal{M}^*_X(E)$ is finite dimensional. Recall that if D is
an elliptic operator on a vector bundle over a
compact manifold, then D is Fredholm (i.e., ker D
and coker D are finite dimensional) and its index

$$ \mathrm{ind}\, D = \dim \ker D - \dim \mathrm{coker}\, D $$

can be computed in terms of topological invariants,
as prescribed by the Atiyah-Singer index theorem.
The goal here is to identify the tangent space of
$\mathcal{M}^*_X(E)$ with the kernel of an elliptic operator.

It is clear that, for each $A \in \mathcal{A}_p(E)$, the tangent
space $T_A\mathcal{A}_p(E)$ is just $L^2_p(\mathrm{End}(E) \otimes \Omega^1_X)$. We define
the pairing

$$ \langle a, b \rangle = \int_X a \wedge * b $$   [18]

and it is easy to see that this pairing defines a
Riemannian metric (the so-called $L^2$-metric) on $\mathcal{A}_p(E)$.

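The derivative of the map $\rho$ in [17] at the point A
is given by

$$ d_A^+ : L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) \to L^2_{p-1}(\mathrm{End}(E) \otimes \Omega^{2,+}_X), \qquad a \mapsto (d_A a)^+ $$

so that for each $A \in \rho^{-1}(0)$ we have

$$ T_A\rho^{-1}(0) = \{ a \in L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) \mid d_A^+ a = 0 \} $$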


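Now for a gauge equivalence class $[A] \in \mathcal{B}^*_p(E)$, the
tangent space $T_{[A]}\mathcal{B}^*_p(E)$ consists of those 1-forms
which are orthogonal to the fibers of the principal
$\mathcal{G}_{p+1}(E)$ bundle $\mathcal{A}^*_p(E) \to \mathcal{B}^*_p(E)$. At a point $A \in \mathcal{A}_p(E)$,
the derivative of the action by some $g \in \mathcal{G}_{p+1}(E)$ is

$$ -d_A : L^2_{p+1}(\mathrm{End}(E)) \to L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) $$

The usual Hodge decomposition gives us an
orthogonal decomposition:

$$ L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) = \mathrm{im}\, d_A \oplus \ker d_A^* $$

which means that:

$$ T_{[A]}\mathcal{B}^*_p(E) = \{ a \in L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) \mid d_A^* a = 0 \} $$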


Thus, the pairing [18] also defines a Riemannian
metric on $\mathcal{B}^*_p(E)$. Putting these together, we conclude
that the space $T_{[A]}\mathcal{M}^*_X(E)$ tangent to $\mathcal{M}^*_X(E)$ at an
equivalence class [A] of anti-self-dual connections
can be described as follows:

$$ T_{[A]}\mathcal{M}^*_X(E) = \{ a \in L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) \mid d_A^* a = d_A^+ a = 0 \} $$   [19]

It turns out that the so-called deformation operator
$\delta_A = d_A^* \oplus d_A^+$:

$$ \delta_A : L^2_p(\mathrm{End}(E) \otimes \Omega^1_X) \to L^2_{p-1}(\mathrm{End}(E)) \oplus L^2_{p-1}(\mathrm{End}(E) \otimes \Omega^{2,+}_X) $$

is elliptic. Moreover, if A is anti-self-dual then coker
$\delta_A$ is trivial, so that $T_{[A]}\mathcal{M}^*_X(E) = \ker \delta_A$. The
dimension of the tangent space $T_{[A]}\mathcal{M}^*_X(E)$ is then
simply given by the index of the deformation
operator $\delta_A$. Using the Atiyah-Singer index theorem,
we have for SU(r) bundles with $c_2(E) = k$:

$$ \dim \mathcal{M}^*_X(E) = 4rk - (r^2 - 1)(1 - b_1(X) + b_+(X)) $$


The dimension formula for arbitrary gauge group G 
can be found in Atiyah et al. (1978). 

For example, the moduli space of SU(2) instantons 
on $\mathbb{R}^4$ of charge k is a smooth Riemannian manifold
of dimension 8k — 3. These parameters are inter- 
preted as the 5k parameters describing the positions 
and sizes of k separate instantons, plus 3(k — 1) 
parameters describing their relative SU(2) phases. 
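
As a quick check of this parameter count, the following minimal Python sketch evaluates the index formula, assuming it reads dim = 4rk − (r² − 1)(1 − b_1 + b_+) for SU(r) bundles. The function name `instanton_moduli_dim` and its argument names are illustrative choices, not notation from the text; the example takes b_1 = b_+ = 0, as for the four-sphere obtained by compactifying R^4.

```python
def instanton_moduli_dim(r: int, k: int, b1: int = 0, b_plus: int = 0) -> int:
    """Expected dimension of the moduli space of irreducible SU(r) instantons
    of charge k on a compact 4-manifold with Betti numbers b1 and b_plus.

    Assumes the index formula 4rk - (r^2 - 1)(1 - b1 + b_plus).
    """
    return 4 * r * k - (r * r - 1) * (1 - b1 + b_plus)


# SU(2) instantons with b1 = b_plus = 0: 8k - 3 parameters, split as
# 5k (positions and sizes) plus 3(k - 1) (relative SU(2) phases).
for k in range(1, 5):
    assert instanton_moduli_dim(2, k) == 8 * k - 3 == 5 * k + 3 * (k - 1)
```

For instance, `instanton_moduli_dim(2, 1)` gives the five parameters (four for the position, one for the size) of a single SU(2) instanton.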

The detailed construction of the instanton moduli 
spaces can be found in Donaldson and Kronheimer 
(1990). An alternative source is Morgan's lecture 
notes (Friedman and Morgan 1998). It is interesting
to note that $\mathcal{M}^*_X(E)$ inherits many of the geometrical
properties of the original manifold X. Most notably,
if X is a Kähler manifold, then $\mathcal{M}^*_X(E)$ is also
Kähler; if X is a hyper-Kähler manifold, then $\mathcal{M}^*_X(E)$
is also hyper-Kähler. One expects that other
geometric structures on X can also be transferred
to the instanton moduli spaces $\mathcal{M}^*_X(E)$.


See also: Characteristic Classes; Finite-Dimensional
Algebras and Quivers; Gauge Theoretic Invariants
of 4-Manifolds; Gauge Theory: Mathematical
Applications; Integrable Systems: Overview; Index
Theorems; Moduli Spaces: An Introduction; Solitons and
Other Extended Field Configurations; Twistor Theory:
Some Applications [in Integrable Systems, Complex
Geometry and String Theory].
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Introduction 


The notion of integrability plays many different rôles 
in quantum field theory (QFT). In this article we 
interpret it in a narrow sense and describe some QFTs 
that are completely integrable, in the sense that there 
are as many integrals of motion as degrees of freedom. 
Necessarily this implies, since we are talking about 
field theories, that there is an infinite number of 
conserved quantities. The existence of such a tower of 
conserved quantities of increasing Lorentz spin 
implies, via the Coleman—Mandula theorem, that the 
theories are trivial in spacetime dimensions greater 


than 2. On the other hand, in 1 + 1 dimensions there is 
a rich menagerie of such integrable quantum field 
theories (IQFTs). These theories are fascinating in their 
own right as nontrivial QFTs for which data like the 
S-matrix and spectrum can be determined exactly. We 
will describe these exact S-matrices for a series of 
seminal examples. In addition, we briefly describe the 
applications of these theories to statistical systems in 
two dimensions. 


Classical Integrable Systems and 
Field Theories 


For a field theory to be integrable it must have an 
infinite number of conserved charges. Necessarily 
these must be spacetime symmetries which extend the 
Poincaré symmetry in some way. It turns out that, due 
to a theorem of Coleman and Mandula, such 


extensions are very restrictive: they are only possible in 
1 + 1 dimensions (one dimension of space and one of 
time) apart from noninteracting theories. Below we 
describe some of the most important examples. 


Affine Toda Theories 


These theories describe the interactions of a set of
scalar fields which we write as a vector $\boldsymbol{\phi}$. The action is

$$ S = \int d^2x \left( \tfrac{1}{2}(\partial_\mu \boldsymbol{\phi})^2 - V(\boldsymbol{\phi}) \right) $$   [1]

The potential has to be very specially chosen in
order that the resulting theory is integrable. The
resulting theories are classified by affine Lie algebras.
We shall describe only the theories related to a
simply laced Lie algebra g (so of ADE type). In this
case, for the affine version of the theory,

$$ V(\boldsymbol{\phi}) = \frac{m^2}{\beta^2} \sum_{a=0}^{r} n_a\, e^{\beta \boldsymbol{\alpha}_a \cdot \boldsymbol{\phi}} $$   [2]

where $\boldsymbol{\phi}$ is an r-component vector, r = rank g, and
$\boldsymbol{\alpha}_a$, a = 1, ..., r, are a set of simple roots of g. The fact
that we are considering the affine version of the theory
means that we include the term involving the extended root
(the lowest root) $\boldsymbol{\alpha}_0 = -\sum_{a=1}^{r} n_a \boldsymbol{\alpha}_a$, which defines
the integers $n_a$ ($n_0 = 1$). If this term is absent then the
potential does not have a minimum. Such nonaffine
theories are interesting in their own right since they
include the Liouville theory, but we shall not
describe them here.

One way to expose the infinite set of conserved 
charges at the classical level is to write the equations 
of motion in Lax form. This has the form of the 
vanishing of the field strength, or zero-curvature 
condition, of an auxiliary gauge connection in g
with components $(A_x, A_t)$:


-—Ó "s : Ba, 9/2 
Ax = 6-H +57) e^ (ea + fa) 


a=0 


"mr [3] 
3a,0/2 
7 ) | e (ea — fa) 
208 pm 


A; = 0,09 -H + 


Here, $\{e_a, f_a\}$ are related to generators of g in a
Cartan-Weyl basis, via
fa = Bw. 

fo = Ea 
where z is an auxiliary variable known as the spectral


parameter and h is the Coxeter number of g. Using 
the following commutators of g, 


[Ea,,, Ea,| = nd 0450; -H 
H, Es] = Ee [5] 
Eis E a,l = 0 


€ = 2ER 


—| 
ep =g "Eas; 
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it is straightforward to verify that the zero-curvature 
condition 


$$ F_{xt} = \partial_x A_t - \partial_t A_x + [A_x, A_t] = 0 $$   [6]


is equivalent to the equations of motion which 
follow from extremizing the action [1]. 

The fact that there exists a flat connection which 
depends on an auxiliary parameter z is sufficient to 
ensure integrability. In brief, the idea is that the 
gauge connection can be “abelianized” by a gauge 
transformation: 


$$ \tilde{A}_\mu = U \partial_\mu U^{-1} + U A_\mu U^{-1} \quad \text{with} \quad [\tilde{A}_x, \tilde{A}_t] = 0 $$   [7]

Hence, $\partial_x \tilde{A}_t - \partial_t \tilde{A}_x = 0$. This can be done in two
inequivalent ways, such that the $\tilde{A}_\mu$ are polynomials in z
and $z^{-1}$, respectively. The corresponding coefficients
are then conserved currents whose integrals give
conserved charges. It can be shown that for the
Toda theories these conserved charges have Lorentz
spin given by an exponent $\{s_a\}$ of g modulo its
Coxeter number h:


$A_n$:  h = n + 1,    {1, 2, 3, ..., n}
$D_n$:  h = 2n − 2,   {1, 3, 5, ..., 2n − 3, n − 1}
$E_6$:  h = 12,       {1, 4, 5, 7, 8, 11}                    [8]
$E_7$:  h = 18,       {1, 5, 7, 9, 11, 13, 17}
$E_8$:  h = 30,       {1, 7, 11, 13, 17, 19, 23, 29}


This spectrum of conserved quantities seems to be a 
ubiquitous feature of IQFTs. These theories can be 
generalized by replacing g, or rather its (untwisted)
affinization, with any affine algebra.
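
The table of exponents can be sanity-checked with a short Python sketch. It assumes the standard values of the Coxeter numbers and exponents for the simply laced algebras (as tabulated in [8]); the helper names `exponents` and `check` are hypothetical. The checks are that there are rank-many exponents, that they are symmetric under s → h − s, and that twice their sum is r·h, the number of roots.

```python
def exponents(family: str, n: int):
    """Return (h, sorted exponents) for A_n, D_n (n >= 4) or E_n (n = 6, 7, 8)."""
    if family == "A":
        return n + 1, list(range(1, n + 1))
    if family == "D":
        # Odd numbers 1, 3, ..., 2n - 3 together with the extra exponent n - 1.
        return 2 * n - 2, sorted(list(range(1, 2 * n - 2, 2)) + [n - 1])
    if family == "E":
        table = {
            6: (12, [1, 4, 5, 7, 8, 11]),
            7: (18, [1, 5, 7, 9, 11, 13, 17]),
            8: (30, [1, 7, 11, 13, 17, 19, 23, 29]),
        }
        return table[n]
    raise ValueError("unknown family")


def check(family: str, n: int) -> bool:
    """Verify the count, the s -> h - s symmetry and the sum rule 2*sum = r*h."""
    h, exps = exponents(family, n)
    paired = sorted(h - s for s in exps) == exps
    return len(exps) == n and paired and 2 * sum(exps) == n * h
```

The symmetry s → h − s is what guarantees the pair s = 1 and s = h − 1 that appears in every case.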


The Sinh/Sine-Gordon Theory 


These theories are the simplest of the Toda theories 
described above, associated to the Lie algebra $A_1$. In
this case there is a single field and the potential has
the form

$$ V(\phi) = \frac{m^2}{2\beta^2} \left( e^{\beta\phi} + e^{-\beta\phi} \right) $$   [9]

We have rescaled the field by $1/\sqrt{2}$ relative to the
normalization in [2]. This potential defines the “sinh-Gordon
theory.” However, we can also take $\beta \to i\beta$ to
give the sine-Gordon theory with an action

$$ S = \int d^2x \left( \tfrac{1}{2}(\partial_\mu \phi)^2 + \frac{m^2}{\beta^2} \cos(\beta\phi) \right) $$   [10]


The sine-Gordon theory is a useful paradigm for 
IQFTs because it exhibits most of the features of 
more complicated examples. To start with, it illus- 
trates another important property of some integrable 
systems; namely, the existence of solitons. In the
sine-Gordon case, the minima of the potential lie at
$\phi = 2\pi n/\beta$, for an integer n, so there is a topological
kink that separates a vacuum n on the left and n + 1
on the right, as well as an antikink. The explicit
solution for the kink moving with velocity v is

$$ \phi(x,t) = \frac{4}{\beta} \arctan \exp\big( m (x\cosh\theta - t\sinh\theta - \xi) \big) $$   [11]

where $\xi$ is a constant and, since we are working in
1 + 1 dimensions, we have introduced the rapidity $\theta$,
in terms of which the velocity is

$$ v = \tanh\theta, \qquad -\infty < \theta < \infty $$   [12]

The antikink solution is simply the negative of the
above. The kinks have a mass

$$ M = \frac{8m}{\beta^2} $$   [13]
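
The kink mass can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the kink profile [11], assumes the energy density is ½(∂_x φ)² + (m²/β²)(1 − cos βφ) (the potential of the sine-Gordon action with the vacuum energy subtracted), and compares the integrated energy of the kink at rest with the assumed value M = 8m/β². The function names and the integration parameters are illustrative.

```python
import math


def kink(x: float, t: float, m: float, beta: float,
         theta: float = 0.0, xi: float = 0.0) -> float:
    """Sine-Gordon kink with rapidity theta and position offset xi."""
    arg = m * (x * math.cosh(theta) - t * math.sinh(theta) - xi)
    return 4.0 / beta * math.atan(math.exp(arg))


def static_kink_energy(m: float, beta: float,
                       half_width: float = 40.0, steps: int = 40000) -> float:
    """Trapezoidal estimate of the energy of the kink at rest.

    half_width is measured in units of 1/m, the width of the kink.
    """
    length = half_width / m
    dx = 2.0 * length / steps
    total = 0.0
    for i in range(steps + 1):
        x = -length + i * dx
        # Central finite difference for the spatial derivative.
        dphi = (kink(x + dx, 0.0, m, beta) - kink(x - dx, 0.0, m, beta)) / (2 * dx)
        phi = kink(x, 0.0, m, beta)
        density = 0.5 * dphi ** 2 + (m / beta) ** 2 * (1.0 - math.cos(beta * phi))
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * density * dx
    return total
```

With, say, m = 1.3 and β = 0.7, `static_kink_energy(m, beta)` agrees with 8m/β² to well within the discretization error, and the profile interpolates between the neighbouring vacua φ = 0 and φ = 2π/β.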


The existence of topological solitons is not a
consequence of integrability per se; for example, the
$\phi^4$ theory in 1 + 1 dimensions also has kinks.
However, in the integrable setting, the solitons have
special properties that survive in the quantum theory.
The first property is that multisoliton solutions can be
found exactly using a variety of different techniques.
They are most easily written down using the tau
function, which is related to the field via

$$ \phi = \frac{2i}{\beta} \log \frac{\tau^*}{\tau} $$   [14]

The N-soliton solution can then be written compactly as

$$ \tau = \sum_{\mu_p = 0,1} \exp\Big( \sum_{p=1}^{N} \mu_p \gamma_p + \sum_{1 \le p < q \le N} \mu_p \mu_q Y_{pq} \Big) $$   [15]

The sum is over the $2^N$ possibilities for which $\mu_p = 0$
or 1, for each p, and we have introduced

$$ \gamma_p = m (x\cosh\theta_p - t\sinh\theta_p - \xi_p) \pm \frac{i\pi}{2} $$   [16]

The rapidity of the pth soliton is $\theta_p$, and the choice
of sign corresponds to the kink and antikink,
respectively. The “interaction coefficient” is

$$ \exp Y_{pq} = \tanh^2\big( \tfrac{1}{2}(\theta_p - \theta_q) \big) $$   [17]

For example, the two-soliton solution is

$$ \tau = 1 + e^{\gamma_1} + e^{\gamma_2} + e^{\gamma_1 + \gamma_2 + Y_{12}} $$   [18]


The multisoliton solutions have a natural physical 
interpretation as the histories of a set of solitons 
which scatter off each other. To make this more 
precise, consider the two-soliton solution [18] in 
more detail. Suppose that $\xi_1 < \xi_2$ and $v_1 > v_2$. Focus on
the solution in the vicinity of the first soliton, that is,
$x \approx v_1 t + \xi_1$. With $\gamma_p$ and $Y_{12}$ as in [16] and [17],
in the limit $t \to -\infty$ the solution is approximately

$$ \tau \approx 1 + e^{\gamma_1} $$   [19]

while, as $t \to \infty$, it is approximately

$$ \tau \approx e^{\gamma_2} \left( 1 + e^{\gamma_1 + Y_{12}} \right) $$   [20]

In both limits, the solution represents an isolated
soliton; the only difference is that the final “position
offset” has been displaced: $\xi_1 \to \xi_1 - Y_{12}/m$. It is a
consequence of integrability that the solitons interact
in such a simple way. There were two solitons in
the initial configuration and two in the final
configuration traveling with the same velocities.
The only effect is to introduce a time delay of

$$ \Delta t = \frac{Y(\theta)}{m \sinh(\theta/2)} $$   [21]

in the center-of-mass frame with $\theta_1 = -\theta_2 = \theta/2$,
which we illustrate in Figure 1. We shall see that this
kind of simple scattering is a characteristic feature of
integrable field theories which extends to the
quantum theory. It reflects the enormous restriction
that the existence of the infinite set of integrals of
motion puts on the dynamics.

Figure 1 Classical scattering of a kink and an antikink. The
final velocities equal the initial velocities and the only effect is to
introduce a velocity-dependent time delay as shown.


Integrability at the Quantum Level 


In this section we turn to the particular implications 
of integrability for the field theories at the quantum 
level. In discussing theories in 1 + 1 dimensions it is
convenient, as in [12], to use the rapidity. The
energy and momentum of a particle of mass m are
$E = m\cosh\theta$ and $p = m\sinh\theta$, respectively.

The sinh- and sine-Gordon theory, and their affine 
Toda generalizations, are scalar field theories with a 
well-behaved potential and as such they can be 
quantized in the conventional manner. It can be 
shown that integrability survives quantization and we 
now address its consequences. The key observation is 
that having an infinite set of higher-spin conserved 
quantities is very restrictive on the possible quantum 
processes. Assuming that the theory has a mass gap,
the asymptotic states $|a, \theta\rangle$ are particles with rapidity
$\theta$; additional quantum numbers needed to specify
the state are indicated by the label a. These states are
eigenstates of the conserved charges,

$$ Q_s |a, \theta\rangle = q_s(a)\, e^{s\theta} |a, \theta\rangle $$   [22]

Here, s is the spin of the charge, which ranges over
some infinite subset of the integers. Since the charges
must commute with the S-matrix, it follows immediately
that if an incoming state of n particles has a
set of rapidities $(\theta_1, \ldots, \theta_n)$ then the outgoing state
must also have n particles with the same set
$(\theta_1, \ldots, \theta_n)$: there is consequently no particle creation!
For example, we have illustrated the scattering
of two particles in Figure 2. The two-particle
S-matrix element will be denoted as

$$ S_{ab}^{cd}(\theta_1 - \theta_2) : \ |a, \theta_1; b, \theta_2\rangle \to |c, \theta_2; d, \theta_1\rangle $$   [23]

Note that masses of the incoming particles must match
the outgoing ones: $m_a = m_d$ and $m_b = m_c$. We have
already seen this kind of behavior with the classical
scattering of solitons in the sine-Gordon theory. In
spite of the fact that the scattering is purely elastic, it
can be nontrivial for two reasons: if there are mass
degeneracies in the theory, the quantum numbers
$\{a_1, \ldots, a_n\}$ can change and, in addition, the S-matrix
element can depend nontrivially on the momenta.
The fact that the incoming and outgoing states 
have the same set of momenta leads to the notion of 
factorizability. To see what this means, consider the 
case of three particles. Let us imagine that we 
prepare the initial state to consist of three fairly 
narrow wave packets in position space with 
momenta smeared in accordance with the uncer- 
tainty principle. The key to the following argument 
is the fact that the infinite set of higher-spin
conserved charges (which commute with the S-matrix)
allows one to move the positions of the three
particles relative to each other in an arbitrary way. 
In addition, the theory has a mass gap, so interac- 
tions have a finite range. By using this freedom, we 
can arrange for particles 1 and 2 to interact first,
well before they come within interaction range of
the third. Subsequently, the first two particles
interact with the third as on the right-hand side of
Figure 3. This ability to move the wave packets
around using the symmetries means that the
three-particle S-matrix element must “factorize” into a
product of three two-particle elements:

$$ S_{abc}^{def}(\theta_1, \theta_2, \theta_3) = \sum_{ghi} S_{ab}^{gh}(\theta_1 - \theta_2)\, S_{hc}^{if}(\theta_1 - \theta_3)\, S_{gi}^{de}(\theta_2 - \theta_3) $$   [24]

Figure 2 The two-particle S-matrix with particles a and b in the
initial state and c and d in the final state. For consistency,
$m_a = m_d$ and $m_b = m_c$.

Figure 3 The scattering of three particles can factorize in two
distinct ways as illustrated, leading to a nontrivial condition: the
Yang-Baxter equation.


However, we could also use the symmetries afforded 
by the conserved charges to shift the positions of the 
particles so that particle 2 and 3 interact first, as on 
the left-hand side of Figure 3. Since the charges 
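commute with the S-matrix, the result must be the
same; hence, there is a nontrivial consistency
condition:

$$ \sum_{ghi} S_{bc}^{gh}(\theta_2 - \theta_3)\, S_{ag}^{di}(\theta_1 - \theta_3)\, S_{ih}^{ef}(\theta_1 - \theta_2) = \sum_{ghi} S_{ab}^{gh}(\theta_1 - \theta_2)\, S_{hc}^{if}(\theta_1 - \theta_3)\, S_{gi}^{de}(\theta_2 - \theta_3) $$   [25]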

This is the celebrated Yang—Baxter equation. Notice 
that it is only nontrivial if there are mass degen- 
eracies, otherwise the particles on internal lines are 
determined by the external particles. 

The factorization of the S-matrix extends readily to 
the case of more particles in an obvious way. An 
n-body element factorizes into a two-body element 
for each pair of particles. One might think that 
considerations of the n-particle S-matrix would lead
to additional constraints; however, it can readily be
shown that this is not the case and that the Yang-
Baxter equation acts as a basic “move” which allows
one to reorder the n-particle S-matrix into an
arbitrary order. Further conditions on the S-matrix 
come from the axioms of analytic S-matrix theory: 


(i) Unitarity

$$ \sum_{ef} S_{ab}^{ef}(\theta)\, S_{ef}^{cd}(-\theta) = \delta_{ac}\delta_{bd} $$   [26]
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(ii) Crossing symmetry Each particle a has an 


antiparticle $\bar{a}$ and

$$ S_{ab}^{cd}(\theta) = S_{b\bar{d}}^{\bar{a}c}(i\pi - \theta) $$   [27]

(iii) Analyticity The S-matrix is a meromorphic
function of $\theta$ on the physical strip, $0 < \mathrm{Im}\,\theta < \pi$.
Singularities in most instances occur along the
imaginary axis and the simple poles correspond to
direct or cross-channel resonances. In this case, if
$S_{ab}^{cd}(\theta)$ has a simple pole at $\theta = i u_{ab}^{c}$ (necessarily a
nonphysical rapidity difference) in the direct channel,
there exists a bound state of a and b of mass

$$ m_c^2 = m_a^2 + m_b^2 + 2 m_a m_b \cos u_{ab}^{c} $$   [28]


The situation is illustrated in Figure 4. The new 
particle must itself be included in the particle spectrum. 
The S-matrix elements at the pole have the form


-Erat AL ITE [29] 


where P^, can be thought of as a kind of projection 
operator with 


S 


2. P uit. zl [3 0] 


Unitarity of the QFT requires that 7^, is real and 
positive, although there are also examples of
nonunitary theories with exact S-matrices. If
ab → c can occur then so can ac → b and bc → a.
From [28], we deduce the following identity:

$$ u_{ab}^{c} + u_{bc}^{a} + u_{ca}^{b} = 2\pi $$   [31]

The data $\{u_{ab}^{c}\}$ for any given scattering theory are
known as the fusing angles.

(iv) The Bootstrap equations These give a nonlinear
relation between S-matrix elements. The basic
idea is that if particle c appears as a resonance in the 
scattering of a and b then the S-matrix element of c 
with another state d can be deduced in terms of the 
scattering of d with a and b. This is illustrated in 
Figure 5. Using [30], we can write the resulting 
equation for the S-matrix element of c and d directly: 


SNC (0 


stes), (2 
ghi 


Figure 4 Near a direct channel pole, the scattering of a and b 
is dominated by the bound state c. 


Figure 5 The bootstrap equations result from considering the 
interaction of a particle d with the bound state c of a and b in two 
distinct ways as illustrated. 


The bootstrap constraints are very powerful because 
they allow one to extract the S-matrix elements of new
particles that appear as bound states. This leads to the
philosophy of the “bootstrap program” where one
attempts to build consistent S-matrices starting from
the S-matrix for a subset of particles which act as a 
seed for the algorithm. The process is quite an art, but 
at the end one has to be satisfied that the complete 
analytic structure is consistent with all the axioms. The 
key is to be able to account for all the poles in a 
consistent way, either in terms of bound states, as 
above, or in terms of the Coleman-Thun mechanism. 
This allows some poles to be interpreted in ways other 
than the existence of a bound state. The bootstrap 
algorithm is very complicated in general and at the 
present time a complete classification of solutions is 
not known. However, there are a large number of 
known solutions which appear to be intimately related 
to Lie algebras and associated structures known as 
Yangians and quantum groups. Below we describe 
some of the simplest known solutions. 


Minimal S-Matrices 


These scattering theories are in some sense the 
simplest. The particle spectrum is generally non- 
degenerate and so the Yang-Baxter equation is 
trivial. As is ubiquitous in the subject of IQFT, the 
classification of the theories is related to Lie 
algebras, although what seems to be important is 
not so much the algebra in question but rather the 
details of the associated root system. In this case the 
appropriate algebras are the simply laced algebras of 
ADE type. The number of particles is equal to the 
rank r of the Lie algebra and the masses are given by 
the r elements of one of the eigenvectors of the 
Cartan matrix of the algebra g: 


$$ \sum_{b=1}^{r} C_{ab}\, m_b = \left( 2 - 2\cos\frac{\pi}{h} \right) m_a $$   [33]

where h is the Coxeter number of g. The conserved
charges have spins corresponding to the exponents
of g modulo h. We briefly explain how the complete
S-matrix can be written down in terms of properties
of the root system of g. Let $\Phi$ be the set of roots of g,
and $\alpha_a$, a = 1, ..., r, a set of simple roots, as in the
last section. In terms of these, $C_{ab} = 2\alpha_a \cdot \alpha_b / \alpha_b^2$. Let
$\omega_a$, a = 1, ..., r, be a corresponding set of fundamental
weights, $\omega_a \cdot \alpha_b = \delta_{ab}$.
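
The mass relation [33] is easy to test in the simplest family. The following Python sketch assumes the algebra A_r, whose Cartan matrix is tridiagonal and whose Coxeter number is h = r + 1, and assumes masses proportional to sin(πa/h); the function names are illustrative. It returns the largest violation of the eigenvector equation.

```python
import math


def cartan_matrix_A(r: int):
    """Cartan matrix of A_r: 2 on the diagonal, -1 between neighbouring nodes."""
    return [[2 if a == b else -1 if abs(a - b) == 1 else 0 for b in range(r)]
            for a in range(r)]


def mass_relation_residual(r: int) -> float:
    """Largest violation of sum_b C_ab m_b = (2 - 2 cos(pi/h)) m_a for A_r."""
    h = r + 1  # Coxeter number of A_r
    masses = [math.sin(math.pi * a / h) for a in range(1, r + 1)]
    eigenvalue = 2.0 - 2.0 * math.cos(math.pi / h)
    C = cartan_matrix_A(r)
    return max(
        abs(sum(C[a][b] * masses[b] for b in range(r)) - eigenvalue * masses[a])
        for a in range(r)
    )
```

The residual vanishes to rounding error because 2 sin(x) − sin(x − y) − sin(x + y) = 2 sin(x)(1 − cos y), with the boundary terms sin(0) and sin(π) dropping out.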

Key to defining the theories is the notion of the
Weyl group of g, the group generated by reflections
in the simple roots:

$$ R_a(\alpha) = \alpha - 2\, \frac{\alpha_a \cdot \alpha}{\alpha_a^2}\, \alpha_a $$   [34]

The element $w = R_1 R_2 \cdots R_r$ is known as a Coxeter
element of the Weyl group, and it has special
properties that are significant in the present context.
In particular, its eigenvalues are of the form
$\exp(2\pi i s_a/h)$, where h is the Coxeter number of g
and the integers $s_a$ are the exponents of the algebra
as in [8]. Note that there is always a pair with $s_1 = 1$
and $s_r = h - 1$. Clearly, w acts as a rotation in the
two-dimensional space spanned by the two corresponding
eigenvectors. We can define an antisymmetric
function $u(\alpha, \beta)$ on roots to be $h/\pi$ times the
(signed) angle between the projections of $\alpha$ and $\beta$
onto this two-dimensional eigenspace. In preparation
for what follows, it is useful to also define the
roots

$$ \phi_a = R_r R_{r-1} \cdots R_{a+1}(\alpha_a) $$   [35]


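We can now present P Dorey's amazingly compact
formula for the complete S-matrix. For the scattering
of particle a with particle b,

$$ S_{ab}(\theta) = \prod_{\beta \in \Gamma_a} \{ 1 + u(\phi_b, \beta) \}^{\omega_b \cdot \beta} $$   [36]

In this formula $\Gamma_a$ is the set of positive roots of g
which lie in the orbit of $\alpha_a$ under w. We have also
defined the building block

$$ \{x\} = (x+1)(x-1), \qquad (x) = \frac{\sinh\left( \frac{\theta}{2} + \frac{i\pi x}{2h} \right)}{\sinh\left( \frac{\theta}{2} - \frac{i\pi x}{2h} \right)} $$   [37]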

The fusing rules are also particularly elegant in
the language of root systems. There is a three-point
coupling between $a_i$, i = 1, 2, 3, if there exist three
roots $\alpha^{(i)} \in \Gamma_{a_i}$ such that $\alpha^{(1)} + \alpha^{(2)} + \alpha^{(3)} = 0$.
Furthermore, the fusing occurs in the $a_1 a_2$ channel
at rapidity difference


aya? 


7 I T uļa™, a 38] 


This is Dorey's fusing rule.

For the case of $A_{n-1}$, the S-matrices are particularly
simple. The mass spectrum is

$$ m_a = m \sin\frac{\pi a}{n}, \qquad a = 1, \ldots, n-1 $$   [39]

and Dorey's rule gives the possible fusings as
ab → (a + b) mod n, which occur at the rapidity
values

$$ u_{ab}^{a+b} = \begin{cases} \dfrac{(a+b)\pi}{n} & a + b < n \\[1ex] \dfrac{(2n - a - b)\pi}{n} & a + b > n \end{cases} $$   [40]

The charge conjugation operator maps $a \to \bar{a} = n - a$
and the explicit form for the S-matrix elements is

$$ S_{ab}(\theta) = \{a+b-1\}\{a+b-3\} \cdots \{|a-b|+1\} $$   [41]

The element $S_{ab}(\theta)$ has one direct channel pole at
$\theta = i u_{ab}^{a+b}$ corresponding to the exchange of the
particle a + b mod n, and a cross-channel pole at
$\theta = i\pi - i u_{a\bar{b}}^{a-b}$ corresponding to the exchange of particle
a − b mod n.
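
These statements can be checked numerically. The Python sketch below assumes the elementary block (x) = sinh(θ/2 + iπx/2h)/sinh(θ/2 − iπx/2h) with {x} = (x + 1)(x − 1) and h = n for A_{n−1}, the product form [41], and the mass spectrum [39]; all function names are illustrative. It provides the S-matrix element and the fusing angle obtained by solving the bound-state mass formula [28], so that unitarity S_ab(θ)S_ab(−θ) = 1, crossing S_ab(iπ − θ) = S_{a,n−b}(θ), and the two-case formula for the fusing angles can be tested.

```python
import cmath
import math


def block(x: int, theta: complex, h: int) -> complex:
    """Elementary block (x) for Coxeter number h."""
    shift = 1j * math.pi * x / (2 * h)
    return cmath.sinh(theta / 2 + shift) / cmath.sinh(theta / 2 - shift)


def brace(x: int, theta: complex, h: int) -> complex:
    """Minimal building block {x} = (x + 1)(x - 1)."""
    return block(x + 1, theta, h) * block(x - 1, theta, h)


def s_matrix(a: int, b: int, theta: complex, n: int) -> complex:
    """S_ab for A_{n-1}: product of {x}, x = |a-b|+1, |a-b|+3, ..., a+b-1."""
    result = 1.0 + 0.0j
    for x in range(abs(a - b) + 1, a + b, 2):
        result *= brace(x, theta, n)
    return result


def fusing_angle(a: int, b: int, n: int) -> float:
    """Angle u for ab -> (a + b) mod n, solved from
    m_c^2 = m_a^2 + m_b^2 + 2 m_a m_b cos(u) with m_a = sin(pi a / n)."""
    ma = math.sin(math.pi * a / n)
    mb = math.sin(math.pi * b / n)
    mc = math.sin(math.pi * ((a + b) % n) / n)
    return math.acos((mc ** 2 - ma ** 2 - mb ** 2) / (2 * ma * mb))
```

For example, with n = 5 and a real rapidity such as θ = 0.37, `s_matrix(a, b, theta, 5) * s_matrix(a, b, -theta, 5)` equals 1 up to rounding for every pair of particles, and `fusing_angle(1, 2, 5)` reproduces 3π/5.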


Affine Toda Theories 


The bootstrap program has been solved for all the 
affine Toda theories. For the simply laced theories 
described earlier, the result is directly related to 
the minimal S-matrices constructed above. The 
only difference is that there are additional factors 
which depend on the coupling $\beta$ of the Toda
theory but which do not introduce any additional
poles onto the physical strip. These CDD factors
are included by simply changing the basic building
block [37]:

$$ \{x\} \to \{x\}_{\mathrm{Toda}} = \frac{(x+1)(x-1)}{(x-1+B)(x+1-B)} $$   [42]

where

$$ B = \frac{1}{2\pi}\, \frac{\beta^2}{1 + \beta^2/4\pi} $$   [43]

The S-matrix structure for the Toda theories
based on the nonsimply laced algebras is a good
deal more complicated. Integrability is only maintained
in the quantum theory if the ratios of the
physical masses of the particles depend on the
coupling constant $\beta$ in some very special way.


The Sine-Gordon Theory 


We have seen that the sine-Gordon theory has
solitons at the classical level. At the quantum level,
we expect that these kinks become bona fide particle
states, in addition to the particle corresponding to
the quantum fluctuations of the field $\phi$. Focusing on
the solitons, we expect a degenerate doublet
corresponding to the kink and antikink. For the
scattering of two solitons, there are six allowed
processes illustrated in Figure 6. Unitarity [26] leads
to the constraints

$$ S(\theta) S(-\theta) = 1 $$
$$ S_T(\theta) S_T(-\theta) + S_R(\theta) S_R(-\theta) = 1 $$   [44]
$$ S_T(\theta) S_R(-\theta) + S_R(\theta) S_T(-\theta) = 0 $$

while crossing symmetry [27] (using the fact that the
soliton and antisoliton are antiparticles) gives

$$ S(i\pi - \theta) = S_T(\theta), \qquad S_R(i\pi - \theta) = S_R(\theta) $$   [45]

Figure 6 Soliton scattering processes. s and s′ are the kink
and antikink, respectively, or vice versa.

By themselves, these constraints are rather mild;
however, the complete soliton S-matrix must also
satisfy the Yang-Baxter equation [25]. The solution
to all the constraints is not unique; however,
the Zamolodchikovs conjectured that the exact
answer is


s(6) = L sinh (= ig 9) U(6) 
$:(0) = Z sinh (= ) U(6) (46 


Sa(0) = - sin (=) U(8) 


S,(0) = 


X 
E] 
/-1 sin (C 32 


[47] 


r(1 + (2n'— DT i) 
7 F 


where $\gamma = \beta^2 (1 - \beta^2/8\pi)^{-1}$. The reason for confidence
in the conjecture is that from the soliton
S-matrix one can complete the bootstrap program
and account for all the poles in terms of particles in
the theory. In particular, there is a finite set of
bound states of the soliton and antisoliton, called
breathers, with masses

$$ m_k = 2M \sin\frac{k\gamma}{16}, \qquad k = 1, 2, \ldots < \frac{8\pi}{\gamma} $$   [48]

Here, M is the soliton mass. The bootstrap
equations give the S-matrix for the scattering of a


sinh 6 + icos = 
io) = ————— 


ky 
h — i cos — 
sin icos-- 


us k — 2j a .0 
= TU) 
— BAS 
3 


sinh? 9+ isin (5 Y) sinh @ + isin (554) 
sinh? 8 — isin (555) sinh — isin (54) 
k—1—2j 

|-1 Sin (ELE 


—-1-2j 


n. sf RET 2 6 
"y 2 COS — 32 ey E 


H in? k—2j T_e ii 
"oa" a2 
while, for the scattering of breather k with /, 
16 
+ 5) cos? (och + i5) 
2 32 2 50) 


where we assume, without loss of generality, that
k > l. The remarkable thing is that the scattering of
the lowest-mass breather $m_1$ with itself,

$$ S_{11}(\theta) = \frac{\sinh\theta + i\sin(\gamma/8)}{\sinh\theta - i\sin(\gamma/8)} $$   [51]

is precisely the Toda S-matrix for $A_1$ with $\beta \to i\beta/\sqrt{2}$
(the origin of the factor of $\sqrt{2}$ is mentioned after eqn
[9]). This uniquely identifies the lowest-mass breather
as being the quantum of the $\phi$ field.
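
The breather spectrum and the lowest breather S-matrix lend themselves to a small consistency check. The Python sketch below assumes γ = β²/(1 − β²/8π), the masses m_k = 2M sin(kγ/16) for k < 8π/γ, and the element [51] in the form (sinh θ + i sin(γ/8))/(sinh θ − i sin(γ/8)); the function names are illustrative. The pole of [51] at θ = iγ/8, inserted as a fusing angle into the bound-state mass formula [28], should reproduce the mass of the second breather.

```python
import cmath
import math


def gamma_coupling(beta: float) -> float:
    """gamma = beta^2 / (1 - beta^2 / (8 pi)); requires beta^2 < 8 pi."""
    return beta ** 2 / (1.0 - beta ** 2 / (8.0 * math.pi))


def breather_masses(beta: float, soliton_mass: float = 1.0):
    """Masses m_k = 2 M sin(k gamma / 16) for k = 1, 2, ... < 8 pi / gamma."""
    g = gamma_coupling(beta)
    masses = []
    k = 1
    while k < 8.0 * math.pi / g:
        masses.append(2.0 * soliton_mass * math.sin(k * g / 16.0))
        k += 1
    return masses


def s11(theta: complex, beta: float) -> complex:
    """Scattering of the lightest breather with itself."""
    s = math.sin(gamma_coupling(beta) / 8.0)
    return (cmath.sinh(theta) + 1j * s) / (cmath.sinh(theta) - 1j * s)
```

For real rapidity `s11` is a pure phase, and with u = γ/8 one finds m_1 √(2(1 + cos u)) = 2 m_1 cos(γ/16) = 2M sin(γ/8), which is m_2.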

The quantum structure that we have described 
above can be directly related to the classical 
scattering of solitons. In order to implement the 
classical limit, we can reintroduce $\hbar$, which is
achieved by replacing $\beta^2$ by $\beta^2\hbar$. In this limit, the
S-matrix elements have the form

$$ S(\theta) = \exp\left[ \frac{i}{\hbar} \big( \delta(\theta) + O(\hbar) \big) \right] $$   [52]

The phase $\delta(\theta)$ is related via the WKB approximation
to the time delay in the classical theory of
soliton scattering via

$$ \delta(\theta) = \mathrm{const.} + \int_0^{\theta} d\theta'\, M \sinh(\theta'/2)\, \Delta t(\theta') $$   [53]

where $\Delta t(\theta)$ is the time delay in the center of mass
[21]. It is possible to verify [53] for the processes
$S(\theta)$ and $S_T(\theta)$. Note that the reflection process has
no classical analogue.


IQFT, Conformal Field Theories and 
Statistical Systems 


We have described some IQFTs and their factoriz- 
able S-matrices in theories with a mass gap. We can 
ask the question, “what: happens at very high 
energies compared with all the mass scales?" For a 
generic QFT such a limit may not exist, however, 
for a special class of theories the limit is a massless 
scale-invariant theory corresponding to a fixed point 
of the renormalization group. The massive theory 
can be thought of as a deformation of the massless 
theory by a particular relevant operator. At the fixed 
point, the Poincaré symmetry is enhanced to the full 
conformal group in the appropriate number of 
dimensions and the resulting theory is known as a 
conformal field theory (CFT). In 1+1 dimensions
the conformal group is infinite dimensional and so 
many CFTs are themselves integrable, in the sense 
that the complete spectrum of fields is known and 
their correlation functions can be constructed. 
Hence, an alternative way of thinking about many
IQFTs is as a perturbation of a CFT by a specific
relevant operator:


S_{\mathrm{IQFT}} = S_{\mathrm{CFT}} + g \int d^2x\, O(x) \qquad [54]


We will suppose that the operator has conformal
dimensions $(\Delta, \Delta)$. This description of the theory
can be turned around to ask the following question: 
which relevant deformations of a given CFT lead to 
IQFTs? Remarkably, since CFTs are so well under- 
stood, the question can often be answered exactly. 
The idea is that the conserved quantities of a CFT
are all (anti-)holomorphic with respect to a holomorphic
coordinate $z = x + it$. Conserved quantities
include the stress tensor of spin 2 but include, in
addition, an infinite tower of currents of ever
increasing spin $\{T_s\}$. After perturbation, one has


\bar{\partial} T_s = g R^{(1)} + \cdots + g^n R^{(n)} + \cdots \qquad [55]


The conformal dimensions of the $R^{(n)}$ are $(s - n(1 - \Delta),\ 1 - n(1 - \Delta))$. Since the conformal dimensions of
fields in a CFT are bounded below by zero, it follows
that the series on the right-hand side truncates. The
question of whether $T_s$ remains conserved away from
the CFT boils down to the question as to whether the
right-hand side has the form $\partial\Theta$ for some $\Theta$.
Zamolodchikov found an ingenious counting argu- 
ment which showed in certain circumstances that the 
right-hand side has precisely this form for some s > 2. 
This is sufficient to establish that the perturbed theory 
is an IQFT. In certain cases the spectrum of spins of 
the conserved quantities that are established by the 
counting argument is enough to make a connection 
with a known factorizable S-matrix. 

This way of viewing IQFTs as perturbations of CFTs
is especially fruitful when we make the connection 
of the Euclidean QFT with the classical statistical 
mechanics of a two-dimensional system. In this 
connection, the Feynman path integral is reinterpreted 
as the sum over the configurations in the canonical 
ensemble with the Euclidean action interpreted as the 
energy. Usually, we consider statistical systems which 
are discrete, so typically defined on a lattice. The 
Euclidean QFTs are to be thought of as these statistical 
systems in the continuum limit where the lattice spacing 
is taken to zero keeping the long-range physics fixed. 
CFTs which have no massive degrees of freedom are 
identified with points of second-order phase transitions 
in the statistical system where correlation lengths are 
infinite. Perturbations of CFTs by relevant operators 
correspond to taking the statistical system away from 
criticality by changing some external parameter. 

The prototypical example of such a statistical
system is the Ising model. In the lattice version of
this model, there is a set of spins $\{\sigma_i\}$, one at each lattice
site, each of which can take the discrete values $\pm 1$. The
partition function of the theory is


Z(H, T) = \sum_{\{\sigma_i\}} \exp\left(-T^{-1} \sum_{\langle i,j \rangle} \sigma_i \sigma_j - H \sum_i \sigma_i\right) \qquad [56]


The Ising model is the simplest model of a ferro- 
magnet, where T is the temperature and H is the 
external applied field. The theory has a second-order
phase transition at $T = T_c$, the Curie temperature,
and $H = 0$, where the energy,
which favors aligning the spins, and the entropy, which
favors disorder, exactly balance. In the two-dimensional
neighborhood of the critical point, the lattice
theory admits a continuum limit which can be
described as the perturbation of a CFT, describing
the critical Ising model, by a pair of relevant operators
with couplings $T - T_c$ and $H$. In the case of the Ising
model, the CFT is simply the theory of a free massless 
fermion in two-dimensional Euclidean space. 

It turns out that in the two-dimensional space 
of relevant perturbations, there are two directions 
which lead to IQFTs. The most obvious is changing
the temperature away from $T_c$ while keeping $H = 0$.
This leads to a particularly simple IQFT, that of a
free massive fermion. More unexpectedly, the direction
in which $H$ varies away from 0, but $T = T_c$,
also leads to an IQFT. In this case, Zamolodchikov's
counting argument shows that there are higher-spin
conserved charges, with spins including


s = 1,7,11,13,17,19,... [57] 


This is remarkable because, as we have described 
previously, there is a minimal solution of the 
bootstrap program that describes the scattering of 
eight particles which has a spectrum of conserved 
charges that includes these spins. It is the minimal
scattering theory related to the algebra $E_8$.

The fact that the scattering theory of the off-critical
Ising model in the magnetic field direction
has been identified is remarkable. From the S-matrix
one can proceed to investigate the off-critical correlation
functions using a technique known as the
“form factor program.” Detailed simulation of the
original lattice model [56] has provided strong
support for the veracity of the $E_8$ scattering theory.
For instance, the two lightest masses in the scattering
theory determine the ratio of the two longest
correlation lengths, $m_2/m_1 = 2\cos(\pi/5)$.
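
As a small numerical aside (an illustrative sketch, not part of the original argument), the ratio $2\cos(\pi/5)$ quoted above coincides with the golden ratio $(1+\sqrt{5})/2 \approx 1.618$. The variable names below are chosen only for this illustration.

```python
import math

# Ratio of the two lightest masses quoted in the text: m_2 / m_1 = 2 cos(pi/5).
mass_ratio = 2.0 * math.cos(math.pi / 5.0)

# Closed form of 2 cos(pi/5): the golden ratio (1 + sqrt(5)) / 2.
golden_ratio = (1.0 + math.sqrt(5.0)) / 2.0

print(f"m2/m1 = {mass_ratio:.6f}")
```

The same number is therefore the predicted ratio of the two longest correlation lengths of the lattice model in the scaling region.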

In general, the identification of an IQFT and the CFT 
at its ultraviolet limit can be more difficult to establish. 
One way to proceed is to use the thermodynamic Bethe 
ansatz. This technique involves considering the ther- 
modynamics of a gas of the particles in a periodic box. 
Since the scattering is purely elastic, thermodynamic 


quantities can be calculated, albeit in terms of a set of 
coupled nonlinear integral equations. If the box is small 
enough, ultraviolet effects dominate and various 
features of the CFT can be recovered. 


Other IQFTs 


There is a rich menagerie of other IQFTs that we
have no space to discuss in detail. One class is the sigma
models, whose fields take values in a Riemannian
target space $\mathfrak{M}$ with an action


S = \int d^2x\, g_{ab}(X)\, \partial_\mu X^a\, \partial^\mu X^b \qquad [58]


where $g_{ab}\, dX^a dX^b$ is the metric of $\mathfrak{M}$. These theories
are integrable at the classical level if the target space 
is either a group manifold of a compact simple 
group G or a symmetric space coset G/H, where H 
is a suitable subgroup of G. The former are known 
as the "principal chiral models." There are two 
kinds of conserved quantities, both local and 
nonlocal. At the quantum level, the conserved 
currents which imply classical integrability can be 
subject to quantum anomalies. An analysis of these 
anomalies proves that the principal chiral models 
are all integrable at the quantum level, while only 
the subset of symmetric space coset models, namely 


SO(n+1)/SO(n), \quad SU(n)/SO(n),
SU(2n)/Sp(n), \quad SO(2n)/SO(n) \times SO(n), \qquad [59]
Sp(2n)/Sp(n) \times Sp(n)


are quantum integrable. S-matrices have been proposed 
for all these integrable sigma models. They have a more 
complicated structure than most of the cases discussed 
here, because the particles fall into representations of the 
associated Lie groups and the Yang-Baxter equation,
as for the sine-Gordon solitons, is now nontrivial.
Remarkably, gross features of the S-matrices, such as the
mass spectrum and fusing rules, are identical to those of the Toda
theories or the minimal S-matrices.

Returning to IQFTs that are associated with
deformations of CFTs, there are more general 
classes which are associated with the renormaliza- 
tion group trajectories between two nontrivial fixed 
points. These theories have both massless and 
massive degrees of freedom. Even more remarkable 
are the staircase models of Zamolodchikov that 
exhibit an infinite series of crossover behavior where 
the renormalization group trajectory passes close to 
an infinite series of fixed points in sequence. 

For all of the theories described above, one might 
have thought more generally that integrability is a 
very rigid property of a theory. In general, for 
example, the number of external coupling constants 
is very limited and the mass ratios are all fixed. For 


example, in Toda theories there is only an overall
mass scale $m$ and the coupling $\beta$. If the form of the
potential is altered in any way then integrability is 
lost. However, in certain circumstances, integrability 
appears to be a looser constraint that allows more 
flexibility. One class of such theories is known as 
the homogeneous sine-Gordon theories. These are 
integrable deformations of gauged WZW models 
associated with the coset $G/U(1)^r$, where $r$ is the
rank of a simple compact group G. In these theories 
there is a rich spectrum of both stable and unstable 
particles with masses and an S-matrix that depends 
continuously on a set of r coupling constants.


Discrete Dynamical Systems


The expression “dynamical system” usually refers to
a coupled system of ordinary differential equations
(ODEs), namely,


\dot{x}_j(t) = F_j(t, x_1, \dots, x_N), \qquad j = 1, \dots, N \qquad [1]


where $t$ belongs to some set $I$ of nonzero measure of
the real line $\mathbb{R}$, typically an interval $[a, b]$, a
half-line, or the whole line, and the $x_j$ are sufficiently
smooth functions from $I$ to $\mathbb{R}$ or to $\mathbb{C}$.

The system [1] is complemented by initial or
boundary conditions that make it into an “initial-value”
or a “boundary-value” problem. Under suitable
regularity assumptions on the RHS, the existence and
uniqueness of the solution of the initial-value problem
is guaranteed, but in most cases the solution can be
known only “approximately,” either through perturbation
theory or just through numerical integration. This
is not the proper place to discuss finite-difference
schemes for systems of ODEs: what is relevant is that
such numerical schemes (think, e.g., of Euler or
Runge-Kutta schemes) “discretize” the continuous
independent variable $t$ by replacing it by an integer
variable $n \in \mathbb{Z}$: in the simplest case, the interval $[a, b]$
is replaced by a set of $L$ equally spaced points $t_n = a + n(b - a)/L$ $(n = 1, \dots, L)$, the first derivative is
approximated by a (forward) difference, and the
system [1] is converted into a system of “difference”
equations of the form


x_j(n+1) = x_j(n) + h F_j(n, x_1(n), \dots, x_N(n)) \qquad [2]


where $h$ denotes the time step $(b - a)/L$.
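
The forward-difference scheme just described can be sketched in a few lines. The function `euler_solve` and the harmonic-oscillator right-hand side are hypothetical names introduced only for this illustration; the grid is indexed from $n = 0$ so that the initial condition sits at $t = a$.

```python
def euler_solve(F, x0, a, b, L):
    """Iterate x_j(n+1) = x_j(n) + h * F_j(t_n, x(n)) with L steps on [a, b]."""
    h = (b - a) / L                      # time step
    x = list(x0)
    trajectory = [x]
    for n in range(L):
        t_n = a + n * h                  # discrete time replaces the continuous t
        rhs = F(t_n, x)
        x = [xj + h * fj for xj, fj in zip(x, rhs)]
        trajectory.append(x)
    return trajectory


def oscillator(t, x):
    """Illustrative right-hand side: x1' = x2, x2' = -x1 (autonomous)."""
    return [x[1], -x[0]]


traj = euler_solve(oscillator, [1.0, 0.0], 0.0, 1.0, 1000)
x_end, p_end = traj[-1]
```

For this linear example each Euler step multiplies $x_1^2 + x_2^2$ by exactly $1 + h^2$, a simple reminder that a generic discretization does not inherit the conserved quantities of the continuous flow.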

The coupled system [2] is an example of a “discrete
dynamical system,” explicit (because the updated
variables only depend upon the values taken
at previous discrete times), first order (only “nearest-neighbor”
discrete times, $n$, $n + 1$, are involved), but
nonautonomous, as the RHS is allowed to depend
explicitly upon the independent variable $n$, analogously
to its continuum counterpart.

In the following, “autonomous” but not necessarily
explicit discrete dynamical systems of a special
type will be considered: in fact, we will require them
to be equipped with a Hamiltonian structure, and
we will define the notion of complete integrability
(in the Arnol’d-Liouville sense) for such systems.

This article emphasizes some aspects and
properties of integrable discrete systems, neglecting
others that could be equally important. In particular,
as no nonautonomous discrete systems will be
considered, discrete analogs of Painlevé equations
will not be discussed in this article, and consequently
the intriguing issues concerning “singularity
confinement” in the discrete setting and “algebraic entropy”
will not be touched upon (see, e.g., Grammaticos
et al. (2004)). Similarly, neither the integrability of
discrete systems in multidimensional space nor
“quantum integrable mappings” will be discussed.


Lagrangian and Hamiltonian 
Formulations 


Following the historical path along which modern 
classical mechanics has been developed, first the 
concept of a Lagrangian map is introduced, and then 
Hamiltonian (in fact, symplectic) maps are defined 
through a proper discrete version of the Legendre 
transformation. 

Let $x_j(n)$ $(j = 1, \dots, N,\ n \in \mathbb{Z})$ be $N$ sequences of
real numbers and let $\mathcal{L}(x, y)$ be a smooth function
from $\mathbb{R}^N \times \mathbb{R}^N$ into the reals, $x$ denoting the $N$-tuple
$x_1, \dots, x_N$. $\mathcal{L}$ is regarded as a “discrete Lagrange
function”: to each discrete time $n$ there
is assigned a certain value $\mathcal{L}_n := \mathcal{L}(x(n), x(n + 1))$.
The corresponding discrete action functional $S[\mathcal{L}]$ is
defined in a natural way:


S[\mathcal{L}] = \sum_{n = N_a}^{N_b - 1} \mathcal{L}_n \qquad [3]


The actual “discrete trajectory” will be given by
the sequence $x(n)$ that corresponds to a “critical
point” of the action [3] subject to the constraints
$\delta x(N_a) = \delta x(N_b) = 0$. Note that the values $N_a$ ($N_b$)
may well coincide with $-\infty$ ($+\infty$). Such
“critical points” are given by the solution of the
discrete Euler-Lagrange equations:


\frac{\partial \mathcal{L}}{\partial x_j}\Big|_{x = x(n),\ y = x(n+1)} + \frac{\partial \mathcal{L}}{\partial y_j}\Big|_{x = x(n-1),\ y = x(n)} = 0 \qquad [4]


It is worthwhile to remark on the intrinsic nature of
eqns [4], whose form turns out to be independent of
the choice of a coordinate chart. In fact, by omitting
the explicit dependence on $n$ and simply denoting
$x(n) = x$, $x(n + 1) = \tilde{x}$, $x(n - 1) = \underset{\sim}{x}$, [4] can be cast
in the form


\nabla_1 \mathcal{L}(x, \tilde{x}) + \nabla_2 \mathcal{L}(\underset{\sim}{x}, x) = 0 \qquad [5]


which makes its “implicit” nature for the updated
variable $\tilde{x}$ more transparent. Clearly, as a map from
the pair $(\underset{\sim}{x}, x)$ to the pair $(x, \tilde{x})$, it is in general a
multivalued map, or a “correspondence,” as it is
called in the literature (Suris 2003, Veselov 1991).
In order that [5] be solvable for $\tilde{x}$, the Hessian
matrix $H_{jk} = \partial^2 \mathcal{L}/\partial x_j \partial y_k$ should be nondegenerate.

As will be noted shortly, the Lagrangian map [4]
(or [5]) is in fact a canonical, or better a symplectic,
transformation on a suitably defined cotangent
bundle $T^*X$ of the configuration space $X \subseteq \mathbb{R}^N$.
Namely, one defines the conjugate momentum to $x$ as


p := \nabla_2 \mathcal{L}(\underset{\sim}{x}, x) \qquad [6]

so that [5] can be rewritten as the following system:

p = -\nabla_1 \mathcal{L}(x, \tilde{x}) \qquad [7]

\tilde{p} = \nabla_2 \mathcal{L}(x, \tilde{x}) \qquad [8]


This system defines a correspondence $(x, p) \mapsto (\tilde{x}, \tilde{p})$,
which is indeed a “symplectic” one, as it preserves
the standard symplectic form $\omega(x, p) = \sum_{j=1}^{N} dp_j \wedge dx_j$
and, of course, the associated Poisson brackets.
The simplest way to recognize this property is by 
constructing the generating function of the corre- 
sponding canonical transformation. To this end, let 
us introduce 


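S(x, \tilde{p}) := -\mathcal{L}(x, \tilde{x}) + \sum_{j=1}^{N} \tilde{p}_j (\tilde{x}_j - x_j) \qquad [9]


where $\tilde{x}$ is understood as a function of $(x, \tilde{p})$ through [8]. The discrete Euler-Lagrange equations then take the
form


\tilde{x}_j - x_j = \frac{\partial S}{\partial \tilde{p}_j} \qquad [10]

\tilde{p}_j - p_j = -\frac{\partial S}{\partial x_j} \qquad [11]


which is canonically generated by $S + \sum_j x_j \tilde{p}_j$. A
strict analog of the Hamiltonian formulation for
continuous-time Lagrangian systems does not, in fact,
exist in the discrete-time case. One of the main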
consequences, well known to the specialists but 
worth emphasizing in the present context, is that 
even a symplectic map in one degree of freedom 


(two-dimensional $T^*X$) is generically not integrable:
the existence of an invariant function $F(\tilde{x}, \tilde{p}) = F(x, p)$
is not entailed by the symplectic structure,
so that, as discussed later, integrable maps of the 
standard type are indeed exceptional. On the other 
hand, note that invariant functions do exist when- 
ever a Lagrangian has some additional symmetry: 
this is the case when a Lie group acts on the 
configuration space X and the Lagrange function is 
invariant under its induced action on X x X, so that 
a discrete version of the Noether theorem applies 
(Suris 2003). 
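
As an illustration of how a discrete Lagrangian generates a symplectic map, the following sketch takes the one-degree-of-freedom Lagrangian $\mathcal{L}(x, y) = \tfrac{1}{2}(y - x)^2 - V(x)$ with $V(x) = -K\cos x$. This kicked-rotor-type choice, the value of `K`, and all function names are assumptions made here for illustration only. The momenta are defined as $p = -\partial\mathcal{L}/\partial x$ and $\tilde{p} = \partial\mathcal{L}/\partial y$, and a finite-difference Jacobian is used to check area preservation.

```python
import math

K = 0.8  # hypothetical kick strength, chosen only for illustration


def dV(x):
    """Derivative of the illustrative potential V(x) = -K cos x."""
    return K * math.sin(x)


def lagrangian_map(x, p):
    """One step (x, p) -> (x_new, p_new) for L(x, y) = (y - x)^2 / 2 - V(x).

    p     = -dL/dx(x, x_new) = (x_new - x) + V'(x), solved for x_new;
    p_new =  dL/dy(x, x_new) =  x_new - x.
    """
    x_new = x + p - dV(x)
    p_new = x_new - x
    return x_new, p_new


def jacobian_determinant(x, p, eps=1e-6):
    """Central-difference estimate of det d(x_new, p_new) / d(x, p)."""
    xp, pp = lagrangian_map(x + eps, p)
    xm, pm = lagrangian_map(x - eps, p)
    dxdx, dpdx = (xp - xm) / (2 * eps), (pp - pm) / (2 * eps)
    xp, pp = lagrangian_map(x, p + eps)
    xm, pm = lagrangian_map(x, p - eps)
    dxdp, dpdp = (xp - xm) / (2 * eps), (pp - pm) / (2 * eps)
    return dxdx * dpdp - dxdp * dpdx


det = jacobian_determinant(0.3, -1.2)
```

In one degree of freedom, preserving $dp \wedge dx$ is equivalent to a unit Jacobian determinant, so `det` should equal 1 up to finite-difference error. The map is nevertheless not integrable for generic `K`, in line with the remark above.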


Complete Integrability 


The definition of a “completely integrable” discrete-time
system is now in order. Let $\Phi$ be a symplectic
map on the $2N$-dimensional phase space
$M := (\mathbb{R}^{2N}, dp \wedge dq)$, equipped with $N$ smooth
invariant functions $F_j$, such that


• $F_1, \dots, F_N$ are functionally independent, that is,
their gradients $\nabla F_j$ are linearly independent on $M$;
• $F_1, \dots, F_N$ are in involution:


\{F_j, F_k\} = 0, \qquad j, k = 1, \dots, N


Let $\mathcal{T}$ be a connected component of the common
level set


\{(x, p) \in M : F_k(x, p) = c_k,\ k = 1, \dots, N\}


Then $\mathcal{T}$ is diffeomorphic to $\mathbb{T}^l \times \mathbb{R}^{N-l}$, for some $0 \le l \le N$; if $\mathcal{T}$ is compact, then it is diffeomorphic to an
$N$-dimensional torus $\mathbb{T}^N$.

In the compact case, there exists an open ball $\Omega \subset \mathbb{R}^N$ such that, in $\mathcal{T} \times \Omega$, there exist new canonical
coordinates $(I_k, \theta_k)$, $k = 1, \dots, N$, with $I \in \Omega$ and $\theta \in \mathcal{T}$, the
so-called action-angle coordinates, enjoying the
following properties:


• the actions $I_k$ depend just on the $F_j$'s;
• in action-angle coordinates the map is a linear
shift on the $N$-dimensional torus:


\tilde{I}_k = \Phi(I_k) = I_k

\tilde{\theta}_k = \Phi(\theta_k) = \theta_k + \nu_k(I_1, \dots, I_N)


Hence, in action-angle variables a completely integrable
map is a canonical transformation from $(I, \theta)$ to
$(\tilde{I}\,(= I), \tilde{\theta})$, whose generating function $W$ only
depends on the action variables. It takes the form


\tilde{I}_k - I_k = 0 \qquad [12]

\tilde{\theta}_k - \theta_k = \frac{\partial W}{\partial I_k} = \frac{\partial}{\partial I_k} \int_{x}^{\tilde{x}} \sum_j dx_j\, p_j(I, x) \qquad [13]




Integrable Maps of the Standard Type 


As the simplest integrable models, first consider
some highly nontrivial examples of “standard
maps,” that is, scalar discrete second-order difference
equations of the following type (Suris 2003):


x_{n+1} - 2x_n + x_{n-1} = G(x_n; h) \qquad [14]


with $h$ a real parameter, which exhibit an invariant
function, say


J(x_{n-1}, x_n) = J(x_n, x_{n+1}) \qquad [15]


Clearly, [14] can serve as a discretization of the
Newtonian equation


\ddot{x} = f(x) \qquad [16]


if $\lim_{h \to 0} h^{-2} G(x; h)$ exists and is equal to $f(x)$.
All “standard maps” are Lagrangian, being
stationary points of the discrete action


S = \sum_{n \in \mathbb{Z}} \left( \tfrac{1}{2}(x_{n+1} - x_n)^2 - V(x_n; h) \right) \qquad [17]


with $G(x; h) = -\partial V(x; h)/\partial x$. A point in the phase
space is a pair $(x_n, p_n = x_n - x_{n-1})$, and [14] is
symplectic for $dp \wedge dx$, reading


x_{n+1} - x_n = p_{n+1} \qquad [18]


p_{n+1} - p_n = G(x_n; h) \qquad [19]


The corresponding generating function is given by
$S = V(x_n; h) + \tfrac{1}{2} p_{n+1}^2$. Integrability of [19] means
the existence of a function $F$ on $M$ such that


F(x_{n+1}, p_{n+1}) = F(x_n, p_n) \qquad [20]


where [15] and [20] are equivalent provided
$J(x - y, x) = F(x, y)$.

Suris has found three families of functions $G$ that
ensure integrability: a rational family, a trigonometric
family, and a hyperbolic family. There is no room here
to display the relevant formulas, nor to explain why,
under natural analyticity assumptions both in $h$ and $x$,
no other integrable family exists. However, it is worth
mentioning that they turn out to be integrable
discretizations of the scalar second-order differential
equations [16] for the following “force” functions $f(x)$:


f_{\mathrm{rat}}(x) = A + Bx + Cx^2 + Dx^3 \qquad [21]


f_{\mathrm{trig}}(x) = A \sin(\omega x) + B \cos(\omega x) + C \sin(2\omega x) + D \cos(2\omega x) \qquad [22]


f_{\mathrm{hyp}}(x) = A \exp(x) + B \exp(-x) + C \exp(2x) + D \exp(-2x) \qquad [23]


A curious fact is that those Newton forces that
one can “discretize” in order to get integrable maps
are exactly the external forces that one can add to
the internal two-body interactions of the Calogero-Moser
or Calogero-Sutherland models to preserve
complete integrability.
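
A concrete map of the standard type with an explicit invariant may help fix ideas. The sketch below uses the McMillan map $x_{n+1} + x_{n-1} = 2\mu x_n/(1 + x_n^2)$, a classical example that is not taken from this article. It is of the form $x_{n+1} - 2x_n + x_{n-1} = G(x_n)$ and preserves the biquadratic function $J(u, v) = u^2v^2 + u^2 + v^2 - 2\mu uv$. The parameter value and function names are illustrative assumptions.

```python
MU = 0.5  # hypothetical parameter value, chosen only for illustration


def G(x):
    """Right-hand side of x_{n+1} - 2 x_n + x_{n-1} = G(x_n) for the McMillan map."""
    return 2.0 * MU * x / (1.0 + x * x) - 2.0 * x


def invariant(x_prev, x):
    """Biquadratic invariant J(x_{n-1}, x_n) of the McMillan map."""
    return x_prev**2 * x**2 + x_prev**2 + x**2 - 2.0 * MU * x_prev * x


def iterate(x_prev, x, steps):
    """Iterate the map and record J along the orbit."""
    values = [invariant(x_prev, x)]
    for _ in range(steps):
        x_prev, x = x, 2.0 * x - x_prev + G(x)
        values.append(invariant(x_prev, x))
    return values


values = iterate(0.1, 0.25, 1000)
drift = max(abs(v - values[0]) for v in values)
```

The quantity `drift` stays at the level of rounding error, which is the numerical signature of the relation $J(x_{n-1}, x_n) = J(x_n, x_{n+1})$.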


Integrable Discrete Systems and 
the Lax Approach 


Since Lax (1968) introduced it, in a seminal paper,
for the Korteweg-de Vries (KdV) equation, the
search for a “Lax representation” has played a crucial
role in the construction of integrable systems, both
finite and infinite dimensional. In particular, the
continuous-time dynamical system [1] (assumed to
be autonomous) is said to be equipped with a Lax
representation if there exist two matrices $L$, $M$
whose entries depend upon the coordinates $x_j$,
and hence upon the time $t$, such that the time
evolution [1] can be cast in the form


\dot{L}(t) = [L(t), M(t)] \qquad [24]


Hence, the one-parameter family of matrices $L(t)$
undergoes the “isospectral” deformation


L(t) = U(t) L(0) (U(t))^{-1} \qquad [25]


$U(t)$ being the unique solution of the linear matrix
differential equation


\dot{U}(t) = M(t) U(t) \qquad [26]


with the initial condition $U(0) = I$. Then, the
existence of a Lax representation in terms of, say,
$k \times k$ matrices entails the existence of $k$ integrals of
motion, given, for instance, by the eigenvalues of
$L(t)$, or by the traces $t_j := \mathrm{tr}\,(L(t))^j$.

Some remarks are in order: 


• In the case of a Hamiltonian system, the matrices $L$,
$M$ depend, of course, on the point in the phase space.
• No guarantee exists, a priori, that the eigenvalues
of $L$, or equivalently the traces $t_j$, be “sufficiently
many” and in involution. Note, however, that in
many examples the Lax matrices $L$, $M$ depend on
an extra scalar parameter $\lambda$ (so that they are
elements of an affine or “loop” Lie algebra),
which might increase the number of integrals of
motion well beyond the dimension of the matrix.


The N-body systems of Calogero type and Toda
type are celebrated examples of integrable dynamical
systems equipped with a Lax representation.
How can this description be adapted to the
discrete-time case? The isospectral equation [25]
suggests the proper way. One has to look for two 
matrices depending on the coordinates (or on the 
phase-space variables) x (again, they can be called L, 


$M$), such that the discrete-time evolution, modeled,
for instance, by [2], can be cast in the form of a
similarity transformation:


\tilde{L} = M L M^{-1} \qquad [27]


where $L = L(x)$, $\tilde{L} = L(\tilde{x})$, and $M = M(x, \tilde{x})$. As
usual, by denoting by $n$ the discrete time (i.e., the
number of iterations), so that $x = x(n)$, $\tilde{x} = x(n + 1)$,
eqn [27] implies that a discrete version of [25]
holds:


L(n) = U(n) L(0) [U(n)]^{-1} \qquad [28]


where $U(n) := M(n) M(n - 1) \cdots M(1)$.
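
The mechanism behind [27] and [28] can be checked on a toy example: any sequence of similarity transformations leaves the traces of powers of $L$ unchanged. In the sketch below the matrices $M(n)$ are arbitrary shear matrices (with known inverses) chosen purely for illustration. They are not the Lax matrices of any particular system, and all helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    k = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(k)) for j in range(k)] for i in range(k)]


def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def shear(k, i, j, c):
    """I + c E_ij with i != j; its inverse is I - c E_ij."""
    M = identity(k)
    M[i][j] = c
    return M


def trace_power(L, j):
    """tr(L^j), a conjugation-invariant function of L."""
    P = identity(len(L))
    for _ in range(j):
        P = matmul(P, L)
    return sum(P[i][i] for i in range(len(L)))


L0 = [[2.0, 1.0, 0.0], [1.0, -1.0, 3.0], [0.5, 0.0, 1.0]]
L, U, U_inv = L0, identity(3), identity(3)
for (i, j, c) in [(0, 1, 0.7), (2, 0, -1.2), (1, 2, 0.4)]:
    M, M_inv = shear(3, i, j, c), shear(3, i, j, -c)
    L = matmul(matmul(M, L), M_inv)   # one step of eqn [27]
    U = matmul(M, U)                  # U(n) = M(n) ... M(1)
    U_inv = matmul(U_inv, M_inv)

L_from_U = matmul(matmul(U, L0), U_inv)   # right-hand side of eqn [28]
traces_initial = [trace_power(L0, j) for j in (1, 2, 3)]
traces_final = [trace_power(L, j) for j in (1, 2, 3)]
```

Here `traces_final` agrees with `traces_initial` up to rounding, and `L` coincides with `L_from_U`. What a genuine Lax representation adds is that $M$ is determined by the dynamical variables themselves.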

As in the continuous case, the existence of a 
discrete Lax representation entails the existence of 
conserved quantities (invariants of the map or of the 
correspondence) but by itself it does not say 
anything about completeness and involutivity of 
such invariants. There is, however, an approach that 
incorporates the involutivity property in the very 
construction of Lax equations, both discrete and 
continuous, namely the “R-matrix approach.” 
Indeed, from the experimental observation of a 
number of examples, both finite and infinite dimen- 
sional, one can assert that the matrix M taking part 
in the “continuous” Lax representation [24] may be 
presented in the form (Suris 2003) 


M = R(f(L)) \qquad [29]


In [29], $L$, $M$ are elements of some matrix Lie algebra
$\mathfrak{g}$, $R$ is a linear map from $\mathfrak{g}$ into itself, and $f$ is a
conjugation-covariant function, namely


f(A L A^{-1}) = A f(L) A^{-1} \qquad [30]


$A$ being an arbitrary element of the group $G$ with
Lie algebra $\mathfrak{g}$.

Polynomials in the variable $L$ with scalar coefficients
are typical examples of conjugation-covariant
functions. Moreover, in a matrix Lie algebra, one
can identify $\mathfrak{g}$ with its dual space $\mathfrak{g}^*$ through the
nondegenerate bilinear form provided by the trace:
$\langle L_1, L_2 \rangle := \mathrm{tr}(L_1 L_2)$. Then, the trace $F$ of a conjugation-covariant
function $f$ will be a typical example of
a conjugation-invariant function, and, conversely,
the gradient of a conjugation-invariant function $F$,
defined as


\langle \nabla F, X \rangle = \frac{d}{d\epsilon} F(L + \epsilon X)\Big|_{\epsilon = 0} \qquad [31]


will be a typical example of a conjugation-covariant
function. In the above setting, one can define the
following Lie-Poisson bracket on $\mathfrak{g}$:


\{F, G\}(L) := \langle L, [\nabla F, \nabla G] \rangle \qquad [32]


where $F$, $G$ are arbitrary (i.e., not necessarily
invariant) functions from $\mathfrak{g}$ into $\mathbb{C}$, so that the
Hamilton equation


\dot{L} = \{H, L\} \qquad [33]


takes the Lax form


\dot{L} = [L, \nabla H] \qquad [34]


It is immediate to check that invariant functions of
$L$ are Casimir functions of [32], so that they will not
generate any nontrivial flow.

Assume now that the linear mapping $R$, usually
called the r-matrix, introduced in [29], is such that it
defines a new Lie bracket on $\mathfrak{g}$, through the formula


[L_1, L_2]_R = \tfrac{1}{2}\left([L_1, R(L_2)] + [R(L_1), L_2]\right) \qquad [35]

and consequently a new Lie-Poisson bracket

\{F, G\}_R(L) := \langle L, [\nabla F, \nabla G]_R \rangle \qquad [36]

Then the following theorem holds:


Let $H$ be an invariant function on $\mathfrak{g}$. Then:


(i) The Hamilton equations on $\mathfrak{g}$ generated by $H$ with
respect to the Poisson bracket [36] have the Lax
form


\dot{L} = [L, R(\nabla H)] \qquad [37]


(ii) The invariants of $\mathfrak{g}$, that is, the Casimir functions
of the standard Lie-Poisson bracket [32], are in
involution for [36], so that the corresponding
flows are mutually commuting.


A particular realization of such an $R$ operator, very
important for applications, arises in the so-called
Adler-Kostant-Symes (AKS) construction (Adler
1979, Kostant 1979, Symes 1980), where the Lie
algebra $\mathfrak{g}$ admits a decomposition into two subalgebras,
$\mathfrak{g}_+$ and $\mathfrak{g}_-$, so that, as linear spaces,


\mathfrak{g} = \mathfrak{g}_+ \oplus \mathfrak{g}_- \qquad [38]


Denoting by $\pi_\pm$ the corresponding projections, the
linear mapping


R := \pi_+ - \pi_- \qquad [39]


defines a new Lie bracket on $\mathfrak{g}$, and the corresponding
Lax equations take the two equivalent forms:


\dot{L} = [L, \pi_+(f(L))] = -[L, \pi_-(f(L))] \qquad [40]


For the present purposes, it is of paramount 
importance that the AKS construction has a discrete- 
time version (Suris 2003). 

In fact, let $G$ be a Lie group with Lie algebra $\mathfrak{g}$, and let
$G_+$, $G_-$ be its subgroups having $\mathfrak{g}_+$, $\mathfrak{g}_-$ as Lie algebras.
Then, in a certain neighborhood of the identity element $I$,
any element $g$ of $G$ is uniquely factorizable as


g = \Pi_+(g)\, \Pi_-(g), \qquad \Pi_\pm(g) \in G_\pm \qquad [41]


Moreover, let $F: \mathfrak{g} \to G$ be a conjugation-covariant
function. Consider now the map


L \mapsto \tilde{L} := \Pi_+^{-1}(F(L)) \cdot L \cdot \Pi_+(F(L)) = \Pi_-(F(L)) \cdot L \cdot \Pi_-^{-1}(F(L)) \qquad [42]


and regard it as a difference equation, yielding
$\tilde{L} = L(n + 1)$ in terms of $L = L(n)$. Then, the following
properties hold:


• Whatever the function $F$, the map [42] commutes
with any continuous flow [40], mapping solutions
into solutions.

• It can be “explicitly integrated” with respect to
the discrete time $n$, yielding


L(n) = \Pi_+^{-1}(F^n(L_0)) \cdot L_0 \cdot \Pi_+(F^n(L_0)) \qquad [43]


or the equivalent expression in terms of the
complementary projection $\Pi_-$.

• It is interpolated by the continuous flow [40] with
time step $h$ if


\exp(h f(L)) = F(L) \iff f(L) = h^{-1} \log(F(L)) \qquad [44]


In other words, the discrete-time systems that one
derives through this approach are just a sequence
of pictures taken at equally spaced times of some
continuous flow pertaining to the hierarchy [40]:
so, by construction, they are Poisson maps with an
involutive family of integrals given by the conjugation-invariant
functions of $L$ (typically, $\mathrm{tr}\, L^k$).
• Provided that


F(L) = I + h f(L) + O(h^2) \qquad [45]


the map [42] serves as an integrable exact
discretization of the flow [40], sharing both its
Poisson structure and its constants of the
motion.


An Integrable Discretization of the 
Toda Lattice 


Consider a simple but illuminating example of
the above construction, showing an integrable
discretization of the “open-end Toda lattice,”
which is described (Suris 2003) by the Newtonian
equations of motion

\ddot{x}_j = \exp(x_{j+1} - x_j) - \exp(x_j - x_{j-1}), \qquad j = 1, \dots, N \qquad [46]

and can be cast into a Hamiltonian form by setting
$p_j = \dot{x}_j$, $q_j = x_j$. If, according to H Flaschka (1974),
one introduces the variables


b_j = \dot{x}_j, \qquad a_j = \exp(x_{j+1} - x_j) \qquad [47]


eqn [46] takes the form


\dot{b}_j = a_j - a_{j-1}, \qquad \dot{a}_j = a_j (b_{j+1} - b_j) \qquad [48]


and enjoys the Lax representation [24] in terms of
the $N \times N$ matrices


L(a, b) = \sum_{k=1}^{N} a_k E_{k,k+1} + \sum_{k=1}^{N} b_k E_{k,k} + \sum_{k=1}^{N} E_{k+1,k} \qquad [49]


M(a, b) = B := \sum_{k=1}^{N} b_k E_{k,k} + \sum_{k=1}^{N} E_{k+1,k} \qquad [50]


In the above formulas, $E_{j,k}$ is the matrix having 1 in the
$jk$ position and 0 elsewhere, so that, obviously,
$E_{N,N+1} = E_{N+1,N} = 0$. An inspection of [49] and [50]
shows that $A := L - B$ is just the strictly upper triangular part of
$L(a, b)$, while $B$ is its lower triangular part. The pair
$(A, B)$ constitutes the so-called LU decomposition of
$L(a, b)$. One is clearly in the AKS setting, the Lie algebra
$\mathfrak{g}$ being just the algebra of $N \times N$ matrices, and the Lie
subalgebras $\mathfrak{g}_\pm$ being the strictly upper and lower
triangular matrices. The tridiagonal matrix $L(a, b)$
belongs to a Poisson submanifold of $\mathfrak{g}$, invariant under
the flows [40], and a complete family of commuting
integrals of motion is given, for instance, by $I_k = \mathrm{tr}\, L^k$.
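
The statement that the Flaschka equations [48] are equivalent to a Lax equation can be verified entry by entry. The sketch below assumes the sign convention $\dot{L} = [L, B]$, with $B$ the lower triangular part of $L$ (diagonal included). This convention is an assumption made for the illustration, and the helper names are hypothetical. The boundary convention is $a_0 = a_N = 0$.

```python
def toda_matrices(a, b):
    """Tridiagonal L (b on the diagonal, a above it, 1 below it) and its lower part B."""
    N = len(b)
    L = [[0.0] * N for _ in range(N)]
    B = [[0.0] * N for _ in range(N)]
    for k in range(N):
        L[k][k] = B[k][k] = b[k]
        if k + 1 < N:
            L[k][k + 1] = a[k]
            L[k + 1][k] = B[k + 1][k] = 1.0
    return L, B


def commutator(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    N = len(X)
    return [[sum(X[i][m] * Y[m][j] - Y[i][m] * X[m][j] for m in range(N))
             for j in range(N)] for i in range(N)]


def flaschka_rhs(a, b):
    """Right-hand sides of eqn [48] with a_0 = a_N = 0."""
    N = len(b)
    a_ext = [0.0] + list(a) + [0.0]
    b_dot = [a_ext[k + 1] - a_ext[k] for k in range(N)]
    a_dot = [a[k] * (b[k + 1] - b[k]) for k in range(N - 1)]
    return a_dot, b_dot


a = [1.0, 0.5, 2.0]            # a_1, ..., a_{N-1} (illustrative values)
b = [0.3, -0.1, 0.2, -0.4]     # b_1, ..., b_N
L, B = toda_matrices(a, b)
C = commutator(L, B)
a_dot, b_dot = flaschka_rhs(a, b)
```

The diagonal of `C` reproduces `b_dot`, its first superdiagonal reproduces `a_dot`, and every other entry vanishes, so the tridiagonal shape of $L$ is preserved by the flow.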
Now, the elements of the group $GL_N$, realized as
the group of invertible $N \times N$ matrices, uniquely
factorize into a product of an invertible lower-triangular
matrix times an upper-triangular matrix
with units on the diagonal, and the Lie algebras of
those subgroups are just the aforementioned subalgebras
$\mathfrak{g}_\pm$. Then, one is naturally tempted to look
for an integrable discretization provided by a
conjugation-covariant function of the type [45],
starting with the simplest possible choice, namely


F(L) = I + hL

Setting

\tilde{L}(a, b) := L(\tilde{a}, \tilde{b}) = \Pi_+^{-1}(I + hL) \cdot L \cdot \Pi_+(I + hL) = \Pi_-(I + hL) \cdot L \cdot \Pi_-^{-1}(I + hL) \qquad [51]
it turns out that the matrix equation [51] is
equivalent to the map


(a, b) \mapsto (\tilde{a}, \tilde{b})


described by the following equations:

\tilde{b}_k = b_k + h\left(\frac{a_k}{\beta_k} - \frac{a_{k-1}}{\beta_{k-1}}\right), \qquad \tilde{a}_k = a_k \frac{\beta_{k+1}}{\beta_k}, \qquad k = 1, \dots, N \qquad [52]


where the $\beta_k$, which are the “field variables” entering
into the LU factorization [51], are explicitly and
uniquely defined by the recurrence relation (amounting
to a finite continued fraction)


\beta_k = 1 + h b_k - \frac{h^2 a_{k-1}}{\beta_{k-1}}

As $a_0 = 0$, the initial condition is simply $\beta_1 = 1 + h b_1$.
It follows from the general results of the previous
section that [51] is an integrable Poisson map, sharing
with the continuous Toda hierarchy both the Poisson
structure and the integrals of motion. Its initial-value
problem can be uniquely solved in terms of the LU
factorization of the group element $(I + hL_0)^n$, the
initial condition $L_0$ being any matrix pertaining to
the tridiagonal submanifold [49]. According to [44],
the interpolating Hamiltonian flow is provided by the
function $f(L) = h^{-1} \log(I + hL)$. To make contact
with the discussion in the section “Lagrangian and
Hamiltonian formulations,” we observe that, in terms
of the canonical variables $x_j$, $p_j$, the discrete Toda
lattice [51] becomes the following symplectic map:


1 + h p_j = \exp(\tilde{x}_j - x_j) + h^2 \exp(x_j - \tilde{x}_{j-1}) \qquad [53]


1 + h \tilde{p}_j = \exp(\tilde{x}_j - x_j) + h^2 \exp(x_{j+1} - \tilde{x}_j) \qquad [54]


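It can evidently be written in the discrete Newtonian
form


\exp(\tilde{x}_j - x_j) - \exp(x_j - \underset{\sim}{x}_j) = h^2 \left[\exp(\underset{\sim}{x}_{j+1} - x_j) - \exp(x_j - \tilde{x}_{j-1})\right] \qquad [55]


whose Lagrangian function is given by


\mathcal{L}(x, \tilde{x}) = \sum_{k=1}^{N} \psi(\tilde{x}_k - x_k) - h \sum_{k=1}^{N} \exp(x_{k+1} - \tilde{x}_k) \qquad [56]


with


\psi(\xi) = h^{-1}(\exp(\xi) - 1 - \xi) \qquad [57]


The variables $\beta_j$ acquire the following extremely
simple expression in the Lagrangian coordinates $x_j$, $\tilde{x}_j$:


\beta_j = \exp(\tilde{x}_j - x_j)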
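
To see the conservation laws at work, the discrete Toda map can be iterated numerically. The sketch below assumes the map in the form $\beta_k = 1 + h b_k - h^2 a_{k-1}/\beta_{k-1}$, $\tilde{b}_k = b_k + h(a_k/\beta_k - a_{k-1}/\beta_{k-1})$, $\tilde{a}_k = a_k \beta_{k+1}/\beta_k$, with $a_0 = a_N = 0$, as obtained from the LU factorization of $I + hL$. The function names, initial data, and step size are illustrative assumptions.

```python
def discrete_toda_step(a, b, h):
    """One step (a, b) -> (a_new, b_new); a holds a_1..a_{N-1}, b holds b_1..b_N."""
    N = len(b)
    beta = [1.0 + h * b[0]]                       # beta_1 = 1 + h b_1 since a_0 = 0
    for k in range(1, N):
        beta.append(1.0 + h * b[k] - h * h * a[k - 1] / beta[k - 1])
    ratio = [0.0] + [a[k] / beta[k] for k in range(N - 1)] + [0.0]   # a_k / beta_k, k = 0..N
    b_new = [b[k] + h * (ratio[k + 1] - ratio[k]) for k in range(N)]
    a_new = [a[k] * beta[k + 1] / beta[k] for k in range(N - 1)]
    return a_new, b_new


def trace_powers(a, b, kmax):
    """Integrals I_k = tr L^k, k = 1..kmax, for the tridiagonal Lax matrix L(a, b)."""
    N = len(b)
    L = [[0.0] * N for _ in range(N)]
    for k in range(N):
        L[k][k] = b[k]
        if k + 1 < N:
            L[k][k + 1] = a[k]
            L[k + 1][k] = 1.0
    P = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    traces = []
    for _ in range(kmax):
        P = [[sum(P[i][m] * L[m][j] for m in range(N)) for j in range(N)] for i in range(N)]
        traces.append(sum(P[i][i] for i in range(N)))
    return traces


a, b, h = [1.0, 0.5, 2.0], [0.3, -0.1, 0.2, -0.4], 0.05
initial = trace_powers(a, b, 4)
for _ in range(200):
    a, b = discrete_toda_step(a, b, h)
final = trace_powers(a, b, 4)
```

Unlike a generic finite-difference scheme, this map conserves the integrals $\mathrm{tr}\, L^k$ up to rounding error for a finite step $h$, not merely in the limit $h \to 0$.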


For integrable Hamiltonian systems with long- 
range two-body interaction, such as Calogero- 
Moser type systems, and their so-called relativistic 
version (Ruijsenaars systems), an exact integrable 
discretization has also been found. However, at least 


in the more natural Lax representation, the related 
R-matrix is dynamical (namely, it depends on the 
phase-space coordinates), and the simple factoriza- 
tion scheme holding for the Toda lattice system (and 
for the related ones) is not available. 

Further knowledge on the intriguing subject of
“discrete integrable systems” can be acquired by
looking at the monographs and papers listed in the
“Further Reading” section. In particular, the excellent
book by Y B Suris, which also provides an exhaustive 
list of references (updated to 2003), is recommended. 


See also: Billiards in Bounded Convex Domains; 
Boundary Value Problems for Integrable Equations; 
Calogero-Moser-Sutherland Systems of Nonrelativistic 
and Relativistic Type; Integrable Systems and Discrete 
Geometry; Integrable Systems and the Inverse 
Scattering Method; Integrable Systems: Overview; 
Painlevé Equations; Quantum Calogero-Moser Systems; 
Toda Lattices; Yang-Baxter Equations.


Historical Overview


The relevance of algebraic geometry in the theory of 
dynamical systems has a long history. Three models 
may serve as guiding threads from old to the current 
state of the theory. Each time algebraic geometry is 
used to integrate an evolution equation; this is 
achieved by an underlying addition rule. The very 
origin for this seems to be Fagnano’s addition 
rule for the arc of a lemniscate (see Siegel (1969)). 
In analogy to the addition of two arcs on a circle
$x^2 + y^2 = 1$, or the duplication formula for

\[ \arcsin r = \int_0^r \frac{dr}{\sqrt{1-r^2}} \]

namely

\[ \int_0^r \frac{dr}{\sqrt{1-r^2}} = 2\int_0^u \frac{du}{\sqrt{1-u^2}} \]

if $r = 2u\sqrt{1-u^2}$ (a restatement of the trigonometric
identity $r = \sin(2x) = 2\sin x\cos x$), Fagnano found,
and proved, by substitution, a geometric rule for
duplicating the arc of a lemniscate:

\[ x^4 + 2x^2y^2 + y^4 = x^2 - y^2 \]

The length of the arc is now given by

\[ s = \int_0^r \frac{dr}{\sqrt{1-r^4}} \]

and later Gauss designated the limit of integration
by $r = \operatorname{sinlemn}(s)$. Fagnano was able to show that

\[ \int_0^r \frac{dr}{\sqrt{1-r^4}} = 2\int_0^u \frac{du}{\sqrt{1-u^4}} \]

with the substitution

\[ r^2 = \frac{4u^2(1-u^4)}{(1+u^4)^2} \]


which is remarkable not only because it doubles the 
length, but also because it does so by rational 
functions, and in fact shows that the arc of the 
lemniscate can be halved by straightedge and compass. 
Gauss showed that the constructible fractions of an arc 
of a lemniscate are the same as the ones for the circle. 
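
Fagnano's duplication rule lends itself to a quick numerical sanity check. The following minimal Python sketch compares the two sides of the identity for one sample value of u; the function names, the midpoint quadrature, and the sample value u = 0.3 are illustrative choices, not part of the original argument.

```python
import math

def lemniscate_arc(r, n=20000):
    """Arc length s(r) = integral_0^r dx / sqrt(1 - x^4), by the midpoint rule."""
    h = r / n
    return sum(h / math.sqrt(1.0 - ((i + 0.5) * h) ** 4) for i in range(n))

def fagnano_double(u):
    """Endpoint r given by r^2 = 4 u^2 (1 - u^4) / (1 + u^4)^2."""
    return 2.0 * u * math.sqrt(1.0 - u ** 4) / (1.0 + u ** 4)

u = 0.3                        # sample value, small enough that r stays below 1
r = fagnano_double(u)
lhs = lemniscate_arc(r)        # arc length up to the doubled endpoint
rhs = 2.0 * lemniscate_arc(u)  # twice the arc length up to u
print(abs(lhs - rhs))          # should be at the level of the quadrature error
```

The endpoint r is obtained from u by square roots and rational operations only, which is the feature the text singles out as remarkable.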

Thanks to subsequent work by Euler, and to the 
theory of abelian functions due to Abel, Jacobi, and 
others in the nineteenth century, we now realize that 
Fagnano's discovery revealed the algebraic group 
structure of the singular quartic curve (or of a 
smooth cubic, if preferred, an elliptic curve). 

This is the key fact that provides the "integration
by quadratures" for the simple pendulum. We
follow McKean and Moll (1997) to sketch this
prototype example of a system which is algebraically
completely integrable (ACI), defined in the section
"Hitchin systems." Newton's law gives the equation
of motion $\ddot\theta + \sin\theta = 0$, where $\theta$ parametrizes the
position of the bob in terms of the angle the
pendulum makes with the vertical axis, as it rotates
about its pivot (the length has been normalized so as
to match the gravitational constant). The energy is a
first integral, $I = \cos\theta - \tfrac{1}{2}\dot\theta^2$, and the substitution

\[ x = \sqrt{\frac{2}{1-I}}\,\sin\frac{\theta}{2} \]

linearizes the motion:

\[ t = \int_0^x \frac{dx}{\sqrt{(1-x^2)(1-k^2x^2)}} \]

with $k^2 = (1-I)/2$ between 0 and 1, precisely
because of Fagnano's and Euler's addition rule.
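
The linearization can be checked numerically: integrate the pendulum equation, apply the substitution, and compare the elapsed time with the elliptic integral. The sketch below is a hedged illustration only; the Runge-Kutta integrator, the initial data (theta = 0, angular velocity 1) and all function names are assumptions made for the example.

```python
import math

def pendulum_angle(v0, t_end, steps=2000):
    """RK4 for theta'' + sin(theta) = 0 with theta(0) = 0, theta'(0) = v0."""
    dt = t_end / steps
    th, om = 0.0, v0
    for _ in range(steps):
        k1t, k1o = om, -math.sin(th)
        k2t, k2o = om + 0.5 * dt * k1o, -math.sin(th + 0.5 * dt * k1t)
        k3t, k3o = om + 0.5 * dt * k2o, -math.sin(th + 0.5 * dt * k2t)
        k4t, k4o = om + dt * k3o, -math.sin(th + dt * k3t)
        th += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        om += dt * (k1o + 2 * k2o + 2 * k3o + k4o) / 6
    return th

def elliptic_time(x, k, n=20000):
    """Midpoint rule for integral_0^x ds / sqrt((1 - s^2)(1 - k^2 s^2))."""
    h = x / n
    return sum(h / math.sqrt((1 - s * s) * (1 - k * k * s * s))
               for s in ((i + 0.5) * h for i in range(n)))

v0, T = 1.0, 0.5
I = 1.0 - 0.5 * v0 ** 2               # first integral cos(theta) - theta'^2 / 2 at t = 0
k = math.sqrt((1.0 - I) / 2.0)        # modulus, here 0.5
theta = pendulum_angle(v0, T)
x = math.sqrt(2.0 / (1.0 - I)) * math.sin(theta / 2.0)
t_from_integral = elliptic_time(x, k)
print(T, t_from_integral)
```

The check is restricted to a short time interval, on which x increases monotonically and the integral is taken along a single branch.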

The second striking example of an addition rule
yields solutions to a nonlinear partial differential
equation (PDE). Together with the first, it will provide
the two themes of this article, and both embed into an
infinite-dimensional family of conservation laws that
will accommodate the representation-theoretic
aspect of the symmetries. In their 1895 article,
Korteweg and de Vries (KdV) gave official status to 
the (then controversial) representation of solitary 
waves in shallow water: 


\[ u_t = 6uu_x - u_{xxx} \]

(again up to normalization) is by now the well-known
KdV equation, where u represents the amplitude of the
wave and x the direction along a canal. It so happens
that by integrating twice the ordinary differential
equation (ODE) obtained by the one-wave ansatz,
$z = x - ct$ (where c is the constant velocity), one sees
that the solution u and its derivative $u_z = u'$ satisfy
identically an algebraic equation:

\[ -cu' - 6uu' + u''' = 0 \]

\[ -cu - 3u^2 + u'' + a = 0 \]

\[ -\frac{c}{2}u^2 - u^3 + \frac{1}{2}u'^2 + au + b = 0 \]

\[ u = 2\wp + \text{const.} \quad \text{(up to a linear transformation)} \]

\[ (\wp')^2 = 4\wp^3 - g_2\wp - g_3 = 4(\wp - e_1)(\wp - e_2)(\wp - e_3) \]
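
The twice-integrated equation can be tested on the simplest travelling wave. The sketch below assumes decaying boundary conditions, so that both integration constants a and b vanish, and uses the standard one-soliton profile for this normalization of KdV; the profile, the function names and the finite-difference step are assumptions of the example rather than statements from the text.

```python
import math

def soliton(z, c):
    """One-soliton profile u(z) = -(c/2) sech^2(sqrt(c) z / 2), with z = x - c t."""
    return -0.5 * c / math.cosh(0.5 * math.sqrt(c) * z) ** 2

def first_integral_residual(z, c, h=1e-5):
    """Value of -(c/2) u^2 - u^3 + u'^2 / 2 (case a = b = 0), u' by central differences."""
    u = soliton(z, c)
    du = (soliton(z + h, c) - soliton(z - h, c)) / (2 * h)
    return -0.5 * c * u * u - u ** 3 + 0.5 * du * du

residuals = [first_integral_residual(z, 4.0) for z in (-1.5, -0.3, 0.0, 0.7, 2.0)]
print(max(abs(r) for r in residuals))
```

The soliton is the degenerate case in which two roots of the cubic coincide and the Weierstrass function reduces to a hyperbolic function.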


In disguise, then, the PDE and the Hamiltonian
evolutions are the same; the motion becomes linear
(and quasiperiodic) on the torus $\mathbb{C}/\Lambda$, where $\Lambda$ is the
period lattice of the $\wp$ function. It took considerably
greater effort to generalize this correspondence to 
higher genus. This article is devoted to such a 
correspondence as well as some of the surprising 
connections between complete integrability and 
other areas of mathematics such as: representation 


theory (the corresponding geometric objects are 
Grassmann manifolds as opposed to Jacobians); 
differential algebras (Weyl algebras, commutative 
rings of differential operators, and differential 
Galois theory) and reduction in symplectic 
geometry. 

It is often helpful to highlight the relevant features 
in the simplest example, even if it is of special kind. 
The KdV equation and, as Hamiltonian counterpart, 
Neumann's system (see Neumann (1859)) will serve 
best. The abelian sum identified by Fagnano cannot
be defined on points of a curve X of genus $g > 1$;
what one can add are points of the g-fold symmetric
product $X^{(g)}$ up to linear equivalence, defining (up to
noncanonical isomorphism) an abelian variety, the
Jacobian $\mathrm{Jac}(X) = \mathbb{C}^g/\Lambda$; analytically, the Jacobian is
described by abelian coordinates $z_1, \ldots, z_g$: if
$\alpha_1, \ldots, \alpha_g, \beta_1, \ldots, \beta_g$ is a basis of 1-cycles on X
with standard intersection matrix and $\omega_1, \ldots, \omega_g$ is
the dual basis of holomorphic differentials then
$z_j = \sum_{i=1}^{g}\int_{P_0}^{P_i}\omega_j$ is defined in terms of a fixed base
point $P_0 \in X$ and of $(P_1, \ldots, P_g) \in X^{(g)}$ up to the
period lattice $\Lambda$. It is in these coordinates that the
Hamiltonian flows become linear. In canonical
coordinates $q_1, \ldots, q_{g+1}; p_1, \ldots, p_{g+1}$, the harmonic
oscillator

\[ \dot q_i = p_i, \qquad \dot p_i = -e_iq_i \]

when constrained to the unit sphere $\sum_i q_i^2 = 1$ has
equations

\[ \dot q_i = p_i, \qquad \dot p_i = -e_iq_i + q_i\sum_j (e_jq_j^2 - p_j^2) \]

This system is completely integrable in the sense that 
there exist enough involutory invariants, g gener- 
ically (in the (qj, p;) variables) independent functions 
on the 2g-dimensional tangent bundle of the unit 
sphere with canonical symplectic structure; in fact 
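the coefficients of the polynomial

\[ f(\lambda) = \prod_{j=1}^{g+1}(\lambda - e_j)^2\left[\left(\sum_{j=1}^{g+1}\frac{q_j^2}{\lambda - e_j}\right)\left(1 + \sum_{j=1}^{g+1}\frac{p_j^2}{\lambda - e_j}\right) - \left(\sum_{j=1}^{g+1}\frac{q_jp_j}{\lambda - e_j}\right)^2\right] \]

are invariant and the hyperelliptic Riemann surface
X whose model in the affine plane is given by
$\mu^2 = f(\lambda)$ is called the spectral curve of the system.
Since the polynomial $f(\lambda)$ is monic of degree
$2g+1$ and has generically simple roots, X has
genus g. A change of variables permits integration
by quadratures,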


\[ q_k(t) = \frac{\vartheta[\eta_{2k-1}](0)\,\vartheta[\eta_{2k-1}](z_0 - \Delta + 2\sqrt{-1}\,tU)}{\vartheta[0](0)\,\vartheta[0](z_0 - \Delta + 2\sqrt{-1}\,tU)} \]

where $z_0, U \in \mathbb{C}^g$ are constant vectors, $\vartheta$ denotes the
Riemann theta function of X, $\eta_k$ ($k = 1, \ldots, 2g+1$) are
theta characteristics and $\Delta$ is the Riemann constant.
While these are technical objects of classical
Riemann function theory whose detailed definition
is best found in a textbook (see, e.g., Mumford
(1984)), the point here is that the motion is
linearized along the line with direction U, on the
hyperelliptic Jacobian Jac(X), which is a $2^{g+1}:1$
cover of the phase space.
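
Complete integrability of the constrained oscillator can be observed numerically. The sketch below integrates the equations of motion quoted above for g = 2 and monitors the classical Uhlenbeck integrals $F_k = q_k^2 + \sum_{j\neq k}(q_kp_j - q_jp_k)^2/(e_k - e_j)$; these are not written out in the text and are taken here as a known generating set of invariants (the residues of $f(\lambda)/\prod_j(\lambda - e_j)^2$). The values of e, the initial data and the Runge-Kutta scheme are illustrative assumptions.

```python
def neumann_rhs(e, q, p):
    """Right-hand side of the constrained flow: q' = p, p' = -e q + q * lam."""
    lam = sum(ej * qj * qj - pj * pj for ej, qj, pj in zip(e, q, p))
    return list(p), [-ej * qj + qj * lam for ej, qj in zip(e, q)]

def rk4_step(e, q, p, dt):
    def moved(v, dv, s):
        return [a + s * b for a, b in zip(v, dv)]
    k1q, k1p = neumann_rhs(e, q, p)
    k2q, k2p = neumann_rhs(e, moved(q, k1q, dt / 2), moved(p, k1p, dt / 2))
    k3q, k3p = neumann_rhs(e, moved(q, k2q, dt / 2), moved(p, k2p, dt / 2))
    k4q, k4p = neumann_rhs(e, moved(q, k3q, dt), moved(p, k3p, dt))
    q_new = [a + dt / 6 * (b + 2 * c + 2 * d + f)
             for a, b, c, d, f in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [a + dt / 6 * (b + 2 * c + 2 * d + f)
             for a, b, c, d, f in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new

def uhlenbeck(e, q, p):
    """Uhlenbeck integrals F_k (an assumption of this sketch, see the lead-in)."""
    n = len(e)
    return [q[k] ** 2 + sum((q[k] * p[j] - q[j] * p[k]) ** 2 / (e[k] - e[j])
                            for j in range(n) if j != k)
            for k in range(n)]

e = [1.0, 2.0, 3.5]                          # distinct frequencies, g + 1 = 3
q, p = [0.6, 0.0, 0.8], [0.4, 0.7, -0.3]     # |q| = 1 and q.p = 0
start = uhlenbeck(e, q, p)
for _ in range(2000):
    q, p = rk4_step(e, q, p, 1e-3)
end = uhlenbeck(e, q, p)
drift = max(abs(a - b) for a, b in zip(start, end))
norm_error = abs(sum(x * x for x in q) - 1.0)
print(drift, norm_error)
```

The integrals sum to $|q|^2 = 1$, so only g of them are independent, in agreement with the count given in the text.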

A yet deeper fact links the integrable Hamiltonian
motion and the (soliton) PDE, namely the statement
that $\sum_{j=1}^{g+1}(e_jq_j^2 - p_j^2) = u(t_1, t_3)$ solves the KdV
equation, where the variables are renamed as
$x = t_1$, $t = t_3$ to denote two of the g commuting
Hamiltonian flows.

The Neumann system as well allows us to uncover 
another deep relation between dynamics and geo- 
metry, namely the moduli aspect: on the one hand, 
Mumford (1984) used the Neumann system to recover 
the equation of the spectral curve from a vanishing 
property of theta functions with characteristics, 
thereby giving the first characterization of the moduli 
subvariety of hyperelliptic curves in terms of thetanulls 
(for any genus). On the other hand, Françoise (1987)
explored the relevance of the integrable system to the
Picard-Fuchs equations. The fundamental link is
provided by Arnol'd's theory, according to which a
set of action-angle variables $(q_i, p_i)$, $i = 1, \ldots, n$, for a
completely integrable Hamiltonian system can be
calculated in terms of a basis $\gamma_j$ of the first homology
of the fibers, which are n-dimensional tori,
$\oint_{\gamma_j} dq_i = \delta_{ij}$; hence, in the case of an algebraically
integrable system such as the Neumann example (or,
in Françoise's paper, the Kowalevski top), in principle
one can express the (coefficients of the) differential
equations satisfied by the periods in terms of the
commuting Hamiltonians, despite the fact that
periods and Hamiltonians are transcendental
functions of one another. A more general family of
period matrices is subject to the Gauss-Manin 
connection, and the question of whether its general 
abelian variety is Lagrangian with respect to a 
holomorphic symplectic structure on the family yields 
a cubic condition on the periods (Donagi and Mark- 
man 1996). 

These are two major applications of PDEs to 
algebraic geometry: characterizing subvarieties of 
moduli spaces (of curves) and expressing the
Gauss-Manin connection acting on sections of a
Hodge-theoretic bundle over the moduli space in 
terms of the evolution equations of a completely 
integrable system. In the former case, the flows of 
the system act on the theta functions of a (fixed) 
curve; in the latter, the Hamiltonians are related, 
via the action variables, to computing the mono- 
dromy over the branch points of the base of the 
system. The generalization of specific (e.g., hyper- 
elliptic) cases is very difficult to work out and 
remains largely open 40 years after the field of 
integrable equations started being actively 
investigated. 

Before concluding this historical overview, a 
beautiful theory that escaped attention is worth 
mentioning. In the late nineteenth century, for 
example, Baker (1907) constructed the first genus-2 
solutions of the KdV equation, although he was 
apparently not aware of the equation itself; in the 
process, he also defined what is known as the Hirota 
bilinear operator, a device introduced by R Hirota 
in the 1970s to capture an equivalent version of the 
KdV, or the more general Kadomtsev-Petviashvili 
(KP) equation, 


(ty — titi, + ys), = yy 


Just as the Lax pair allows for a linearization of the 
isospectral deformations, Hirota's bilinear form 
reveals the representation-theoretic (and algebro- 
geometric) nature of the equations, via the vanishing 
of a natural pairing on a pair of solutions, besides 
providing a formula for exact solutions; the defini- 
tion of the bilinear operation is the following: for 
functions F and G, 


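\[ D_{t_j}^n F\cdot G = \left(\frac{\partial}{\partial t_j} - \frac{\partial}{\partial t_j'}\right)^n F(t)\,G(t')\Big|_{t'=t}, \qquad t = (t_1, t_2, t_3, \ldots) \]

so that Hirota's direct method gives the following
solution: set $u = 2(\partial^2/\partial x^2)\log F$, then

\[ \text{KdV} \iff (D_xD_t + D_x^4)\,F\cdot F = 0 \]

\[ \text{KP} \iff (D_x^4 + 3D_y^2 - 4D_xD_t)\,F\cdot F = 0 \]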
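
On exponentials the bilinear operator acts by $D_x^mD_t^n\,e^{\eta_a}\cdot e^{\eta_b} = (k_a - k_b)^m(w_a - w_b)^n\,e^{\eta_a + \eta_b}$ with $\eta = kx + wt$, which turns the bilinear KdV equation into exact arithmetic on coefficients. The sketch below uses this to check the one- and two-soliton functions F. The dispersion relation $w = -k^3$ and the two-soliton coefficient $((k_1 - k_2)/(k_1 + k_2))^2$ are the standard ones for this bilinear form and are assumptions of the example, as are all function names.

```python
from fractions import Fraction
from itertools import product

def hirota(poly, F, G):
    """Apply P(D_x, D_t) to F.G, where F and G are finite sums of exponentials.

    F and G map (k, w) -> coefficient, meaning sum of c * exp(k x + w t).
    poly(dk, dw) is the polynomial P evaluated at the differences of exponents.
    """
    out = {}
    for ((ka, wa), ca), ((kb, wb), cb) in product(F.items(), G.items()):
        key = (ka + kb, wa + wb)
        out[key] = out.get(key, 0) + ca * cb * poly(ka - kb, wa - wb)
    return {key: c for key, c in out.items() if c != 0}

def kdv_bilinear(dk, dw):
    """Symbol of D_x D_t + D_x^4."""
    return dk * dw + dk ** 4

def one_soliton(k):
    k = Fraction(k)
    return {(Fraction(0), Fraction(0)): Fraction(1), (k, -k ** 3): Fraction(1)}

def two_soliton(k1, k2):
    k1, k2 = Fraction(k1), Fraction(k2)
    a12 = ((k1 - k2) / (k1 + k2)) ** 2   # interaction coefficient
    return {
        (Fraction(0), Fraction(0)): Fraction(1),
        (k1, -k1 ** 3): Fraction(1),
        (k2, -k2 ** 3): Fraction(1),
        (k1 + k2, -(k1 ** 3 + k2 ** 3)): a12,
    }

F1 = one_soliton(2)
F2 = two_soliton(1, 2)
print(hirota(kdv_bilinear, F1, F1), hirota(kdv_bilinear, F2, F2))
```

An empty dictionary means that every exponential coefficient of the bilinear expression cancels, that is, F satisfies the bilinear equation identically.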

Baker was intent on generalizing the properties of
the Weierstrass $\wp$ function. He focused on genus 2
(and obtained partial results for general genus), in
which case any curve is hyperelliptic,

\[ f:\ \mu^2 = 4\lambda^5 + a_4\lambda^4 + \cdots + a_0 \]

and used a suitable basis of holomorphic differentials
particular to the hyperelliptic case, whose
integrals give abelian coordinates $z_i$ that happen to
be dual to the KdV flows,

\[ \frac{d\lambda}{\mu}, \qquad \frac{\lambda\,d\lambda}{\mu} \]

to characterize the genus-2 theta function by
differential equations (equivalent to the KdV hierarchy),
as well as give the quartic equation for the
Kummer surface in $\mathbb{P}^3$, namely the 2:1 image of the
Jacobian of the curve mapped by the divisor $2\Theta$,
that is, by a basis of the space of theta functions
with second-order characteristics, simply as the
determinant of

\[ \begin{pmatrix} -a_0 & \tfrac{1}{2}a_1 & 2\wp_{11} & -2\wp_{12} \\ \tfrac{1}{2}a_1 & -(a_2 + 4\wp_{11}) & \tfrac{1}{2}a_3 + 2\wp_{12} & 2\wp_{22} \\ 2\wp_{11} & \tfrac{1}{2}a_3 + 2\wp_{12} & -(a_4 + 4\wp_{22}) & 2 \\ -2\wp_{12} & 2\wp_{22} & 2 & 0 \end{pmatrix} \]

where

\[ \wp_{ij}(z) = -\frac{\partial^2\log\sigma(z)}{\partial z_i\,\partial z_j} \]

and the $\sigma$ function, defined in analogy to the genus-1
case, is proportional to the Riemann theta function.

To summarize this introduction, the exchange 
between algebraic geometry (the classification of 
algebraic varieties) and dynamical systems has been 
extremely fruitful in either direction: algebraic 
geometry surprisingly provides exact solutions to 
evolution equations that have special algebraic 
symmetries (and arise in nature!), and conversely 
those very evolutionary equations yield the structure 
of particularly complicated varieties, by characteriz- 
ing their (rational) functions. 


Isospectral Deformations 


The isospectral deformations in question have been 
encoded by Lax-pair equations, which take their 
name from Peter D Lax, who gave a version of the 
KdV equation in such form. 

Lax pairs enter in two essentially different ways in
the theory of integrable systems. The evolution
equations take the form $\partial_tL = [B, L]$, where t ranges
over $t_1, t_2, t_3, \ldots$, a sequence of commuting time flows,
L is an operator whose coefficients depend on time,
and B is another operator of the same kind; since
heuristically this is the infinitesimal version of the
equation $L(t) = U(t)^{-1}L(0)U(t)$ (with $B = -U^{-1}\partial_tU$),
the spectrum of L is preserved and provides
conserved quantities; in fact, Moser (1980) speculated
that every completely integrable system might
have such a form.
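
The isospectral property is easy to observe on a small matrix example. The sketch below integrates $\dot L = [B, L]$ for a symmetric $3\times 3$ matrix L with B the difference of its strictly upper and strictly lower triangular parts; this choice of B (the one used for the Toda lattice) and the initial matrix are illustrative assumptions, not taken from the passage. The traces of powers of L, which are symmetric functions of the spectrum, should stay constant while L itself moves.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def skew_part(l):
    """B = (strictly upper part of L) - (strictly lower part of L)."""
    n = len(l)
    return [[l[i][j] if j > i else (-l[i][j] if j < i else 0.0) for j in range(n)]
            for i in range(n)]

def lax_rhs(l):
    return commutator(skew_part(l), l)

def axpy(a, b, s):
    return [[x + s * y for x, y in zip(r, t)] for r, t in zip(a, b)]

def rk4_flow(l, dt, steps):
    for _ in range(steps):
        k1 = lax_rhs(l)
        k2 = lax_rhs(axpy(l, k1, dt / 2))
        k3 = lax_rhs(axpy(l, k2, dt / 2))
        k4 = lax_rhs(axpy(l, k3, dt))
        incr = [[(a + 2 * b + 2 * c + d) / 6 for a, b, c, d in zip(r1, r2, r3, r4)]
                for r1, r2, r3, r4 in zip(k1, k2, k3, k4)]
        l = axpy(l, incr, dt)
    return l

def power_traces(l, kmax=3):
    out, m = [], l
    for _ in range(kmax):
        out.append(sum(m[i][i] for i in range(len(l))))
        m = matmul(m, l)
    return out

l0 = [[1.0, 1.0, 0.0], [1.0, 0.0, 2.0], [0.0, 2.0, -1.0]]
l_end = rk4_flow(l0, 1e-3, 500)
before, after = power_traces(l0), power_traces(l_end)
print(before, after)
```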


In the form that immediately yields a hierarchy of
PDEs, the (hierarchy of) deformations pertain to a
ring of (formal) pseudodifferential operators, where
the variable $x = t_1$ is singled out and $\partial$ denotes
differentiation with respect to x:


L(t)eD- D» uj(x)O!,u; analytic near x = | 


j-0 


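The multiplication rule that makes $\mathcal{P}$ into a ring (in
fact, a $\mathbb{C}$-algebra) is composition:

\[ \partial\circ u = u\partial + u' \]

\[ \partial^{-1}\circ u = u\partial^{-1} - u'\partial^{-2} + u''\partial^{-3} - \cdots \]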
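
Both rules are instances of the generalized Leibniz formula $\partial^n\circ u = \sum_{k\ge 0}\binom{n}{k}u^{(k)}\partial^{n-k}$, valid for every integer n with the generalized binomial coefficient. The sketch below computes these coefficients and verifies, up to the truncation order, that $\partial\circ(\partial^{-1}\circ u) = u$. The representation of an operator as a dictionary keyed by (order of derivative of u, power of $\partial$) is an ad hoc choice made for this illustration.

```python
from fractions import Fraction

def gen_binom(n, k):
    """Generalized binomial coefficient n (n-1) ... (n-k+1) / k! for any integer n."""
    c = Fraction(1)
    for i in range(k):
        c = c * (n - i) / (i + 1)
    return c

def d_power_times_u(n, terms):
    """First `terms` terms of d^n o u, as {(k, n - k): coeff} for coeff * u^(k) * d^(n-k)."""
    return {(k, n - k): gen_binom(n, k) for k in range(terms) if gen_binom(n, k) != 0}

def d_times(op):
    """Left-compose with d, using d o (c u^(k) d^m) = c u^(k) d^(m+1) + c u^(k+1) d^m."""
    out = {}
    for (k, m), c in op.items():
        for key in ((k, m + 1), (k + 1, m)):
            out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

print(d_power_times_u(1, 5))            # u d + u'
print(d_power_times_u(-1, 3))           # u d^-1 - u' d^-2 + u'' d^-3
print(d_times(d_power_times_u(-1, 4)))  # u plus a single truncation remainder
```

The remainder in the last line sits at the first neglected order and moves further out as more terms are kept.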


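We normalize L by an automorphism of $\mathcal{D}$
(generated by a change of variable and conjugation
by a function)

\[ L = \partial^n + u_{n-2}(x)\partial^{n-2} + \cdots + u_0(x) \]

In $\mathcal{P}$ any (normalized) L has a unique nth root,
$n = \operatorname{ord} L$, of the form $\mathcal{L} = \partial + u_{-1}(x)\partial^{-1} +
u_{-2}(x)\partial^{-2} + \cdots$. Finally, the deformation equations,

\[ \partial_{t_j}\mathcal{L} = [(\mathcal{L}^j)_+, \mathcal{L}] \]

where $(\cdot)_+$ denotes the differential-operator part,
define the KP hierarchy, which takes its name from
the first nontrivial deformation equation, known as
the KP equation encountered above, if we set
$x = t_1$, $y = t_2$, $t = t_3$ (notice that this reduces to KdV,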
up to rescaling, when the solution is independent 
of y). The algebro-geometric solutions are those 
with the property that only a finite number of time 
evolutions are independent. This turned out to be 
equivalent to a classical problem of elementary 
differential algebra, known as the Burchnall- 
Chaundy problem after the two co-authors who 
solved it in the 1920s. 

The Burchnall-Chaundy problem: which L(x)'s
have centralizer $C_{\mathcal{D}}(L)$ that is larger than a
polynomial ring $\mathbb{C}[L_1]$, $L_1 \in \mathcal{D}$? The key to the
solution is the following fact (which clearly does
not hold for operators in more than one variable,
or finite-dimensional operators such as matrices):
if $\operatorname{ord} L > 0$ and $A, B \in \mathcal{D}$ both commute with L,
then $[A, B] = 0$; in particular, $C_{\mathcal{D}}(L)$ is commutative,
hence every maximal-commutative subalgebra
of $\mathcal{D}$ is a centralizer. It was proved in the early
1900s by I Schur that $C_{\mathcal{D}}(L) = \{\sum_{j=-\infty}^{N} c_j\mathcal{L}^j,\ c_j \in \mathbb{C}\} \cap
\mathcal{D}$. It follows that centralizers are rings of affine
curves: their transcendence degree over the field of
coefficients is 1, and $\operatorname{Spec} C_{\mathcal{D}}(L)$ can be regarded as
an affine curve $X_0$ (with natural compactification
X by a smooth point at infinity). Burchnall and
Chaundy proceeded to show that the rings of 
operators whose orders are not all multiples of a 
fixed integer >1, and having the same spectral 
curve X (up to isomorphism), correspond to line 
bundles over X (more precisely, rank-1 torsion- 
free sheaves); thus, the hierarchy of evolutions 
linearizes on Jac X, as indicated by the examples 
treated above. 

In this setting, it has been very challenging to 
generalize the integrable flows, both to the higher- 
rank and to the higher-dimensional case. When all 
the operators in the commutative ring have order 
divisible by an integer r > 1, their common kernel 
defines a rank-r vector bundle over the spectral 
curve, and although the theory in principle is 
similar to the case of line bundles, there are no 
explicit formulas for solution. On the other hand, 
in order that the spectrum be a variety X of 
dimension d > 1 rather than a curve, it is natural to
seek commutative rings of partial differential 
operators in d variables; but again, while some 
constructions work in principle, explicit formulas 
are elusive. 

The form in which Lax pairs occur for finite- 
dimensional Hamiltonian systems is quite differ- 
ent: here what is preserved is the spectrum of a 
finite-dimensional linear operator, a matrix. The 
first examples, from which the theory took off, 
were inspired guesses. The Neumann system 
described above fits in the following theory: 
Moser (1980) showed that the Neumann system
together with other important classical examples
are special cases of rank-2 perturbations (since
$2 = \dim\langle p, q\rangle$) which preserve the spectrum of a
matrix

\[ L = A + a\,q\otimes q + b\,q\otimes p + c\,p\otimes q + d\,p\otimes p \]

where A is a fixed constant matrix which can be
normalized to a diagonal, $\operatorname{diag}(e_1, \ldots, e_{g+1})$,


a b 
de AES 


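and $u\otimes v$ denotes the matrix $[u_iv_j]$. The symplectic
structure is the standard $\omega = \sum_j dp_j\wedge dq_j$ so that a
Hamiltonian H defines a flow

\[ \dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i} \]

and

\[ \{H, G\} = \sum_j\left(\frac{\partial H}{\partial q_j}\frac{\partial G}{\partial p_j} - \frac{\partial G}{\partial q_j}\frac{\partial H}{\partial p_j}\right) \]

The Hamiltonian flow of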


H => P q) + (b + c)(Bq,p) + d(Bp, p) 


— b; 
z dii — api?) 


(where B —diag(bi,...,b,,1) is any fixed diagonal 
matrix) is equivalent to the Lax-pair equation 
L — [M,L], where M is a suitable matrix: 


1 


M =3 


5 — Ê, 
(b — [bib] + (ad — be) |=" (ai - ip) 


The Weinstein-Aronszajn formula 


det [^ = DE G n) = der( 1, — [En n)1) 
i-i 


(where each of the £(5...,€4 m,..,9, is a 
(g + 1 — 1)-vector) gives for the spectral invariants 


ICA) ian - det(A — L) 
e(À)  det(A— A) 
= det(I — ((A— ay q) & (aq + bp) 
— ((A — A) !p) & (cq + dp)) 
= det(I5 — Wy(q, p)) 
with 
(A—4A) 4,4) (A-A) !q.p) 
W.(q.p) = 
"m" jeu (( ) p.D) 
a b 
ap à 


and det(I — Wy(q,p)) - 1 — tr Wy, + det W,.—1— 
QX(q,p), defining the rational function à. 

Moser also showed that the system is completely 
integrable and linearizes on the (generalized) Jaco- 
bian of the curve ju? — e?(A)óy (x, y). Letting a= — 1, 
b— —c—1,d —0 gives the Neumann system. 

The dilation q àq gives a Lax pair with 
a parameter, A A + Mq&q--A(q&p-p«aq), 
which makes the spectral curve look more natural. 
Indeed, 


Remark (Adler and van Moerbeke 1980). The 
Neumann flow is equivalent to the Lax pair: 
Li -[Mi, Li], where Li =A}? + uq 8 p — p & q) 
q®q and Mi4—Ayu-4-qG p — pq. Moreover, the 
Hamiltonians are of Adler-Kostant-Symes (AKS) 
type, namely projections (with respect to an ad- 
invariant inner product) of gradients of orbit-invariant 
functions to half of the splitting of a Lie algebra. 
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Specifically, oes ~~ Ait! |A; € gli, C) - KG N, with 

K-—(Y5 A;u!) € N-(Y Aj}; if the inner 
product is (A,B) 5, tr A; B;, the dual of N 
can be identified with K = K+, and the Hamiltonian 
for the Neumann flow can be taken to be 
H = ((1/2)(Lip?)’, p^,,1) under the Lie-Poisson 
brackets and suitable reduction. The flows linearize 
on the Jacobian of the (hyperelliptic) curve 
det(L1 — n) = 


It is possible to recover the link between the finite
and infinite integrable systems (Neumann and KdV)
mentioned in the introductory overview, if we notice
that squared eigenfunctions for the Lax operator
$L = \mathcal{L}^2 = \partial^2 + u$ become algebraic on the spectral
curve: Dubrovin et al. (2001) introduced the Baker
function, namely the unique function $\psi(x, P)$ with
the following properties:


(i) For |x| sufficiently small it is meromorphic on $X\setminus\{P_\infty\}$,
with pole divisor bounded by $\delta = P_1 + \cdots + P_g$,
independent of x, such that $h^0(\delta - P_\infty) = 0$,
and near $P_\infty$, $\psi(x, P)e^{-xz} = 1 + O(z^{-1})$ is
holomorphic, with z chosen to be $\lambda^{1/2}$ in our case.

(ii) We let Q be the unique meromorphic differential 
with zeros on ó and a double pole of the 
form (—\+holomorphic)dz! at P4. Note: 
(1) that Riemann-Roch show that Q is unique. 
(2) We also get a characterization of the dual 
Baker function, defined as w(x,;P) in the 
hyperelliptic case where ; is the involution 
(A,u) — (A, ^u), as meromorphic on X \ {P} 
with poles bounded by 6’ and behavior e**(1 + 
O(z !)) near P, where 6 + 6’ are the 2g zeros of 2. 
(3) Furthermore, Q — dA/W(v, 6), where W is 
the Wronskian (with respect to the variable x). 
Then, upon fixing a meromorphic function Pb, 
normalized at P4,, b = A! -- entire, with g + 1 
fixed poles distinct from 6, we have: 


If pj = Rese hQ, qj = V/pjl(x, ej), Pi 


u(x) and (q;.pj) satisfy the Neumann system. 


Indeed, the constraints follow from the “residue 
theorem” applied to the differential by% (it has 
a residue of —1 at P); the differential equations 
qd;—e;q;— uq; follow from the assumption 
Ly = Av. 

The function u= —2 Heel (Soe egg, evolving 
under suitable abelian flows, is a solution of the 
KdV equation; the “times” of the KdV hierarchy are 
linear combinations of the Neumann Hamiltonians; 
more precisely, of the invariant vector fields deter- 
mined by the tangent directions to the image of X in 


ma JV PIP(X, ej), 
then Detar =1, DE ajp; — 0, D8 (eq? - p2) = 


JacX, with Abel map normalized at P4,, at some 
point P: Dp= 3€ ,A(P)* *D,. 
[he other way around 

McKean-van Moerbeke), 


(Moser-Trubowitz, 


If L—0? -u(x) is a finite-gap operator and 
01,...,€g41 are among the 2g -- 1 edges of the 
gaps, there exist constants pi,...,pg.1 so that 
the functions pj(x)— VPM x,e;) satisfy X" 
p; (x x)=1. Since Lw;-—ej;v;, the pj;(x) solve the 
Neumann system. 


The squared eigenfunctions also provide a natural 
interpretation for Moser's Lax pair. If V, is the 
kernel of L — A, then the Baker function v(x, P) and 
its dual ó(x,P) give a basis of V, except at the 
branch points (e;,0) where w=. But then the 
normalized basis of V, is related to v, ó by a 


constant matrix: 
Yj c|? 
y1 D 


v| |p O flo 
"sj lo ullo 


if B is the differential operator of the Burchnall- 
Chaundy ring corresponding to multiplication by ju, 
so that 


while 


1 FO a 
c=5l ¢ 4 | 
-ó wv 


x=0 


with W=yd' — Y'o. Finally, we calculate: 


c o 1c - s [e 2uJ/ | 
0 -u W| -2y =U + We) 


so that U(A) — vo! + Y'o, V(A) = 24h, W(A) 224 o" 
are polynomials like the entries of W;(q,p) - e*(A), 
and the fact that UW + V^ does not depend on x 
expresses the fact that W — constant. 

An object that links the two distinct occurrences
of Lax pairs is Sato's infinite-dimensional Grassmann
manifold. One particular model will serve as
illustration, with more general settings covered by
Dickey (2003). Sato defined a one-to-one correspondence
between cyclic $\mathcal{D}$-submodules $\mathcal{Z}$ of $\mathcal{P}$, namely
of the type $\mathcal{Z} = \mathcal{D}S$ (which turns out to be equivalent
to the property: $\mathcal{P} = \mathcal{Z}\oplus\mathcal{P}^{(-1)}$), and subspaces of a
ring of formal power series, which make up an
infinite-dimensional Grassmann manifold, more
precisely elements of $\mathrm{Gr}^{\emptyset}$, the "big cell." This way,
KP can be viewed as deformation of $\mathcal{D}$-modules.

There are two ways to set up the Grassmannian:
(1) more direct as a limit of finite-dimensional
Grassmannians; (2) more intrinsic, using the rings
$\mathcal{D}\subset\mathcal{P}$:


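1. Let $\dim V = m + n = N$, $\mathrm{Gr}(m, V) = \{m\text{-frames in } V\}/GL(m) \hookrightarrow \mathbb{P}(\wedge^mV)$
via $(\xi^{(0)}, \ldots, \xi^{(m-1)}) \mapsto \xi^{(0)}\wedge\cdots\wedge\xi^{(m-1)}$.

If we fix a basis $e_0, \ldots, e_{N-1}$ of V, and
write a frame in coordinates, $\xi^{(j)} = \xi_{0,j}e_0 + \cdots +
\xi_{N-1,j}e_{N-1}$, then

\[ \xi^{(0)}\wedge\cdots\wedge\xi^{(m-1)} = \sum_{0\le l_0<\cdots<l_{m-1}<N}\xi_{l_0\ldots l_{m-1}}\,e_{l_0}\wedge\cdots\wedge e_{l_{m-1}} \]

with $\xi_{l_0\ldots l_{m-1}} = \det(\xi_{l_i,j})_{i,j=0,\ldots,m-1}$.

A point in the ambient $\mathbb{P}(\wedge^mV)$ lies in the embedded
$\mathrm{Gr}(m, V)$ $\iff$ its projective coordinates $\xi_{l_0\ldots l_{m-1}}$
($0\le l_i<N$) satisfy the Plücker relations (PRs):

\[ \sum_{i=0}^{m}(-1)^i\,\xi_{k_0\ldots k_{m-2}l_i}\,\xi_{l_0\ldots\hat{l_i}\ldots l_m} = 0 \]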
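
In the smallest nontrivial case, m = 2 and N = 4, there is a single Plücker relation, $\xi_{01}\xi_{23} - \xi_{02}\xi_{13} + \xi_{03}\xi_{12} = 0$ (take $k_0 = 0$ and $(l_0, l_1, l_2) = (1, 2, 3)$ in the sum above). The sketch below evaluates it on the minors of a 2-frame and on a vector that does not come from any frame; the sample vectors and function names are illustrative.

```python
from itertools import combinations

def pluecker_coords(a, b):
    """2 x 2 minors of the frame with rows a and b, indexed by pairs i < j."""
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(len(a)), 2)}

def pluecker_relation(p):
    """Left-hand side of the single Pluecker relation for Gr(2, 4)."""
    return p[(0, 1)] * p[(2, 3)] - p[(0, 2)] * p[(1, 3)] + p[(0, 3)] * p[(1, 2)]

on_grassmannian = pluecker_coords([1, 2, 3, 4], [0, 1, 5, 7])
not_decomposable = {(0, 1): 1, (0, 2): 0, (0, 3): 0, (1, 2): 0, (1, 3): 0, (2, 3): 1}
print(pluecker_relation(on_grassmannian), pluecker_relation(not_decomposable))
```

The second vector corresponds to $e_0\wedge e_1 + e_2\wedge e_3$, which is not a wedge of two vectors, so the relation fails for it.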


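Therefore,

\[ \mathrm{Gr}(m, V) = (\widetilde{\mathrm{Gr}}(m, V)\setminus\{0\})/GL(1) \]

where
$\widetilde{\mathrm{Gr}}(m, V) = \{(\xi_Y)_{Y\subset\Delta_{m,n}}$ satisfying the PRs$\}$ is a line
bundle over $\mathrm{Gr}(m, V)$, Y is a Young diagram
consisting of rows

\[ l_{m-1} - (m-1),\ \ldots,\ l_1 - 1,\ l_0 \]

so it is contained in the rectangle $\Delta_{m,n}$.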
For the commutative diagram: 


—— 


Gr(m',N') Pt  Gr(m, N) 
| identity 


Gr(m', Ny ene Gr(m, N) 


| identity 


the following facts can be checked. Let
$m \le m'$, $n \le n'$, $N' = m' + n'$:

(i) if $(\xi_Y)_{Y\subset\Delta_{m',n'}}$ satisfies PRs, so does its
restriction to Y's within $\Delta_{m,n}$;

(ii) if $(\xi_Y)_{Y\subset\Delta_{m,n}}$ satisfies PRs, so does $(\xi_Y)_{Y\subset\Delta_{m',n'}}$
where $\xi_Y = 0$ unless $Y\subset\Delta_{m,n}$.

These facts make it possible to define:
$\mathrm{Gr} = (\widetilde{\mathrm{Gr}}\setminus\{0\})/GL(1)$, where $\widetilde{\mathrm{Gr}} = \{(\xi_Y)_{Y\ \text{all Young diagrams}}$
satisfying all PRs$\}$


Gr Prviect Gr(m, N) 


T dense | identity 
Grn embed Gr (ym, N) 
and 
Gr?" = ((£) € Gr: £y = 0 for almost all Y} 
= U Gr(m, N) 
m.N 


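The KP time deformations are defined as follows:

\[ \xi_Y(t) := \sum_{\text{all } Y'}\chi_{Y'/Y}(t)\,\xi_{Y'} \quad\text{where}\quad \chi_{Y'/Y}(t) := \det\big(p_{l'_i - l_j}(t)\big) \]

\[ p_0(t) = 1, \qquad p_n(t) = \sum_{\nu_1 + 2\nu_2 + 3\nu_3 + \cdots = n}\frac{t_1^{\nu_1}t_2^{\nu_2}t_3^{\nu_3}\cdots}{\nu_1!\,\nu_2!\,\nu_3!\cdots} \]

Write $\chi_{Y/\emptyset}$ as $\chi_Y$, where $\chi_Y(t) = \det(p_{l_i - j}(t))$ are
the Schur functions.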
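
The polynomials $p_n(t)$ are the coefficients of $\exp(\sum_k t_kz^k) = \sum_n p_n(t)z^n$, which gives the recursion $n\,p_n = \sum_{k=1}^{n}k\,t_k\,p_{n-k}$. The sketch below computes them exactly and evaluates a Schur function through the determinant $\det(p_{l_i - j})$, with the Young diagram encoded by a strictly increasing sequence $l_0 < \cdots < l_{m-1}$ whose rows have lengths $l_i - i$. Function names and sample values are illustrative, and the small Laplace-expansion determinant is only meant for tiny matrices.

```python
from fractions import Fraction

def elementary_schur(t, n_max):
    """[p_0, ..., p_{n_max}] with exp(sum_k t_k z^k) = sum_n p_n z^n; t[k-1] holds t_k."""
    p = [Fraction(1)]
    for n in range(1, n_max + 1):
        s = sum((k * Fraction(t[k - 1]) * p[n - k]
                 for k in range(1, n + 1) if k <= len(t)), Fraction(0))
        p.append(s / n)
    return p

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if not a:
        return Fraction(1)
    total = Fraction(0)
    for j, x in enumerate(a[0]):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * x * det(minor)
    return total

def schur(ls, t):
    """chi_Y(t) = det(p_{l_i - j}) for a nonempty strictly increasing list ls."""
    m = len(ls)
    p = elementary_schur(t, max(ls))
    entry = lambda n: p[n] if n >= 0 else Fraction(0)
    return det([[entry(ls[i] - j) for j in range(m)] for i in range(m)])

t = [2, 3, 5]                    # sample values of t_1, t_2, t_3
print(elementary_schur(t, 3))    # p_2 = t_1^2/2 + t_2, p_3 = t_1^3/6 + t_1 t_2 + t_3
print(schur([0, 1], t), schur([0, 2], t), schur([1, 2], t))
```

Here `[0, 1]` encodes the empty diagram, `[0, 2]` a single box, and `[1, 2]` a column of two boxes, for which the determinant gives $t_1^2/2 - t_2$.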
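To connect with the KP hierarchy, let

\[ w_j(x, t) := (-1)^j\,\frac{\xi_{(1^j)}(x + t)}{\xi_\emptyset(x + t)} \]

where $x + t = (x + t_1, t_2, \ldots)$, and $S := 1 + w_1(x, t)\partial^{-1} + \cdots$.
Then $\mathcal{L} = S\partial S^{-1}$ satisfies the KP hierarchy,
namely $\partial_{t_n}S = B_nS - S\partial^n$, where $B_n = (S\partial^nS^{-1})_+$
$\Rightarrow$ $[\partial_{t_n} - B_n, \partial_{t_m} - B_m] = 0$ $\Rightarrow$ $\partial_{t_n}\mathcal{L} = [(\mathcal{L}^n)_+, \mathcal{L}]$.


Note The Plücker coordinate $\xi_\emptyset(t) = \sum_{\text{all } Y}\chi_Y(t)\,\xi_Y =: \tau(\xi, t)$
is a generating function for the Plücker
coordinates, $\xi_Y(t) = \chi_Y(\tilde\partial_t)\,\xi_\emptyset(t)$, where

\[ \tilde\partial_t = \left(\frac{\partial}{\partial t_1}, \frac{1}{2}\frac{\partial}{\partial t_2}, \frac{1}{3}\frac{\partial}{\partial t_3}, \ldots\right) \]

Now by reducing to Gr(m, N) and checking that
every $\xi_Y(t)$ satisfies PRs, we have a dynamical
system on Gr.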


Conclusion (Sato). Although any $f(t) \in \mathbb{C}[[t_1, t_2, \ldots]]$
admits a formal expression of the form $\sum_Y c_Y\chi_Y(t)$,
where the coefficients are

\[ c_Y = \chi_Y(\tilde\partial_t)f(t)\big|_{t=0} \]

it represents the $\tau$ function for some $\xi \in \mathrm{Gr}$ $\iff$ its
coefficients satisfy the following PRs:

\[ \sum_{i=0}^{m}(-1)^i\,\chi_{k_0\ldots k_{m-2}l_i}\Big(\tfrac{1}{2}\tilde D_t\Big)\,\chi_{l_0\ldots\hat{l_i}\ldots l_m}\Big(-\tfrac{1}{2}\tilde D_t\Big)\,\tau\cdot\tau = 0 \]

which is the KP hierarchy in Hirota bilinear form.

2. Let


p | 
V i — te Preonst = ajO'" aj € c] 
Ps LX. 


equipped with the induced filtration V by order, 
induced by 


PX) = | NO a. ay € c] 
—æ<k<i 


and define 


Gr = {vector subspaces W of V 
s.t. dim(W N V) = dim V/(W + VU) < oo) 


"same size" as the reference subspace {>} „<o Cver: 
c, € Cc) 2 V9, 
The correspondence between such a W and a 
cyclic submodule of is given as follows: 
I5Wz-s!vo-iyeV:zvcVv9 
WoI-—(Ae€TP:AW c vn 


Generic points of particular interest in construct- 
ing KP solutions make up the “big cell”: 


Gr) c 


open dense 


Grs V= Wo VX 


<=> & #0 and a 7 function can 


be defined as above 


In standard basis of V,e;:— 0- 'mod Px, i € Z, 
the action 


xe; = (i + Wein 
Oe; = ëj- 


gives V a P-module structure. Let A be the shift 
operator: Oe; — e; 4; then 


ty Art) A? 
5) m eh PP ug 
so, this linearizes the flows! 


This survey would not be complete without an
example of the formula that links the $\tau$ and the
theta function; more general statements and groups of
symmetries can be found in Dickey (2003). A solution
of the KP hierarchy can be expressed in terms of the $\tau$
function $\tau_W$ associated with an element W of Gr(H), in
the model Gr(H), where $H = L^2(S^1)$, $H = H_+\oplus H_-$
with standard basis H,-—(1,2,z2^,..), H= 
(z+, z7,...) and p, the projections, rw(g) = det (pg o 
P+ o ga o (paly) ), where g=e™*. The associated 
Baker function v w(g, z) is a function of the form 


with a; € C[[t1, t2,...]] for each 7, such that the map 
z+ vw(g,z) is an element of g!W. If ¢=1+ 
$5! .ajz,then L=@0¢" is a solution of the KP 


hierarchy. 
1 


Moreover, 
TW ( (ta ) ja 


This is the analog of the expression for the Baker 
function in terms of the theta function, when W 
corresponds to an element of the Jacobian of the 
spectral curve [ via the Krichever map 


w(x, P) =exp(x f n- xa) 


(Ux + A(P) — A(D) — A)8(A(D) + 
0(A(P) — ACD) — AW Ux — A(D) — 


g "vw, C) = 


A 
A) 
where $P \in \Gamma$, $A(\cdot)$ is the Abel map, $\Delta$ the Riemann
constant, $U \in \mathbb{C}^g$ a suitable vector, D a generic
divisor of points $P_1, \ldots, P_g \in \Gamma$, $\eta$ a differential of
the second kind, and a a constant depending on the
curve. For the KdV solutions, the condition on $W \in
\mathrm{Gr}$ is that $z^2W \subset W$ and the solution is

\[ u_W(x, t_2, t_3, \ldots) = 2\,\partial^2\log\tau_W(x, t_2, t_3, \ldots) \]


In the Grassmannian formulation, the Hirota 
bilinear operator mentioned in the introductory 
overview makes its third and most general appear- 
ance (we regard Baker's and Hirota's definitions as 
the first two — the one based on a residue formula in 
algebraic geometry, the other on the vanishing of a 
differential form): 


Definitions 


(i) In P, it is possible to conjugate any 
cen tuaka t . into ð by a K=1+ 
i(x)ð ! +---, determined up to elements 
of CIo|— Cp(): K K^CCK-— 
(ii) We define a formal Baker function for £ as the 
element of the module M (the free, ud. 
P- module space of formal expressions f = e" f 
where f = 5N  fj(x)z/, with generator e**) such 
that Ly = zw; so that  — Ke for K as in (i). 
We say that the formal adjoint A! of a (formal 
pseudo) differential operator A — red ux uj(x)O! 
is A! — Y» Cc Oyu;(x), and that the dual 
Baker iq i! to  — Ke"? is the Baker 
function of (L'); the operator which corresponds 
to K in (i) is (Kt) ^, that is, (K') ! £'Ki = —à. 


(iii) Then, the KP hierarchy is equivalent to the
following formula:

\[ \operatorname{Res}_z\,\psi(t', z)\,\psi^\dagger(t, z) = 0 \]


Moreover, as proved in Dickey (2003), if v4 and v» 
are formal power series of the form 
i1 = Ke"? p —Je?*7. for K,J € 1+ P'”), satis- 
fying the condition 


Res, (a ag "T2 op” v) . Ó = 0 


then there exists an operator £ satisfying the Lax 
equations, whose wave function and adjoint wave 
function are 1, U2, respectively. 

To conclude this overview of Lax equations, we 
point out that they can be viewed as zero-curvature 
condition for a (formal) connection (on the trivial 
bundle over the formal deformation space whose 
fiber is P), rephrasing the fact that the time flows 
commute and hence define time deformations; such 
formulation can be found in Mulase (1984). 


Symplectic Reduction and r Matrices 


While the Lax-pair presentation provides natural
spectral invariants, the group/representation-theoretic
nature of integrability (sometimes referred
to as hidden symmetries) is best seen in the context
of Marsden-Weinstein reduction. We perform it in
the example of a generalization of Moser's rank-2
perturbation; we extract the basic construction from
Adams et al. (1988). A more comprehensive treatment
can be found in Babelon et al. (2003).


Definition We let $M_{n,r}$ denote the space of $n\times r$
complex matrices, with $n > r$ and give $M = M_{n,r}\times
M_{n,r}$ the symplectic structure $\omega = \operatorname{tr}(dF\wedge dG^T)$
for $(F, G) \in M$. A rank-r perturbation of a fixed $n\times n$
matrix A is $L = A + FG^T$.


Definition We split the formal loop algebra
$\widetilde{\mathfrak{gl}}(r) = \widetilde{\mathfrak{gl}}(r)^+\oplus\widetilde{\mathfrak{gl}}(r)^-$ where $\widetilde{\mathfrak{gl}}(r)^+$ consists of $r\times r$
matricial polynomials in $\lambda$ and $\widetilde{\mathfrak{gl}}(r)^-$ of strictly
negative formal power series. Under the pairing
$\langle X(\lambda), Y(\lambda)\rangle = \operatorname{tr}(X(\lambda)Y(\lambda))_{-1}$ (where the subscript $-1$
means the coefficient of $\lambda^{-1}$), the dual of $\widetilde{\mathfrak{gl}}(r)^+$ is
identified with $\widetilde{\mathfrak{gl}}(r)^-$, which therefore admits a
Lie-Poisson structure.


In sketch, we consider an action on M whose
moment map lands in $\widetilde{\mathfrak{gl}}(r)^-$; we check that the
AKS flows on $\widetilde{\mathfrak{gl}}(r)^-$ correspond to isospectral
deformations of $L = A + FG^T$ for flows on M;
finally, we perform a Marsden-Weinstein reduction
for an (equivariant) GL(r) action to obtain a
completely integrable system on a symplectic leaf,
whose flows are linear on the Jacobian of the
spectral curve. We recall very briefly the general
definitions.

Moment Map


1. A smooth group action of G on a symplectic
manifold $(M, \omega)$ is said to be Hamiltonian if there
exists a "moment map" $J: M \to \mathfrak{g}^*$ such that the
Hamiltonian vector field associated with the function
$\langle J, \xi\rangle$, for a fixed element $\xi \in \mathfrak{g}$, is the same as
the infinitesimal action associated with $\xi$. However, an
infinitesimal definition is given because in the
formal setup the group of a Lie algebra is often
delicate to define. We recall that:

2. The Lie-Poisson structure of $\mathfrak{g}^*$ is defined by

\[ \{\phi, \psi\}(\mu) = \langle\mu, [d\phi(\mu), d\psi(\mu)]\rangle \quad\text{for } \phi, \psi \in C^\infty(\mathfrak{g}^*),\ \mu \in \mathfrak{g}^* \]

where $d\phi: \mathfrak{g}^* \to \mathfrak{g}^{**}$ (which in our situation will
always be identified with $\mathfrak{g}$) is defined by

\[ \langle\nu, d\phi(\mu)\rangle = \frac{d}{dt}\phi(\mu + t\nu)\Big|_{t=0}, \qquad \mu, \nu \in \mathfrak{g}^* \]

Now we say that $J: M \to \mathfrak{g}^*$ is a moment map if

3. its linear dual $J^*: \mathfrak{g} \to C^\infty(M)$ is a Lie-algebra
homomorphism; or if

4. it is a Poisson map with respect to the Lie-Poisson
structure: $\phi, \psi \in C^\infty(\mathfrak{g}^*) \Rightarrow \{J^*\phi, J^*\psi\}_M = J^*\{\phi, \psi\}_{\mathfrak{g}^*}$.

In case we do have a Hamiltonian G-action, then
the subspace $C^\infty_G(M)$ of G-invariant functions is a Lie
subalgebra of $C^\infty(M)$. If G acts freely and properly on
M, then M/G is a manifold with a Poisson structure
inherited from the one on M through the identification
$C^\infty(M/G) = C^\infty_G(M)$. The symplectic leaves of
M/G have the form $M_\mu = J^{-1}(\mu)/G_\mu = J^{-1}(O_\mu)/G$,
where $\mu \in \mathfrak{g}^*$, $G_\mu$ is the isotropy group of $\mu$ in G and
$O_\mu$ is the G-orbit through $\mu$. The reduced manifold
$M_\mu$ has a natural symplectic structure $\omega_\mu$ such that
$i^*\omega = \pi^*\omega_\mu$, where $i: J^{-1}(\mu) \to M$ is inclusion and
$\pi: J^{-1}(\mu) \to M_\mu$ is the natural projection taking
points to their $G_\mu$-orbits.


This class of examples can be treated with the technique of a
(classical) r-matrix, as follows. Given a linear map
$R: \mathfrak{g} \to \mathfrak{g}$, the alternating bilinear form
$[X, Y]_R = (1/2)([RX, Y] + [X, RY])$ satisfies the Jacobi identity if
certain quadratic conditions on $R$ are satisfied. Assuming they are,
for all pairs of invariant functions $I, J$ on $\mathfrak{g}^*$, we
have $\{I, J\}_R = 0$ (where $\{\,,\}_R$ is the attendant (Lie-Poisson)
structure). Indeed,
$\{I, J\}_R(\mu) = (1/2)\langle [R\,dI(\mu), dJ(\mu)], \mu \rangle + (1/2)\langle [dI(\mu), R\,dJ(\mu)], \mu \rangle$,
but, for example,
$\langle [R\,dI(\mu), dJ(\mu)], \mu \rangle = \langle R\,dI(\mu), \mathrm{ad}^*_{dJ(\mu)}(\mu) \rangle = 0$.


Remark As is clear from the proof above, our definition of invariant
need only be infinitesimal, that is, $f \in I(\mathfrak{g}^*)$ iff
$\langle \mu, [df(\mu), X] \rangle = 0$ $\forall \mu \in \mathfrak{g}^*$,
$X \in \mathfrak{g}$. Of course, when we have a corresponding Lie group
the invariants are the functions which are invariant under the natural
action, such as the symmetric functions of the eigenvalues of a matrix.


AKS Flows 


For a splitting $\mathfrak{g} = K \oplus N$, as given above, with
$\mathfrak{g}^* = N^{\perp} \oplus K^{\perp}$, an example of r-matrix
is given by $R(X) = X_{+} - X_{-}$ (where $+, -$ denote projection to
$K$, $N$); the Jacobi identity is straightforward to check. As a
consequence, invariants on $\mathfrak{g}^*$ are in involution with
respect to $\{\,,\}_R$ and these are called AKS flows, after work done
independently by Adler, Kostant, and Symes (AKS):
$\dot{X} = [df(X)_{+}, X] = -[df(X)_{-}, X]$, given here for the
special case in which we can identify $K$ with $K^*$ and $X$ is the
element in $K^*$ that corresponds to $X \in K$.
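
The two forms of the AKS vector field can be checked on a small example. The sketch below is only an illustration under assumptions that are not in the text: it uses the finite-dimensional algebra gl(3) instead of the loop algebra, identifies the algebra with its dual through the trace pairing, takes K to be the lower-triangular matrices (diagonal included) and N the strictly upper-triangular ones, and uses the invariant f(X) = tr(X^3)/3, whose gradient is X^2. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def project_plus(x):
    """Projection to K (assumed here): lower-triangular part, diagonal included."""
    n = len(x)
    return [[x[i][j] if j <= i else 0 for j in range(n)] for i in range(n)]

def project_minus(x):
    """Projection to N (assumed here): strictly upper-triangular part."""
    n = len(x)
    return [[x[i][j] if j > i else 0 for j in range(n)] for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def aks_vector_fields(x):
    """Return [df(X)_+, X] and [df(X)_-, X] for f(X) = tr(X^3)/3, df(X) = X^2."""
    grad = matmul(x, x)
    return (commutator(project_plus(grad), x),
            commutator(project_minus(grad), x))

def trace_power_derivatives(x, xdot):
    """d/dt tr(X^k) = k tr(X^(k-1) Xdot), for k = 1..n, along the direction xdot."""
    n = len(x)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # X^0
    derivatives = []
    for k in range(1, n + 1):
        derivatives.append(k * trace(matmul(power, xdot)))
        power = matmul(power, x)
    return derivatives

X = [[2, -1, 3], [0, 1, 4], [5, -2, 1]]
plus_field, minus_field = aks_vector_fields(X)
```

Because df(X) commutes with X for an invariant f, the two commutators differ only by a sign, and the traces of powers of X (hence the spectrum) have vanishing derivative along the flow.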

We now proceed to the appropriate moment maps. We generalize the
constant matrix $A$ introduced above (isospectral deformations) by
allowing multiple eigenvalues $\alpha_i$ of multiplicities $n_i < r$,
$n_1 + \cdots + n_k = n$, so that
$\det(A - \lambda I) = \prod_{i=1}^{k} (\alpha_i - \lambda)^{n_i}$. Let
$a(\lambda) = \prod_{i=1}^{k} (\alpha_i - \lambda)$. We split an
$n \times r$ matrix $F$ into $k$ blocks $F_i$ accordingly.


Definition/statement 


(i) $J^{k}(F, G)(X_1, \dots, X_k) = -\sum_{i=1}^{k} \mathrm{tr}(F_i X_i G_i^{T})$
is the moment map of the action
$[(g_1, \dots, g_k)(F, G)]_i = (F_i g_i^{-1}, G_i g_i^{T})$, where
$g_i \in GL(r)$, so that under standard identifications
$J^{k}(F, G) = -(G_1^{T}F_1, \dots, G_k^{T}F_k)$ and, restricting the
action to the diagonal subgroup $\{(g, \dots, g)\}$,
$J_r(F, G) = -G^{T}F$.

(ii) For $X(\lambda) \in \widetilde{gl}(r)^{+}$ we define
$a(X(\lambda)) = (X(\alpha_1), \dots, X(\alpha_k))$ and obtain the exact
sequence

$0 \to a(\lambda)\,\widetilde{gl}(r)^{+} \to \widetilde{gl}(r)^{+} \to gl(r)^{k} \to 0$

By dualizing, and identifying $gl(r)^{k}$ to its dual by using the
trace componentwise, we get


> { 
ÀA 


and finally check that J,=a* o J” is a moment 
map. By combining (i) and (ii), we get a 
moment map 
k T 
JF, G) = 5, 


— SG (A AYP 

uA 
which becomes injective on $\mathcal{M}/H$, where $\mathcal{M}$ is a
suitable open submanifold of $M$ and
$H = GL(n_1) \times \cdots \times GL(n_k)$ acts blockwise by
$(h_i F_i, (h_i^{T})^{-1} G_i)$.

(iii) We also notice that the "Moser space"
$M_A = \{A + FG^{T} \mid (F, G) \in M\}$ of rank-$r$ perturbations can
be identified with the orbit space $M/G_r$, $G_r = GL(r)$ acting as
in (i).


To finish, we turn on the obvious AKS flows on
$\widetilde{gl}(r)^{-}$: the key observation is that they are
isospectral for the rank-$r$ perturbation $A + FG^{T}$: we see that the
Poisson-commutative ring $\mathcal{F}_r$ of projected invariants
defines, by composition with the moment map $J$, a Poisson-commutative
ring $\mathcal{F}$ of isospectral flows on $M_{n,r} \times M_{n,r}$.
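
One way to see why an r x r matrix built from the blocks $G_i^{T}F_i$ can carry the spectral data of the n x n matrix $A + FG^{T}$ is the standard matrix determinant lemma, $\det(A + FG^{T} - \lambda I) = \det(A - \lambda I)\,\det(I_r + G^{T}(A - \lambda I)^{-1}F)$. The sketch below checks this identity in exact arithmetic; it assumes, for simplicity, that $A$ is diagonal, and the function names and sample data are hypothetical, not taken from the text.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def perturbed_charpoly(alphas, F, G, lam):
    """det(A + F G^T - lam I) for A = diag(alphas): an n x n determinant."""
    n, r = len(alphas), len(F[0])
    L = [[sum(F[i][s] * G[j][s] for s in range(r))
          + (alphas[i] - lam if i == j else 0)
          for j in range(n)] for i in range(n)]
    return det(L)

def reduced_charpoly(alphas, F, G, lam):
    """det(A - lam I) * det(I_r + G^T (A - lam I)^{-1} F): an r x r determinant."""
    n, r = len(alphas), len(F[0])
    small = [[(1 if s == t else 0)
              + sum(Fraction(G[k][s] * F[k][t]) / (alphas[k] - lam)
                    for k in range(n))
              for t in range(r)] for s in range(r)]
    prefactor = Fraction(1)
    for a in alphas:
        prefactor *= (a - lam)
    return prefactor * det(small)

alphas = [1, 2, 3, 5]                      # eigenvalues of the diagonal matrix A
F = [[1, 2], [0, 1], [3, -1], [2, 2]]      # n x r with n = 4, r = 2
G = [[2, 0], [1, 1], [-1, 4], [0, 3]]
samples = [0, 7, Fraction(1, 2)]           # values of lambda away from the alphas
```

Evaluating both functions at any value of lambda that is not an eigenvalue of A gives the same number, so the characteristic polynomial of the perturbed matrix is determined by r x r data.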


Hitchin Systems 


The Hitchin system, introduced in the late 1980s, still encompasses,
20 years later, the most general class of "algebraically completely
integrable" systems, which we now discuss. In its most basic form, the
concept of an "algebraically completely integrable" (ACI) Hamiltonian
system is an extra condition on the integrability of classical
mechanics, in the following sense.

A Hamiltonian system with $n$ degrees of freedom, that is, defined on a
symplectic manifold $M$ of (real) dimension $2n$, is (Arnol'd-Liouville)
completely integrable if it admits $n$ functions in involution whose
differentials are linearly independent (possibly, generically on $M$).
When $M$ is a component of the set of real points of an algebraic
variety $M_{\mathbb{C}}$ and the symplectic form $\omega$ and
Hamiltonian function $H$ are rational without poles on $M$, the concept
of algebraic complete integrability can be introduced. For this
condition to hold, we require that the vector fields corresponding to
the Hamiltonians in involution still have no poles on a
compactification of the fibers on $M_{\mathbb{C}}$.


Nonexample (Mumford 1984). Consider

$M = \mathbb{R}^2$, $\omega = dx \wedge dy$, $H = x^4 + y^4$.

Here a compactification of the fiber, the affine curve
$x^4 + y^4 = c$, is the projective curve $X^4 + Y^4 = cZ^4$, which is
smooth (provided $c \neq 0$) and has four points at infinity. The
vector field $X_H$ defined by $H$, $X_H \lrcorner\, \omega = -dH$, is
tangent to the fiber in the affine plane, but has a pole at infinity as
can be checked by a change of coordinates; 4 is the lowest exponent for
which this simple nonexample works!
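
The change of coordinates can be sketched as follows. The chart $u = 1/x$, $v = y/x$ near the line at infinity and the general exponent $d$ are choices made here for illustration; they are not notation from the text.

```latex
% Hamiltonian H = x^d + y^d, with \omega = dx \wedge dy and X_H \lrcorner\, \omega = -dH:
\dot{x} = -\frac{\partial H}{\partial y} = -d\,y^{d-1}, \qquad
\dot{y} = \frac{\partial H}{\partial x} = d\,x^{d-1}.
% Chart at infinity: u = 1/x, v = y/x, in which the fiber reads 1 + v^d = c\,u^d.
\dot{u} = -\frac{\dot{x}}{x^{2}} = d\,\frac{y^{d-1}}{x^{2}} = d\,v^{d-1}\,u^{3-d}.
% At a point at infinity, u = 0 and v^d = -1, so v \neq 0 and u is a local parameter.
% d = 2: \dot{u} = 2vu (vanishes at u = 0); d = 3: \dot{u} = 3v^{2} (regular);
% d = 4: \dot{u} = 4v^{3}/u (simple pole at u = 0).
```

In this chart the field stays regular for exponents 2 and 3 and first acquires a pole at exponent 4, in agreement with the closing remark of the nonexample.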


Note In the algebraically completely integrable situation, the fibers
are abelian varieties or extensions of such by $(\mathbb{C}^*)^k$ for
some power $k$. This gives rise to the issues of variations of periods
over the base mentioned in the introductory overview.

The Neumann system is ACI, with integral tori 
given by the Jacobians of the spectral curves: 


2g4-1 


Psy? =g(d) = [[ (A-4) =UW+ V? 
| 


where 
ibi p? 
3 A — ej 2. A — ej 
L = EtA) e 
qj iDi 
i T 2, A — €; » À— €j 
V U gri 
= x e À = A = ei) 
ee (X) | 
g 
U = [a = Aih (An...,Ag) “elliptical spherical 
i=] 
coordinates” 


U 
AD = (1 y) eigenvector: Ly = pw 


g 
divisor: » Oo V(A;)) 
i=] 

Hitchin (1982) devised a geometrical model of the spectral curve, a
compact algebraic curve contained in the surface $T^*\mathbb{P}^1$, and
its line bundles. He also provided subsequently (1987) a dramatic
generalization.

Hitchin's construction, in the Neumann-system 
example, highlights the following objects: 


eLcH(P'End(E)& O(g--1), E rank (r=)2 
bundle over P!; 

e T —total space of the line bundle O(g + 1) over P! ; 

e j, —tautological section: Pi — T, where L — fil € 
HVIT, End(E) 8 O(g + 1)) (tildes denote pullback); 

e T: det (L — 5I) —0. The line bundle v (eigenvec- 
tors) is defined as the kernel of L — ñI; and 

e the moduli space of spectral curves is a linear 
system on the surface T. Fixing [e1,...,e5,1] in 
the above example gives constraints that define it 
as subsystem of a complete linear system, as well 
as providing a Poisson structure on the whole 
(2g — 1)-4-g)-dimensional manifold (base = 
curves, fiber — Jacobians) which reduces to the 
standard *` dp; ^dq;. Equivalent to choosing a 
section s € H?(P', Olg — 1) & Ky), 


pU S Hs (61... + Beni) 
ir:1 EEGO0pig-1)eL 
p! (A: 1) e p! 


Generalizations 


• $\mathbb{P}^1 \to$ Riemann surface $X$ of genus $g > 1$;

• $E$ stable rank-$r$ vector bundle over $X$. To give a concrete
example, we will take $r = 2$ and fix $\det E = O_X$.

Hitchin's Abelianization Program


Fact (Hitchin). Every such bundle $E$ over $X$ can be realized as the
direct image of a line bundle over a spectral curve $\Gamma \to X$.


We introduce the moduli space $M = SU_X(2, O_X)$ = S-equivalence
classes of $E$'s, $E$ semistable rank-2 bundle over $X$,
$\det E = O_X$. The dimension of $M$ is $3g - 3$.

Hitchin (1987) proved that $T^*M$ is ACI (generically, there exist
$3g - 3$ regular functions in involution with respect to the standard
symplectic structure, with invariant manifolds isomorphic to
Prym $\Gamma$, where $\Gamma$ = spectral curve).

To recognize the analog of the features highlighted above, we recall
that Kodaira-Spencer deformation theory gives the following description
of the cotangent bundle: since a rank-$r$ vector bundle over $X$ is
determined by a 1-cocycle with values in $GL(r, O_X)$, a first-order
deformation of $E$ is given by a 1-cocycle with values in the
associated bundle of Lie algebras, hence by a class in
$H^1(X, \mathrm{End}(E))$, so the cotangent bundle has Serre-dual fiber
$H^0(X, E \otimes E^* \otimes K)$.


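Hitchin map $(E, \phi) \in T^*M$ (Higgs field, trace zero,
$\phi \in H^0(X, \mathrm{End}_0(E) \otimes K)$):


$H: \phi \mapsto \det\phi$ (more generally, for any $r$,
$\mathrm{tr}\,\wedge^i\phi \in H^0(X, K^i)$, $i = 2, \dots, r$)

$\mu \mapsto -\mu$ defines Prym $\Gamma$,
$\mu^2 = \det\phi \in H^0(X, K^{\otimes 2})$ defines $\Gamma$.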


Explicit Hamiltonians for the Hitchin System 


The cases in which X is genus 0 and 1 were solved 
explicitly by Nekrasov (1996) using explicit parame- 
trizations of the moduli spaces; this includes the case of 
insertions (singular curves), yielding (elliptic) Gaudin 
models. We report the solution for the genus-2 case 
(van Geemen and Previato 1996). 


Remark The map $H$ projectivizes,


$H: \mathbb{P}H^0(X, \mathrm{End}_0(E) \otimes K) \to \mathbb{P}H^0(X, K^{\otimes 2})$
$\det(c\phi) = c^2 \det\phi$


Coordinates on $T^*M$ can be given as follows:
$\Theta \subset \mathrm{Pic}^{1} X$ = canonical theta divisor
$\Delta: M \to |2\Theta| = \mathbb{P}^3$
$E \mapsto D_E = \{\xi \in \mathrm{Pic}^{1} X : h^0(E \otimes \xi) > 0\}$


$X$ hyperelliptic $\Rightarrow$ $\Delta$ is 2:1 except for $g = 2$
(every point of $M$ is fixed under the hyperelliptic involution), where
$M \cong \mathbb{P}^3$. For a vector space $V$ the Euler sequence gives


$\mathbb{P}T^*\mathbb{P}V = I = \{(x, h) \in \mathbb{P}V \times \mathbb{P}V^* : x \in h\}$
In our case,
$\mathbb{P}V \times \mathbb{P}V^* = |2\Theta| \times |2\Theta|^*$


Define six polynomial functions $H_i$ on
$\mathbb{P}^3 \times \mathbb{P}^{3*}$ by the requirement: for generic
$q \in \mathbb{P}^3$,
$(H_i = 0) \cap \mathbb{P}T_q^*\mathbb{P}^3 = \ell_i \cup \ell_i'$, the
six pairs of bitangents to $K \cap \mathbb{P}T_q^*\mathbb{P}^3$, where
$K$ is the Kummer surface (the remaining 16 bitangents are cut out by
the tropes).

Recall that the Grassmannian of lines in $\mathbb{P}^3$, Gr(2,4), is
defined by an equation $\sum_{i=1}^{6} X_i^2 = 0$ in Klein's
coordinates


(X4 :...: X) e P? 
X1 = poi + p23, X» = i(po1 — p23) 
X3 = i(po2 — p13), X4 = po + p13 
Xs = pos + pir, X6 = i(pos — p12) 


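where $p_{ij} = Z_iW_j - W_iZ_j$ are Plücker's coordinates on the line


$\langle (Z_0 : \dots : Z_3), (W_0 : \dots : W_3) \rangle \subset \mathbb{P}^3$


Using coordinates on the incidence variety $I$ given by the sections
$\Phi_j$ of the bundle projection
$\mathbb{P}T^*\mathbb{P}^3 \to \mathbb{P}^3$,
$\Phi_j: \mathbb{P}^3 \to \mathbb{P}T^*\mathbb{P}^3 = I \subset \mathbb{P}^3 \times \mathbb{P}^{3*}$,
$q \mapsto (q, \epsilon_j(q)) = (q, X_j(q, -))$, explicitly given, for
$q = (x : y : z : t)$, by


$\epsilon_1 = (y : -x : t : -z)$, $\epsilon_2 = (y : -x : -t : z)$
$\epsilon_3 = (z : t : -x : -y)$, $\epsilon_4 = (z : -t : -x : y)$
$\epsilon_5 = (t : z : -y : -x)$, $\epsilon_6 = (t : -z : y : -x)$


$x_j = X_j(\langle \epsilon_i(q), p \rangle)$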


Fact For a point q € P^,p € PT;P^,p ¢ e(q), the 
ith Klein coordinate of the line (ej(q),p) is zero and 


2 
j 


p € UG e Hi(p.q) = » >> 
jfi — d 


=0 


with x; = X;((ei(q), p)). 


Conclusion In an affine patch C? x C* > (q,p)— 
((x:y:2z:1),(u:v:w: —(xu+ yv + zw))) 


X;(e(q), p)" 


H?(p,q) = M. Sea A 


IŻ! 
give six Hitchin Hamiltonians, any three of which are generically
independent. The $H_i$ have degree $\le 4$ in $x, y, z$ and are
homogeneous of degree 2 in $u, v, w$; they Poisson-commute with respect
to $dx \wedge du + dy \wedge dv + dz \wedge dw$.


An example is constituted by 

p = (X? -1)0? - 4)0? — 9) 

((x:y:z:1),(u:v:w:-—(xu-r yv zw))) 
e A? x A” 


Example 


Hi —uv(—70xy — 32x?y — 18xy? — 10z — 32x?z 
-18y?z) + v?(—9 — 30y? — 16x^y? — 95* — 32xy? 
— 162?) +u’ (—-16 — 40x? — 16x* — 9x^y? + 18xyz 
— 922) + vw(—18x + 10xy? + 10yz — 32x yz 
— 18y?z — 32xz^) + uw(32y+ 10x^y — 10xz 
— 32x?z — 18xy"z + 18yz^) + w^ (-9x? — 16y" 
+ 10xyz — 16x*z* — 9y^ z^) 


The concepts of reduction and r-matrix have been
generalized to Hitchin systems. Notably, Hitchin later
showed that the Hamiltonians of the system appear as 
symbols of a heat operator that corresponds to a 
projectively flat connection, the quantization of the 
moduli space of bundles, obtained by changing the 
complex structure of the Riemann surface X. 


Other Aspects 
Special Functions 


Special functions have also been traditionally signifi- 
cant in both algebraic geometry and integrable 
systems. Within the examples presented, elliptic 
functions gave rise to surprisingly sophisticated theories. The
1-wave solution encountered in the introduction,
$u = 2\wp + \mathrm{const.}$, in the limit when one or both
periods of the Weierstrass function go to zero,
becomes exponential or rational, respectively. The
higher-genus analogs give rise to solitons, or rational 
solutions. On the other hand, the KP solutions which 
are doubly periodic in the x variable (“elliptic 
solitons”) were classified by Krichever (cf. Dubrovin 
et al. (2001)), as forming an ACI Hamiltonian system 
(“elliptic Calogero—Moser”), which, 25 years later, is 
still generating important work, with Hamiltonian 


H-Y p *5 elai- a) 
il izj 
(where $\wp$ is the Weierstrass function of a lattice $L$ with
associated elliptic curve $X = \mathbb{C}/L$, $q \in X$ the origin) and
$u = 2\sum_{i=1}^{n} \wp(x - x_i(t_2, t_3, \dots))$ is a solution of
the KP hierarchy for suitable time flows $t_j$ of the system
($t_1 = x$) and KP Baker function


W(x; a) = cio — x) eS (0x) 


c(a) — a(x) 


The associated spectral curves have been classified in 
moduli by Treibich and Verdier (cf. Treibich 
(2001)); Krichever produced a two-field model as 
well as a universal Poisson structure for the system;
Donagi and Markman (1996) realized it as a 
generalized Hitchin system. 

More classically, elliptic potentials were the subject 
of much study, in particular by Lamé and Hermite in 
the nineteenth century and Ince in the twentieth; a 
sample result due to Ince makes one feel like Alice in 
Wonderland, who “knelt down and looked along the 
passage into the loveliest garden you ever saw”: the 
Lamé operator $L = -\partial^2 + a(a+1)\wp(x - x_0)$ with
real, smooth potential is finite gap (namely, almost
all the periodic eigenvalues are double) iff $a \in \mathbb{Z}$ (if $a$
is positive the number of gaps is $a$). A generalization
to several variables (due to Chalykh and Veselov), 


$L = -\Delta + \sum_{\alpha \in R_+} g_\alpha\, \wp((\alpha, x))$

where $R_+$ is the set of positive roots for a simple complex
Lie algebra of rank $n$, $(-, -)$ is some scalar product in
$\mathbb{R}^n$, invariant under the action of the Weyl group, and
$g_\alpha = m_\alpha(m_\alpha + 1)(\alpha, \alpha)$ for some
$m_\alpha \in \mathbb{Z}$, provides one
of the few known examples of quantum completely 
integrable rings of differential operators in several 
variables. Roughly speaking, this means that the 
centralizer of L contains n operators with functionally 
independent symbols, where n is the number of variables. 
What is more, Chalykh et al. (2003) combine 
differential Galois theory and elliptic function 
theory to characterize (under some mild assump- 
tions) the generalized Lamé operators that are 
algebraically completely integrable: the differential 

Galois group of the solutions is abelian. 


Duality, Fourier-Mukai Transform, and Bispectrality 


Duality is a concept imported from mathematical 
physics; as a mathematical phenomenon, it has not 
reached theoretical maturity. First observed in exam- 
ples, as in Fock et al. (2000), where different definitions 
of dual ACI Hamiltonian systems were given (action- 
angle, action—action, and quantum), it resurfaced for 
the Hitchin system, in more than one guise, whether it 
be an interchange of position and momentum variables 
(Gawedzki and Tran-Ngoc-Bich 1998) or a duality 
between the Lagrangian tori that fiber two such 
systems, coming from a Fourier-Mukai transform, 
namely a twist by the (universal) Picard line bundle: 


p 
l 
Jac(X) x (H*(X, K) = T*Jac(X)) 

Notably, the Picard bundle was used by Nakayashiki 
to give a spectacular generalization of the Burchnall- 
Chaundy result for a genus-2 curve X (more generally, 
Jac(X) is replaced by a generic abelian variety in the 
statement): the coordinate ring of $\mathrm{Jac}(X) - \Theta$ is the
common spectrum of a ring of commuting ($g! \times g!$)
matrix partial differential operators in $g$ variables. The
Fourier transform allowed him to extend Sato's correspondence
$\partial^{-1} \leftrightarrow z$ and give $F$ a unique (free, rank-$g!$)
$D_{\mathrm{Jac}(X)}$-module structure, where $F$ is a suitable coherent
sheaf over Jac(X) generalizing the Baker function. 

In this model, the interchange of the $x$ and $z$ variables is known
as bispectrality (cf. Grünbaum (2001)); a somewhat narrower question is
a characterization of the differential operators $L$ in $x$ for which
there exists a differential operator $B$ in $k$ and a common
eigenfunction:


$L\psi(x, k) = f(k)\psi(x, k)$
$B\psi(x, k) = \theta(x)\psi(x, k)$


for some functions $f, \theta$, typically polynomial. This
question proved to be related with the KP hierarchy 
and isomonodromy deformations. When to a hier- 
archy there is associated an ACI Hamiltonian system 
(as in the Neumann case shown above), bispectrality 
may produce a dual system, in a sense related to the 
ones discussed, but somewhat mysteriously so. 
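
A standard illustration of the pair of eigenvalue equations above is the simplest rational case. The operators $L$, $B$ and the eigenfunction $\psi$ below are an assumed textbook-style example chosen for this sketch; they do not appear in the text.

```latex
\psi(x,k) = e^{kx}\left(1 - \frac{1}{kx}\right), \qquad
L = \partial_x^2 - \frac{2}{x^2}, \qquad
B = \partial_k^2 - \frac{2}{k^2}.
% Direct differentiation in x:
\partial_x^2 \psi = e^{kx}\left(k^2 - \frac{k}{x} + \frac{2}{x^2} - \frac{2}{kx^3}\right),
\qquad
\frac{2}{x^2}\,\psi = e^{kx}\left(\frac{2}{x^2} - \frac{2}{kx^3}\right),
% hence
L\psi = k^2\,\psi, \qquad B\psi = x^2\,\psi,
% where the second identity follows from the symmetry \psi(x,k) = \psi(k,x).
```

Here $f(k) = k^2$ and $\theta(x) = x^2$ are both polynomial, as in the typical situation described above.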


Conclusion 


Many important mathematical topics and individual 
contributions regrettably have to go unmentioned in 
an article of this length. The aim was to illustrate 
by simplest examples the geometric nature of 
integrable systems and equations, in the areas of 
spectral curves, moduli of vector bundles over them, 
Grassmann manifolds, special functions, Poisson 
geometry, representation theory, as well as mention 
constructions that are not yet complete, such as 
spectral varieties of higher dimension, dualities 
sweeping vaster moduli spaces, and quantization. 


See also: Billiards in bounded convex domains; 
$\bar\partial$-Approach to Integrable Systems; Functional Equations
and Integrable Systems; Integrable Systems and 
Discrete Geometry; Integrable Systems and Recursion 
Operators on Symplectic and Jacobi Manifolds; 
Integrable Systems and the Inverse Scattering Method; 
Integrable Systems in Random Matrix Theory; Integrable 
Systems: Overview; Multi-Hamiltonian Systems; 
Recursion Operators in Classical Mechanics; Riemann— 
Hilbert Methods in Integrable Systems; Solitons and Kac- 
Moody Lie Algebras. 

Introduction


Although the main subject of this article is the 
connection between integrable discrete systems and 
geometry, we feel obliged to begin with the 
differential part of the relation. 


Classical Differential Geometry 
and Integrable Systems 


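The oldest (1840) integrable nonlinear partial differential equation
recorded in literature is the Lamé system


$\frac{\partial^2 H_i}{\partial u_j \partial u_k} - \frac{1}{H_j}\frac{\partial H_j}{\partial u_k}\frac{\partial H_i}{\partial u_j} - \frac{1}{H_k}\frac{\partial H_k}{\partial u_j}\frac{\partial H_i}{\partial u_k} = 0$,
$i, j, k$ distinct   [1]


$\frac{\partial}{\partial u_k}\left(\frac{1}{H_k}\frac{\partial H_j}{\partial u_k}\right) + \frac{\partial}{\partial u_j}\left(\frac{1}{H_j}\frac{\partial H_k}{\partial u_j}\right) + \frac{1}{H_i^2}\frac{\partial H_j}{\partial u_i}\frac{\partial H_k}{\partial u_i} = 0$   [2]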

describing orthogonal coordinates in the three-dimensional Euclidean
space $\mathbb{E}^3$ (indices $i, j, k$ range from 1 to 3). Already in
1869, it was found by Ribaucour that the nonlinear Lamé system
possesses a discrete symmetry enabling one to construct, in a linear
way, new solutions of the system from the old ones. He also gave a
geometric interpretation of this symmetry in terms of certain spheres
tangent to the coordinate surfaces of the triply orthogonal system. In
1918, Bianchi showed that the result of superposition of the Ribaucour
transformations is, in a certain sense, independent of the order of
their composition.
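
As a quick consistency check of the Lamé system, one can insert the Lamé coefficients of spherical coordinates. The labelling $u_1 = r$, $u_2 = \theta$, $u_3 = \varphi$ and the explicit form in which the two parts of the system are written in the comments below are assumptions made for this sketch.

```latex
% Spherical coordinates: ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\varphi^2
H_1 = 1, \qquad H_2 = u_1, \qquad H_3 = u_1 \sin u_2 .
% First part, read as
% \partial_j \partial_k H_i = H_j^{-1} (\partial_k H_j)(\partial_j H_i)
%                           + H_k^{-1} (\partial_j H_k)(\partial_k H_i),
% with (i,j,k) = (3,1,2):
\frac{\partial^2 H_3}{\partial u_1 \partial u_2} = \cos u_2, \qquad
\frac{1}{H_1}\frac{\partial H_1}{\partial u_2}\frac{\partial H_3}{\partial u_1}
+ \frac{1}{H_2}\frac{\partial H_2}{\partial u_1}\frac{\partial H_3}{\partial u_2}
= 0 + \frac{1}{u_1}\cdot 1 \cdot u_1 \cos u_2 = \cos u_2 .
% Second part, with (i,j,k) = (1,2,3):
\frac{\partial}{\partial u_3}\!\left(\frac{1}{H_3}\frac{\partial H_2}{\partial u_3}\right)
+ \frac{\partial}{\partial u_2}\!\left(\frac{1}{H_2}\frac{\partial H_3}{\partial u_2}\right)
+ \frac{1}{H_1^{2}}\frac{\partial H_2}{\partial u_1}\frac{\partial H_3}{\partial u_1}
= 0 - \sin u_2 + \sin u_2 = 0 .
```

The remaining index choices can be checked in the same way; each reduces to a one-line computation because most derivatives of $H_1$, $H_2$, $H_3$ vanish.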

Such properties of a nonlinear equation are 
hallmarks of its integrability, and indeed, the Lamé 
system was solved using soliton techniques in 
1997-98. The above example illustrates the close 
connection between the modern theory of integrable 
partial differential equations and the differential 
geometry of the turn of the nineteenth and twentieth 
centuries. A remarkable property of certain parametrized submanifolds
(and then of the corresponding equations) studied at that time is that
they allow for transformations which exhibit the so-called
"Bianchi permutability property." Such transformations, called,
depending on the context, the Darboux, Calapso, Christoffel, Bianchi,
Bäcklund, Laplace, Koenigs, Moutard, Combescure, Lévy, Goursat,
Ribaucour, or the fundamental transformation of Jonas, can be
geometrically described in terms of certain families of lines called
line congruences.

In the connection between integrable systems and 
differential geometry, a distinguished role is played 
by the multidimensional conjugate nets, described by 
the Darboux system, which is just the first part [1] of
the Lamé system with indices ranging from 1 to N >
3. On the level of integrable systems, this dominant
role has the following explanation: the Darboux
system, together with equations describing isoconjugate
deformations of the net, forms the multicomponent
Kadomtsev-Petviashvili (KP) hierarchy, which
is viewed as a master system of equations in soliton 
theory. In fact, in appropriate variables, the whole 
multicomponent KP hierarchy can be rewritten as an 
infinite system of the Darboux equations. 


Transition to the Discrete Domain 


The recent progress in studying discrete integrable 
systems showed that, in many respects, they should be 
considered as more fundamental than their differential 
counterparts. Consequently, the natural problem of 
extending the geometric interpretation of integrable 
partial differential equations to the discrete domain 
arose, leading not only to the transition to the discrete 
domain of many results on the connection between the 
differential geometry and integrable systems, but also — 
and this seems to be even more important — to the 
description of integrability in a very elementary and 
purely geometric way. 

At the level of integrable equations, the transition 
"from differential to discrete" often makes formulas 
more complicated and longer. On the contrary, at the 
geometric level, in such a transition the properties of 
discrete submanifolds, relevant to their integrability, 
become simpler and more transparent. Indeed, the
mathematics necessary to understand the basic ideas of 
the integrable discrete geometry does not exceed the 
"ruler and compass constructions," and many proofs 
can be performed using elementary incidence geometry. 

We will concentrate our attention on the multi- 
dimensional lattice made from planar quadrilaterals, 
which is the discrete analog of a conjugate net. Together 
with the discussion of its properties, which are the core 
of the geometric integrability, we briefly present the 
analytic methods of construction of these lattices and 
we also describe some basic multidimensional integr- 
able reductions of them. Then we discuss integrable 
discrete surfaces; some of them have been found in the 
early period of the “case-by-case” studies. We shall 
however try to present them, from a unifying perspective,
as reductions of the quadrilateral lattice (QL).


Multidimensional Integrable Lattices
The Quadrilateral Lattice


An N-dimensional lattice $x: \mathbb{Z}^N \to \mathbb{R}^M$ is a
lattice made from planar quadrilaterals, or a quadrilateral lattice
(QL) in short, if its elementary quadrilaterals
$\{x, T_i x, T_j x, T_i T_j x\}$ are planar; that is, iff the following
system of discrete Laplace equations is satisfied:


$\Delta_i \Delta_j x = (T_i A_{ij})\Delta_i x + (T_j A_{ji})\Delta_j x$,
$i \neq j$, $i, j = 1, \dots, N$   [3]


where $A_{ij}: \mathbb{Z}^N \to \mathbb{R}$ are functions of the
discrete variable; here $T_i$ is the translation operator in the $i$th
direction, and $\Delta_i = T_i - 1$ is the corresponding difference
operator. For simplicity, we work here in the affine setting neglecting
projective geometric aspects of the theory.
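
For N = 2 the Laplace equation can be solved for the fourth vertex of each quadrilateral, which gives a direct way to build a quadrilateral surface from two boundary curves. The sketch below assumes the discrete Laplace equation is read as $\Delta_1\Delta_2 x = (T_1A_{12})\Delta_1 x + (T_2A_{21})\Delta_2 x$; the function names, the sample curves, and the sample coefficient functions are hypothetical.

```python
from fractions import Fraction

def sub(p, q):
    """Componentwise difference p - q of two points in R^3."""
    return tuple(a - b for a, b in zip(p, q))

def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def is_planar(p, q, r, s):
    """Four points of R^3 are coplanar iff the three edge vectors from p are dependent."""
    return det3(sub(q, p), sub(r, p), sub(s, p)) == 0

def build_quadrilateral_surface(curve1, curve2, a12, a21):
    """Fill in x(a, b) from x(a, 0) = curve1[a] and x(0, b) = curve2[b].

    Solving the Laplace equation for the new vertex gives
    T1 T2 x = x + (1 + T1 A12) (T1 x - x) + (1 + T2 A21) (T2 x - x).
    """
    assert curve1[0] == curve2[0], "the two curves must share their first vertex"
    x = {(a, 0): p for a, p in enumerate(curve1)}
    x.update({(0, b): p for b, p in enumerate(curve2)})
    for a in range(len(curve1) - 1):
        for b in range(len(curve2) - 1):
            p, p1, p2 = x[(a, b)], x[(a + 1, b)], x[(a, b + 1)]
            c1 = 1 + a12(a + 1, b)      # 1 + T1 A12 evaluated at (a, b)
            c2 = 1 + a21(a, b + 1)      # 1 + T2 A21 evaluated at (a, b)
            x[(a + 1, b + 1)] = tuple(
                p[m] + c1 * (p1[m] - p[m]) + c2 * (p2[m] - p[m]) for m in range(3))
    return x

curve1 = [(0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 1, 1)]
curve2 = [(0, 0, 0), (0, 1, 0), (1, 2, 1), (1, 3, 3)]
surface = build_quadrilateral_surface(
    curve1, curve2,
    a12=lambda a, b: Fraction(a - b, 2 + a + b),
    a21=lambda a, b: Fraction(1, 1 + a + b))
```

Every new vertex is an affine combination of the three known vertices of its quadrilateral, so planarity holds by construction and can be confirmed with `is_planar` on each cell.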


The geometric integrability scheme In the case $N = 2$ the definition
[3] allows one to uniquely construct, given two discrete curves
intersecting in a common vertex and two functions
$A_{12}, A_{21}: \mathbb{Z}^2 \to \mathbb{R}$, a quadrilateral surface.
For $N > 2$ the planarity constraints [3] are instead compatible if and
only if the geometric data $A_{ij}$ satisfy the nonlinear system


$\Delta_k A_{ij} + (T_k A_{ij})A_{ik} = (T_j A_{jk})A_{ij} + (T_k A_{kj})A_{ik}$,
$i, j, k$ distinct   [4]

This constraint has a very simple interpretation: in building the
elementary cube (see Figure 1), the seven points
$x, T_i x, T_j x, T_k x, T_iT_j x, T_iT_k x$, and $T_jT_k x$
($i, j, k$ are distinct) determine the eighth point $T_iT_jT_k x$ as
the unique intersection of three planes in the three-dimensional space.
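
The construction of the eighth vertex can be sketched in a few lines. The code below is an illustration only: it works in $\mathbb{R}^3$ with exact rational arithmetic, the three planes are those through $\{T_i x, T_iT_j x, T_iT_k x\}$ for the three choices of $i$, and the function names and sample points are hypothetical.

```python
from fractions import Fraction

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with rows u, v, w."""
    return dot(u, cross(v, w))

def is_planar(p, q, r, s):
    return det3(sub(q, p), sub(r, p), sub(s, p)) == 0

def plane_through(a, b, c):
    """Plane through three points, returned as (normal n, offset d) with n.p = d."""
    n = cross(sub(b, a), sub(c, a))
    return n, dot(n, a)

def intersect_three_planes(planes):
    """Unique common point of three planes, by Cramer's rule."""
    normals = [n for n, _ in planes]
    offsets = [d for _, d in planes]
    denom = det3(*normals)
    if denom == 0:
        raise ValueError("the three planes are not in general position")
    point = []
    for col in range(3):
        rows = [tuple(offsets[r] if c == col else normals[r][c] for c in range(3))
                for r in range(3)]
        point.append(Fraction(det3(*rows), denom))
    return tuple(point)

def eighth_point(x1, x2, x3, x12, x13, x23):
    """T1 T2 T3 x as the intersection of the three planes of the far faces."""
    return intersect_three_planes([plane_through(x1, x12, x13),
                                   plane_through(x2, x12, x23),
                                   plane_through(x3, x13, x23)])

# Seven vertices of an elementary cube; the three faces through x are planar.
x = (0, 0, 0)
x1, x2, x3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
x12, x13, x23 = (2, 3, 0), (3, 0, 2), (0, 2, 5)
x123 = eighth_point(x1, x2, x3, x12, x13, x23)
```

With the eighth vertex chosen this way, all six faces of the cube are planar quadrilaterals, which is the local statement behind the compatibility of the planarity constraints.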

The connection of this elementary geometric point 
of view with the classical theory of integrable 
systems is transparent: the planarity constraint 
corresponds to the set of linear spectral problems 
[3] and the resulting QL is characterized by the 
nonlinear equations [4], arising as the compatibility 
conditions for such spectral problems. Since the QL 
equations [4] are a master system in the theory of 
integrable equations, planarity can be viewed as the 
elementary geometric root of integrability. The idea
that integrability be associated with the consistency
of a geometric (and/or algebraic) property when
increasing the dimensionality of the system is
recurrent in the theory of integrable systems.


Figure 1 The geometric integrability scheme.


Other forms of the Darboux system The symmetry of the RHS of eqns [4]
in the indices $j$ and $k$ implies the existence of the potentials
$H_i: \mathbb{Z}^N \to \mathbb{R}$ (the Lamé coefficients) such that

$A_{ij} = \frac{\Delta_j H_i}{H_i}$, $i \neq j$   [5]


and then eqns [4] take the form


$\Delta_k \Delta_j H_i - \left(T_j \frac{\Delta_k H_j}{H_j}\right)\Delta_j H_i - \left(T_k \frac{\Delta_j H_k}{H_k}\right)\Delta_k H_i = 0$,
$i, j, k$ distinct   [6]


which is the discrete version of the first part [1] of
the Lamé system.

The Lamé coefficients allow one to define the suitably normalized
tangent vectors $X_i: \mathbb{Z}^N \to \mathbb{R}^M$ by equations


$\Delta_i x = (T_i H_i) X_i$   [7]

and the functions $Q_{ij}: \mathbb{Z}^N \to \mathbb{R}$, $i \neq j$
(the rotation coefficients), by equations

$\Delta_i H_j = (T_i H_i) Q_{ij}$, $i \neq j$   [8]

Then eqns [3] and [6] can be rewritten in the first-order form

$\Delta_j X_i = (T_j Q_{ij}) X_j$, $i \neq j$   [9]

$\Delta_k Q_{ij} = (T_k Q_{ik}) Q_{kj}$, $i, j, k$ distinct   [10]


The discrete Darboux system [10] implies the existence of other
potentials $\rho_i$ defined by the compatible equations

$\frac{T_j \rho_i}{\rho_i} = 1 - (T_i Q_{ji})(T_j Q_{ij})$, $i \neq j$   [11]


The $i \leftrightarrow j$ symmetry of the RHS of eqns [11] implies the
existence of yet another potential $\tau: \mathbb{Z}^N \to \mathbb{R}$,


$\rho_i = \frac{T_i \tau}{\tau}$   [12]


which is called the $\tau$-function of the QL. In terms of the
$\tau$-function, and of the functions


$\tau_{ij} = \tau Q_{ij}$, $i \neq j$   [13]

whose geometric interpretation will be given in a later section, the
discrete Darboux equations take the following Hirota-type form:

$(T_i T_j \tau)\tau = (T_i \tau) T_j \tau - (T_i \tau_{ji}) T_j \tau_{ij}$, $i \neq j$   [14]


$(T_k \tau_{ij})\tau = (T_k \tau)\tau_{ij} + (T_k \tau_{ik})\tau_{kj}$, $i, j, k$ distinct   [15]


Analytic Methods 


We will show how one can construct large classes of solutions of the
discrete Darboux equations and the corresponding QLs using two basic
analytical methods of the soliton theory: the $\bar\partial$-dressing
method and the algebro-geometric techniques.


The $\bar\partial$-dressing method Consider the nonlocal
$\bar\partial$-problem


$\bar\partial \chi(z) = \bar\partial \nu(z) + (R\chi)(z)$,
$\lim_{|z| \to \infty} (\chi(z) - \nu(z)) = 0$   [16]


where $\bar\partial = \partial/\partial\bar{z}$, $R$ is the integral
operator

$(R\chi)(z) = \int_{\mathbb{C}} R(z, z')\chi(z')\, dz' \wedge d\bar{z}'$


and $\nu(z)$ is a given rational function of $z$.

Let $Q_i^{\pm} \in \mathbb{C}$, $i = 1, \dots, N$, be pairs of distinct
points of the complex plane, which define the dependence of the kernel
$R$ on the discrete variable $n \in \mathbb{Z}^N$:


N fi —N Hi 
x Rote.) EE(S— 


=I i 


We consider only kernels $R_0(z, z')$ such that the nonlocal
$\bar\partial$-problem is uniquely solvable. If $\chi(z; n)$ is the
unique solution with the canonical normalization $\nu = 1$, then the
function


N u —N 7 
w(z;n) = x(z;n) ME - = 


i3 1 


satisfies the system of the Laplace equations [3] with 
the Lamé coefficients given by 


run) = ig (Gor) em) 


By construction, the system of such Laplace equations is compatible,
therefore the Lamé coefficients satisfy eqns [6]. To various
z-independent measures $d\mu_a$ on $\mathbb{C}$ there correspond
coordinates


$x^a(n) = \int_{\mathbb{C}} \psi(z; n)\, d\mu_a(z)$


of a QL $x$, having $H_i(n)$ as the Lamé coefficients. To have real
lattices, the kernel $R_0$, the points $Q_i^{\pm}$, and the measures
$d\mu_a$ should satisfy certain additional conditions.

One can find a similar interpretation of the normalized tangent vectors
$X_i$ and of the rotation coefficients $Q_{ij}$. If $\chi_i(z; n)$ are
the unique solutions of the nonlocal $\bar\partial$-problem [16] with
the normalizations


-— Q^ —Q Q? — O, 
a) = le. ) IL (S = 2-8) 


then the functions ~;(z;), defined by 


satisfy the direct analog of the linear problem [9],
$\Delta_j \psi_i(z; n) = (T_j Q_{ij}(n))\psi_j(z; n)$, $i \neq j$   [17]


= tim | [797 us 
Q;(n) = lim, (C - Si) viGs n) 


Again, by construction, eqns [17] are compatible
and the functions $Q_{ij}(n)$ satisfy the discrete Darboux
equations [10]. The functions


where 


X* (n) =| vi(z; n) dus(z) 


are coordinates of the normalized tangent vectors X; 
of the QL x constructed above. 


The algebro-geometric techniques Given a compact Riemann surface $R$ of
genus $g$, consider a nonspecial divisor
$D = \sum_{\alpha=1}^{g} P_\alpha$. Choose $N$ pairs of points
$Q_i^{\pm} \in R$ and the normalization point $Q_\infty$. Given
$n \in \mathbb{Z}^N$, there exists a unique Baker-Akhiezer function
$\psi(n)$, defined as a meromorphic function on $R$, with the following
analytical properties: (1) as a function of
$P \in R \setminus \bigcup_{i=1}^{N} Q_i^{\pm}$, $\psi(n)$ may have as
singularities only simple poles in the points of the divisor $D$;
(2) in the points $Q_i^{\pm}$ the function $\psi(n)$ has poles of the
order $\pm n_i$; and (3) in the point $Q_\infty$ the function $\psi(n)$
is normalized to 1. When $z_i^{\pm}(P)$ is a local coordinate on $R$
centered at $Q_i^{\pm}$, then condition (2) implies that the function
$\psi(n)$ in a neighborhood of the point $Q_i^{\pm}$ is of the form

(P; n) = (z? ey" (Soe sll «ey [18] 


The Baker-Akhiezer function, as a function of the
discrete variable $n \in \mathbb{Z}^N$, satisfies the system of
Laplace equations [3] with the Lamé coefficients
H;(n) = e, , Ut). 

Again, by construction, the Lamé coefficients satisfy eqns [6]. To
various z-independent measures $d\mu_a$ on $R$ there correspond
coordinates


$x^a(n) = \int_{R} \psi(P; n)\, d\mu_a(P)$


of a QL $x$.

We present the expression of the Baker-Akhiezer function and of the
$\tau$-function of the QL in terms of the Riemann theta functions. Let
us choose on $R$ the canonical basis of cycles
$\{a_1, \dots, a_g, b_1, \dots, b_g\}$ and the dual basis
$\{\omega_1, \dots, \omega_g\}$ of holomorphic differentials on $R$,
that is, $\oint_{a_j} \omega_k = \delta_{jk}$. Then the matrix $B$ of
b-periods defined as $B_{jk} = \oint_{b_j} \omega_k$ is symmetric and
has positive-definite imaginary part. Denote by $\omega_{PQ}$ the
unique differential holomorphic in $R \setminus \{P, Q\}$ with poles of
the first order in $P$, $Q$ and residues, correspondingly, 1 and $-1$,
which is normalized by conditions $\oint_{a_j} \omega_{PQ} = 0$. The
Riemann theta function $\theta(z; B)$, $z \in \mathbb{C}^g$, is defined
by its Fourier expansion


$\theta(z; B) = \sum_{m \in \mathbb{Z}^g} \exp\{\pi i \langle m, Bm \rangle + 2\pi i \langle m, z \rangle\}$


where $\langle \cdot, \cdot \rangle$ denotes the standard bilinear form
in $\mathbb{C}^g$. Finally, the Abel map $A$ is given by
$A(P) = (\int_{P_0}^{P} \omega_1, \dots, \int_{P_0}^{P} \omega_g)$,
where $P_0 \in R$, and the Riemann constants vector $K$ is given by


$K_j = \frac{1 + B_{jj}}{2} - \sum_{k \neq j} \oint_{a_k} \omega_k(P) A_j(P)$,
$j = 1, \dots, g$
The explicit form of the vacuum Baker-Akhiezer 


function wy can be written down with the help of the 
theta functions as follows: 


&(A(Q..) + Xa m (A(Q;) - A(O 
6(A(Q,.) + Z) N : 
x baited “ls E — 
where Z — — A(P;) = 
Ea 


Denote by P and s; the constants in the 
decomposition of the abelian integrals near the 


point O7 


Y(n, P) = 


e+ | V 
— 
Sa 
N 
Pe 


j e Am + + + 
, "o; = Fór logz (P) + rj, + O (x (P)) 


J = 
wow, = 
Po i $ 


Then the expression of the 7-function of the QL within 
the subclass of algebro-geometric solutions reads 


—64644 log z (P) + sj; + o(z()) 


T(n) 
cp Qt n, (A 
x IT Xa Iz 


k.j=1 


— A(Q;)) + A(Qx) + z) 
where


rf 
w-ew| F j = Kip 


_ 1 €(A(Qj) +Z) —" 
Hk — Au €(A(Q;) +Z) EXP (Sex — Skp) 


Finally, we remark that the geometric integrability 
scheme and the algebro-geometric methods work 
also in the finite fields setting, giving solutions of the 
corresponding integrable cellular automata. 


The Darboux-Type Transformations 


We present the basic ideas and results of the theory 
of the Darboux-type transformations of the multi- 
dimensional QL. 


Line congruences and the fundamental transformation
To define the transformations we first need to define
N-dimensional line congruences (or, simply, congruences), which are
families of lines in $\mathbb{R}^M$ labeled by points of
$\mathbb{Z}^N$ with the property that any two neighboring lines $\ell$
and $T_i \ell$, $i = 1, \dots, N$, are coplanar and therefore
(eventually in the projective extension $\mathbb{P}^M$ of
$\mathbb{R}^M$) intersect.

The QL $\mathcal{F}(x)$ is a fundamental transform of the QL
$x$ if the lines connecting the corresponding points of
the lattices form a congruence. The superposition of a
number of fundamental transformations can be
compactly formulated in the vectorial fundamental
transformation. The data of the vectorial fundamental
transformation are: (1) the solution $Y_i : \mathbb{Z}^N \to V$, $V$
being a linear space, of the linear system [9]; (2) the
solution $Y_i^* : \mathbb{Z}^N \to V^*$, $V^*$ being the dual of $V$, of the
linear system [8]. These allow one to construct the linear
operator-valued potential $\Omega(Y, Y^*) : \mathbb{Z}^N \to L(V)$,
defined by the following analog of eqn [7]:


$$\Delta_i \Omega(Y, Y^*) = Y_i \otimes (T_i Y_i^*), \quad i = 1, \ldots, N$$ [19]


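Similarly, one defines $\Omega(X, Y^*) : \mathbb{Z}^N \to L(V, \mathbb{R}^M)$ and
$\Omega(Y, H) : \mathbb{Z}^N \to V$. The transforms of the lattice $x$
and other related functions are given by

$$\mathcal{F}(x) = x - \Omega(X, Y^*)\, \Omega(Y, Y^*)^{-1}\, \Omega(Y, H)$$
$$\mathcal{F}(H_i) = H_i - Y_i^*\, \Omega(Y, Y^*)^{-1}\, \Omega(Y, H), \quad i = 1, \ldots, N$$
$$\mathcal{F}(X_i) = X_i - \Omega(X, Y^*)\, \Omega(Y, Y^*)^{-1}\, Y_i, \quad i = 1, \ldots, N$$
$$\mathcal{F}(Q_{ij}) = Q_{ij} - Y_j^*\, \Omega(Y, Y^*)^{-1}\, Y_i, \quad i, j = 1, \ldots, N, \; i \neq j$$
$$\mathcal{F}(\rho_i) = \rho_i \left(1 + (T_i Y_i^*)\, \Omega(Y, Y^*)^{-1}\, Y_i\right), \quad i = 1, \ldots, N$$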


Figure 2 The fundamental transformation as the binary 
transformation. 


Notice that, by the coplanarity of any two neighboring
lines of the congruence, the quadrilaterals
$\{x, T_i x, \mathcal{F}(x), \mathcal{F}(T_i x)\}$ are also planar (see Figure 2). Then
the construction of the transformed lattice mimics
the geometric integrability scheme. In consequence,
any quadrilateral


$$\{x, \mathcal{F}_1(x), \mathcal{F}_2(x), \mathcal{F}_1(\mathcal{F}_2(x)) = \mathcal{F}_2(\mathcal{F}_1(x))\}$$


is planar as well. Therefore, on the discrete level, 
there is no difference between the lattice coordinate 
directions and the fundamental transformation direc- 
tions. The distinction becomes visible in the limit 
from the QL to the conjugate net. Therefore, the 
vectorial description of the superposition of the 
fundamental transformations not only implies their 
permutability but also provides the explanation of the 
validity of the practical rule of “integrable discretiza- 
tion by Darboux transformations." 


The Lévy and Combescure transformations
It is easy to see that the family $t_i$ of lines passing through
the points $x$ and $T_i x$ of a QL forms a congruence,
called the $i$th tangent congruence of the lattice.
When the congruence of the transformation is the
$i$th tangent congruence of the lattice $x$, then the
corresponding reduction of the fundamental transformation
is called the "Lévy transformation" $\mathcal{L}_i$.

It turns out that, for a generic congruence $l$, the lattice
made from the intersection points of the lines $l$ and $T_i l$ is a
QL, called the $i$th focal lattice of the congruence. When
the fundamental transform of the lattice $x$ is the $i$th focal
lattice of the transformation congruence, then the
corresponding reduction of the fundamental transformation
is called the "adjoint Lévy transformation" $\mathcal{L}_i^*$.

Both Lévy transformations use only a half of the 
fundamental transformation data, and the corre- 
sponding reduction formulas (in the scalar case) for 
the lattice points read as follows: 


$$\mathcal{L}_i(x) = x - X_i\, (Y_i)^{-1}\, \Omega(Y, H)$$
$$\mathcal{L}_i^*(x) = x - \Omega(X, Y^*)\, (Y_i^*)^{-1}\, H_i$$


Notice that the composition of the Lévy and the 
adjoint Lévy transformations gives (see Figure 2) the 
fundamental transformation, also called, for this 
reason, the binary transformation. 

Another reduction of the fundamental transforma- 
tion, important from a technical point of view, is the 
“Combescure transformation,” in which the tangent 
lines of the transformed lattice C(x) are parallel to those 
of the lattice x. The transformation formula reads 


$$\mathcal{C}(x) = x - \Omega(X, Y^*)$$


where only the solution Y* of the adjoint linear 
system [8], necessary to build the transformation 
congruence, is needed. 


The Laplace transformations and the geometric
meaning of the Hirota equation
The Laplace transform $\mathcal{L}_{ij}(x)$, $i \neq j$, of the QL $x$ is the $j$th focal
lattice of its $i$th tangent congruence (see Figure 3). It
is uniquely determined once the lattice $x$ is given.
The transformation formulas of the lattice points
and of the $\tau$-function read as follows:


$$\mathcal{L}_{ij}(x) = x - \frac{1}{A_{ji}} \Delta_i x$$ [20]
$$\mathcal{L}_{ij}(\tau) = \tau_{ij} = \tau Q_{ij}$$ [21]


The superpositions of Laplace transformations 


satisfy the following identities 


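$$\mathcal{L}_{ij} \circ \mathcal{L}_{ji} = \mathrm{id}$$
$$\mathcal{L}_{jk} \circ \mathcal{L}_{ij} = \mathcal{L}_{ik}$$
$$\mathcal{L}_{ki} \circ \mathcal{L}_{ij} = \mathcal{L}_{kj}$$


which allow one to identify them with the Schlesinger
transformations of the monodromy theory.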

In the simplest case N=2 one obtains the 
so-called Laplace sequence of two-dimensional QLs 


$$x_\ell = \mathcal{L}_{12}^{\ell}(x), \quad \mathcal{L}_{12}^{-1} = \mathcal{L}_{21}, \quad \ell \in \mathbb{Z}$$


Equations [14] and [21] imply that the $\tau$-functions
of the Laplace sequence satisfy the celebrated Hirota
equation (the fully discrete Toda system)


$$\tau_\ell\, T_1 T_2 \tau_\ell = (T_1 \tau_\ell)(T_2 \tau_\ell) - (T_1 \tau_{\ell-1})(T_2 \tau_{\ell+1})$$


Figure 3 The Laplace transformation $\mathcal{L}_{ij}$.
Distinguished Integrable Reductions


We will present here basic reductions of the multi- 
dimensional QL. The geometric criterion for their 
integrability is the compatibility with the geometric 
integrability scheme. 


The circular lattices and the Ribaucour congruences
QLs $x : \mathbb{Z}^N \to \mathbb{E}^M$ for which each quadrilateral is
inscribed in a circle are called "circular" lattices.
They are the integrable discrete analogs of submanifolds
parametrized by curvature coordinates (e.g.,
the orthogonal coordinate systems described by the
Lamé equations [1]-[2]).

The integrability of circular lattices is a consequence
of the fact that if the three "initial" quadrilaterals
$\{x, T_i x, T_j x, T_i T_j x\}$, $\{x, T_i x, T_k x, T_i T_k x\}$,
$\{x, T_j x, T_k x, T_j T_k x\}$ are circular, then the three new quadrilaterals
constructed by adding the vertex $T_i T_j T_k x$
are circular as well (see Figure 4). In fact, all the
eight vertices belong to a sphere, and, in consequence,
all the vertices of any $K$-dimensional, $K = 2, \ldots, N$,
elementary cell belong to a $(K - 1)$-dimensional sphere.

There are various equivalent algebraic descriptions
of the circular lattices:


1. the normalized tangent vectors $X_i$ satisfy the
constraint

$$X_i \cdot T_i X_j + X_j \cdot T_j X_i = 0, \quad i \neq j;$$

2. the scalar function $x \cdot x : \mathbb{Z}^N \to \mathbb{R}$ satisfies the
Laplace equations [3] of the lattice $x$;

3. the functions $X_i^{\circ} = (x + T_i x) \cdot X_i : \mathbb{Z}^N \to \mathbb{R}$ satisfy
the same linear system [9] as the normalized
tangent vectors $X_i$; and

4. the functions $X_i \cdot X_i : \mathbb{Z}^N \to \mathbb{R}$ satisfy eqns [11]
and thus can serve as the potentials $\rho_i$.


The Ribaucour transformation is the restriction
of the fundamental transformation to the class of
circular lattices such that the "side" quadrilaterals
$\{x, T_i x, \mathcal{R}(x), \mathcal{R}(T_i x)\}$ are also circular. Again there
is no geometric difference between the lattice
directions and the Ribaucour transformation direction.
Moreover, the quadrilaterals
$\{x, \mathcal{R}_1(x), \mathcal{R}_2(x), \mathcal{R}_1(\mathcal{R}_2(x)) = \mathcal{R}_2(\mathcal{R}_1(x))\}$ are circular as well.
In consequence, the vertices of the elementary $K$-cells,
$K = 2, \ldots, N$, of the circular lattice and the corresponding
vertices of its Ribaucour transform are contained in
a $K$-dimensional sphere. Finally, for $K = N$, one obtains
a special $\mathbb{Z}^N$-family of $N$-dimensional spheres, called
the Ribaucour congruence of spheres.

Figure 4 The geometric integrability of circular lattices.

Algebraically, the Ribaucour transformation
needs only a half of the data (necessary to build
the congruence) of the fundamental transformation.
The data of the vectorial Ribaucour transformation
consist of the solution $Y_i^* : \mathbb{Z}^N \to V^*$ of the linear
system [8]. Then, because of the circularity constraint,
$Y_i : \mathbb{Z}^N \to V$ given by


$$Y_i = \left(\Omega(X, Y^*) + T_i \Omega(X, Y^*)\right)^{T} X_i$$

is a solution of the linear system [9], and the constraints


$$\Omega(Y, H) + \Omega(X^{\circ}, Y^*)^{T} = 2\, \Omega(X, Y^*)^{T} x$$
$$\Omega(Y, Y^*) + \Omega(Y, Y^*)^{T} = 2\, \Omega(X, Y^*)^{T}\, \Omega(X, Y^*)$$


are admissible.

We remark that the above constraints have a simple
geometric meaning when one considers the circular
lattices in $\mathbb{E}^M$ as the stereographic projections of QLs
in the Möbius sphere $S^M$; that is, as a special case of
QLs subjected to quadratic constraints.


The symmetric lattice
Given a QL $x$ with rotation
coefficients $Q_{ij}$ and potentials $\rho_i$ given by [11], the
functions $\tilde{Q}_{ij}$, defined by the equation


$$\rho_j T_j \tilde{Q}_{ij} = \rho_i T_i Q_{ji}, \quad i \neq j$$

and called, because of their geometric interpretation,
the backward rotation coefficients, satisfy the
Darboux system [10] as well. A QL is called
symmetric if its forward rotation coefficients $Q_{ij}$
are also its backward rotation coefficients. Again the
constraint is compatible with the geometric integrability
scheme, that is, it propagates in the construction
of the lattice. One can show that a QL is
symmetric if and only if its rotation coefficients
satisfy the following trilinear constraint:


$$(T_i Q_{ji})(T_j Q_{kj})(T_k Q_{ik}) = (T_j Q_{ij})(T_i Q_{ki})(T_k Q_{jk}), \quad i, j, k \text{ distinct}$$


To obtain the corresponding reduction of the
fundamental transformation we again need only half
of the data. Given a solution $Y_i^* : \mathbb{Z}^N \to V^*$ of the
linear system [8], then, because of the symmetric
constraint, $Y_i : \mathbb{Z}^N \to V$, defined by


$$Y_i = \rho_i (T_i Y_i^*)^{T}$$


is a solution of the linear system [9]; notice that,
equivalently, we could start from $Y_i$. The constraint


$$\Omega(Y, Y^*) = \Omega(Y, Y^*)^{T}$$


is then admissible and gives a new symmetric lattice.

There are other multidimensional reductions of 
the QL like, for example, the D-invariant and 
Egorov lattices or discrete versions of immersions 
of spaces of constant negative curvature. We remark 
that the transformations and reductions discussed 
above have also a clear interpretation on the level of 
the analytic methods. 


Integrable Discrete Surfaces 


In this section we present some distinguished examples
of discrete integrable surfaces. Notice that, although
the geometric integrability scheme is meaningless for
$N = 2$, it can be applied indirectly, by considering the
discrete surfaces, together with their transformations,
as sublattices of multidimensional lattices.

We remark also that one can consider integrable 
evolutions of discrete curves, which give equations 
of the Ablowitz-Ladik hierarchy, and the corre- 
sponding integrable spin chains. 


Discrete Isothermic Nets 


An isothermic lattice is a two-dimensional circular
lattice $x : \mathbb{Z}^2 \to \mathbb{E}^M$ with harmonic quadrilaterals;
that is, given $x$, $T_1 x$ and $T_2 x$, the point $T_1 T_2 x$
is the intersection of the circle (passing through
$x$, $T_1 x$ and $T_2 x$) and the line passing through $x$ and
the meeting point of the tangents to the circle at $T_1 x$
and $T_2 x$ (see Figure 5). Therefore, given two discrete
curves intersecting in the common vertex $x_0$, the
unique isothermic lattice can be found using the
above "ruler and compass" construction.
Algebraically the reduction looks as follows. Any
oriented plane in $\mathbb{E}^M$ can be identified with the
complex plane $\mathbb{C}$. Given any four complex points
$z_1$, $z_2$, $z_3$, and $z_4$, their complex cross-ratio is defined by


$$q(z_1, z_2, z_3, z_4) = \frac{(z_1 - z_2)(z_3 - z_4)}{(z_2 - z_3)(z_4 - z_1)}$$


Figure 5 Elementary quadrilaterals of the isothermic lattice. 


One can show that the cross-ratio is real if and only
if the four points are cocircular or collinear. In
particular, a harmonic quadrilateral with vertices
numbered anticlockwise has cross-ratio equal to $-1$.
Therefore, abusing the notation (it can be formalized
using Clifford algebras), the isothermic lattice
is defined by the condition


$$q(x, T_1 x, T_1 T_2 x, T_2 x) = -1$$
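
The cross-ratio condition and the "ruler and compass" construction can be checked numerically in the complex plane. The following is a minimal sketch, not part of the original article: it places $x$, $T_1 x$, $T_2 x$ on the unit circle (an arbitrary choice), solves the condition $q = -1$ for the fourth vertex, and compares the result with the geometric construction. The helper name `harmonic_fourth_vertex` is hypothetical, and the formula $2 x_1 x_2 / (x_1 + x_2)$ for the meeting point of the tangents is specific to the unit circle.

```python
import cmath


def cross_ratio(z1, z2, z3, z4):
    """Complex cross-ratio q(z1, z2, z3, z4) as defined in the text."""
    return ((z1 - z2) * (z3 - z4)) / ((z2 - z3) * (z4 - z1))


def harmonic_fourth_vertex(x, x1, x2):
    """Solve q(x, x1, x12, x2) = -1 for x12 (hypothetical helper name)."""
    return (x2 * (x - x1) - x1 * (x2 - x)) / ((x - x1) - (x2 - x))


# Three vertices x, T1 x, T2 x placed on the unit circle at arbitrary angles.
x, x1, x2 = (cmath.exp(1j * angle) for angle in (0.3, 1.4, -1.1))
x12 = harmonic_fourth_vertex(x, x1, x2)

# Meeting point of the tangents to the unit circle at x1 and x2.
pole = 2 * x1 * x2 / (x1 + x2)

# Vanishes when x, x12 and the meeting point of the tangents are collinear.
collinearity = ((x12 - x) * (pole - x).conjugate()).imag

print(abs(x12))                     # distance of the new vertex from the centre
print(cross_ratio(x, x1, x12, x2))  # cross-ratio of the completed quadrilateral
print(collinearity)
```

Under these assumptions the new vertex lands on the same circle and on the line through $x$ and the meeting point of the tangents, in agreement with the geometric description of harmonic quadrilaterals.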


We remark that the definition of isothermic lattices 
can be slightly generalized allowing for the above 
cross-ratio to be a ratio of two real functions of 
single discrete variables. 

The restriction of the Ribaucour transformation
to the class of isothermic lattices, named after
Darboux who constructed it for isothermic surfaces,
has as its data a real parameter $\lambda$ and the starting
point $\mathcal{D}(x_0)$, and can be described as follows. Given
the elementary quadrilateral $\{x, T_1 x, T_2 x, T_1 T_2 x\}$
of the isothermic lattice, and given the point $\mathcal{D}(x)$,
the points $\mathcal{D}(T_1 x)$ and $\mathcal{D}(T_2 x)$ belong to the
corresponding planes and are constructed from the
equations


$$q(x, \mathcal{D}(x), \mathcal{D}(T_1 x), T_1 x) = \lambda$$
$$q(x, \mathcal{D}(x), \mathcal{D}(T_2 x), T_2 x) = -\lambda$$


It turns out that the point $\mathcal{D}(T_1 T_2 x)$, constructed by
the application of the geometric integrability
scheme, is such that the quadrilateral
$\{\mathcal{D}(x), \mathcal{D}(T_1 x), \mathcal{D}(T_2 x), \mathcal{D}(T_1 T_2 x)\}$ is harmonic. Moreover,
the construction of the Darboux transformation is
compatible; that is, the new side quadrilaterals have
the correct cross-ratios $\lambda$ and $-\lambda$.

There are various integrable reductions of the 
isothermic lattice, for example, the constant mean 
curvature lattice and the minimal lattice. 


Asymptotic Lattices and Their Reductions 


An asymptotic lattice is a mapping $x : \mathbb{Z}^2 \to \mathbb{R}^3$ such
that any point $x$ of the lattice is coplanar with its
four nearest neighbors $T_1 x$, $T_2 x$, $T_1^{-1} x$, $T_2^{-1} x$ (see
Figure 6). Such a plane is called the tangent plane
of the asymptotic lattice at the point $x$.

It can be shown that any asymptotic lattice $x$ can
be recovered from its suitably rescaled normal (to
the tangent plane) field $N : \mathbb{Z}^2 \to \mathbb{R}^3$ via the discrete
analog of the Lelieuvre formulas


$$\Delta_1 x = (T_1 N) \times N, \qquad \Delta_2 x = N \times (T_2 N)$$ [22]


Figure 6 Asymptotic lattices.


By the compatibility of the Lelieuvre formulas, the
normal field $N$ satisfies the discrete Moutard
equation


$$T_1 T_2 N + N = F (T_1 N + T_2 N)$$ [23]


for some potential $F : \mathbb{Z}^2 \to \mathbb{R}$.
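Given a scalar solution $\theta$ of the Moutard equation
[23], a new solution $\mathcal{M}(N)$ of the Moutard
equation, with the new potential

$$\mathcal{M}(F) = F\, \frac{(T_1 \theta)(T_2 \theta)}{(T_1 T_2 \theta)\, \theta}$$

can be found via the Moutard transformation
equations

$$\mathcal{M}(T_1 N) \mp N = \frac{\theta}{T_1 \theta} \left(\mathcal{M}(N) \mp T_1 N\right)$$ [24]
$$\mathcal{M}(T_2 N) \pm N = \frac{\theta}{T_2 \theta} \left(\mathcal{M}(N) \pm T_2 N\right)$$ [25]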


Now, via the Lelieuvre formulas [22], one can
construct a new asymptotic lattice
$\mathcal{M}(x) = x \pm \mathcal{M}(N) \times N$. The lines connecting corresponding points
of the asymptotic lattices $x$ and $\mathcal{M}(x)$ are tangent to
both lattices. Such a $\mathbb{Z}^2$-family of lines in $\mathbb{R}^3$ is called a
Weingarten (or W for short) congruence. Notice that
this is not a congruence as considered earlier.

Various integrable reductions of asymptotic lat- 
tices are known in the literature: pseudospherical 
lattices, asymptotic Bianchi lattices and isothermally 
asymptotic (or Fubini-Ragazzi) lattices, and discrete 
(proper and improper) affine spheres. 

Formally, the Moutard transformation is a reduc- 
tion of the (projective version of the) fundamental 
transformation for the Moutard reduction of the 
Laplace equation. However, the geometric relation 
between asymptotic lattices and QLs is more subtle 
and the geometric scenery of this connection is the line 
geometry of Plücker. Straight lines in $\mathbb{R}^3 \subset \mathbb{P}^3$ are
considered there as points of the so-called Plücker
quadric $Q_P \subset \mathbb{P}^5$. A discrete asymptotic net in $\mathbb{P}^3$,
viewed as the envelope of its tangent planes, corresponds
to a congruence of isotropic lines in $Q_P$, whose
focal lattices represent the asymptotic directions. The
discrete W-congruences are represented by two-dimensional
QLs in the Plücker quadric.


The Koenigs Lattice 


A two-dimensional QL $x : \mathbb{Z}^2 \to \mathbb{P}^M$ is called a
Koenigs lattice if, for every point $x$ of the lattice,
Figure 7 The Koenigs lattice.


the six points xa,T;xa4,T?x4,i—1,2, of its 
Laplace transforms belong to a conic (see Figure 7). 
The nonlinear constraint in the definition of the Koenigs
lattice can be linearized, with the help of the Pascal
"mystic hexagon" theorem, to the form that the line
passing through $x$ and $T_1 T_2 x$, the line passing
through x; and T?x_, and the line passing through 
xı and T2x, intersect in a point. 

Algebraically, the geometric Koenigs lattice condition
means that the Laplace equation of the lattice
in homogeneous coordinates $x : \mathbb{Z}^2 \to \mathbb{R}^{M+1}$ can be
gauged into the form


$$T_1 T_2 x + x = T_1 (F x) + T_2 (F x)$$ [26]


It turns out that, if $N$ is a solution of the Moutard
equation [23], then $x = T_1 N + T_2 N$ satisfies the
Koenigs lattice equation. Therefore, the algebraic 
theory of the discrete Koenigs lattice equation [26], 
its (Koenigs) transformation, and the permutability 
of the superpositions of such transformations is 
based on the corresponding theory for the Moutard 
equation [23]. 
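
The passage from the Moutard equation [23] to the Koenigs lattice equation [26] can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the article's construction: it takes a scalar field $N$ (the equations are linear, so each component of a vector-valued $N$ behaves the same way), a random potential $F$ and random initial data on the two coordinate axes, propagates $N$ with [23], and then evaluates the residual of [26] for $x = T_1 N + T_2 N$. The grid size and the random data are arbitrary choices.

```python
import random

random.seed(0)
SIZE = 6  # grid points per direction (illustrative choice)

# Arbitrary potential F and arbitrary initial data for N on the two axes.
F = [[random.uniform(0.5, 1.5) for _ in range(SIZE)] for _ in range(SIZE)]
N = [[0.0] * SIZE for _ in range(SIZE)]
for k in range(SIZE):
    N[k][0] = random.uniform(-1.0, 1.0)
    N[0][k] = random.uniform(-1.0, 1.0)

# Propagate with the Moutard equation [23]: T1T2 N = F (T1 N + T2 N) - N.
for i in range(SIZE - 1):
    for j in range(SIZE - 1):
        N[i + 1][j + 1] = F[i][j] * (N[i + 1][j] + N[i][j + 1]) - N[i][j]


def x(i, j):
    """Candidate Koenigs lattice point x = T1 N + T2 N at the site (i, j)."""
    return N[i + 1][j] + N[i][j + 1]


def koenigs_residual(i, j):
    """Left side minus right side of eqn [26] at the site (i, j)."""
    lhs = x(i + 1, j + 1) + x(i, j)
    rhs = F[i + 1][j] * x(i + 1, j) + F[i][j + 1] * x(i, j + 1)
    return lhs - rhs


max_residual = max(abs(koenigs_residual(i, j))
                   for i in range(SIZE - 2) for j in range(SIZE - 2))
print(max_residual)
```

The residual vanishes up to rounding because applying $T_1$ and $T_2$ to [23] and adding the results reproduces [26] with the same potential $F$.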

Geometrically, the Koenigs lattices are selected
from the QLs as follows. Consider a two-dimensional
QL $x : \mathbb{Z}^2 \to \mathbb{P}^M$ and a congruence $l$ with lines
passing through the corresponding points of the
lattice. Denote by $y_i = l \cap T_i^{-1} l$, $i = 1, 2$, the points of the
focal lattices of the congruence. For every line $l$,
denote by $\mathfrak{i}$ the unique projective involution exchanging
$y_i$ with $T_i y_i$. If, for every congruence $l$, the
lattice $\mathcal{K}(x) : \mathbb{Z}^2 \to \mathbb{P}^M$, with points $\mathcal{K}(x) = \mathfrak{i}(x)$, is a
QL, then the lattice $x$ is a Koenigs lattice. The above
construction also gives the corresponding reduction
of the fundamental transformation.

A distinguished reduction of the Koenigs lattice is 
the quadrilateral Bianchi lattice. The natural con- 
tinuous limit of the corresponding equation is 
equivalent to the Bianchi (or hyperbolic Ernst) 
system describing the interaction of planar gravita- 
tional waves. 


Discrete Two-Dimensional Schrödinger Equation


In the previous sections we have discussed examples 
of integrable discrete geometries described by 
equations of hyperbolic type. Below we present 
some results associated with the elliptic case; it is 
remarkable that the QL provides a way to connect 
these two subjects. 

Consider a solution $N : \mathbb{Z}^2 \to \mathbb{R}^3$ of the general
self-adjoint five-point scheme on the star of the $\mathbb{Z}^2$ lattice


$$a\, T_1 N + T_1^{-1}(a N) + b\, T_2 N + T_2^{-1}(b N) - c N = 0$$ [27]


Then the lattice $x : \mathbb{Z}^2 \to \mathbb{R}^3$ obtained by the
Lelieuvre-type formulas


$$\Delta_1 x = -(T_2^{-1} b)\, N \times T_2^{-1} N$$
$$\Delta_2 x = (T_1^{-1} a)\, N \times T_1^{-1} N$$ [28]


is a QL having $N$ as normal (to the planes of
elementary quadrilaterals) vector field.

The following gauge-equivalent form of eqn [27],
namely 


D 
tT! mr") — qv —0 [29] 


an integrable discretization of the Schrödinger
equation

$$\frac{\partial^2 \psi}{\partial x_1^2} + \frac{\partial^2 \psi}{\partial x_2^2} - Q \psi = 0$$

is also the Lax operator associated with an integrable
generalization of the Toda law to the square lattice.

The five-point scheme [27] is also a distinguished
illustrative example of the sublattice theory. Indeed,
it can be obtained by restricting to the even sublattice
$\mathbb{Z}^2_e$ the discrete Cauchy-Riemann equations


$$T_1 T_2 \phi - \phi = i G (T_1 \phi - T_2 \phi)$$ [30]


Because of the equivalence (on the discrete level!) 
between eqn [30] and the discrete Moutard equation 
[23], the five-point scheme [27] inherits integrability 
properties (Darboux-type transformations, superpo- 
sition formulas, analytic methods of solution) from 
the corresponding (and simpler) integrability proper- 
ties of the discrete Moutard equation. 


See also: Bäcklund Transformations; $\bar{\partial}$-Approach to
Integrable Systems; Integrable Discrete Systems;
Integrable Systems and Algebraic Geometry; Integrable
Systems and the Inverse Scattering Method; Integrable
Systems: Overview; Nonlinear Schrödinger Equations;
Sine-Gordon Equation; Stability Theory and KAM; Toda
Lattices.
Introduction


Let $(M, \omega)$ be a symplectic manifold of dimension
$2n$. We denote by $\#$ the natural isomorphism
between $T^*M$ and $TM$, defined by the equation


$$i_{\#\alpha}\, \omega = -\alpha, \quad \alpha \in T^*M$$ [1]


We say that $\#df$ is the Hamiltonian vector field
defined by the Hamiltonian $f : M \to \mathbb{R}$.

Associated with the nondegenerate closed 2-form $\omega$
there is also a Poisson bracket on $C^\infty(M)$, the space of
real differentiable functions on $M$, defined by


$$\{\cdot, \cdot\}_\omega : C^\infty(M) \times C^\infty(M) \to C^\infty(M), \qquad (f, g) \mapsto \{f, g\}_\omega = \omega(\#df, \#dg)$$


We say that two smooth functions $F, G : M \to \mathbb{R}$
are in involution if


$$\{F, G\}_\omega = 0$$ [2]


Suppose we have $n$ independent smooth functions
in involution $H_1, \ldots, H_n$, such that the associated
Hamiltonian vector fields $X_1, \ldots, X_n$ are complete
on the level manifold


$$M_a = \{x \in M : H_i(x) = a_i, \; i = 1, \ldots, n\}$$ [3]

The classical theorem of Arnol'd-Liouville states that


1. the submanifold $M_a$ is invariant with respect to
each one of the Hamiltonian commuting flows
generated by $H_1, \ldots, H_n$;

2. every connected component of $M_a$ is diffeomorphic
to a product of a Euclidean space by a
torus, $\mathbb{R}^{n-k} \times \mathbb{T}^k$;

3. there exist coordinates $\varphi_1, \ldots, \varphi_k, y_1, \ldots, y_{n-k}$ in
$M_a$ such that the Hamiltonian systems in $M_a$,
associated with the Hamiltonians $H_i$, have the form


$$\dot{\varphi}_m = \omega_{mi}, \quad \dot{y}_s = c_{si}, \qquad (\omega = \omega(a), \; c = \text{const.})$$ [4]

4. if $M_a$ is compact then it is diffeomorphic to $\mathbb{T}^n$
and there exists a neighborhood of $M_a$ in $M$,
symplectically diffeomorphic to $B^n \times \mathbb{T}^n$.


A completely integrable Hamiltonian system is a
Hamiltonian vector field $X_H$ that admits $n$ integrals
$H_1, \ldots, H_n$ satisfying the hypotheses of the Arnol'd-Liouville
theorem.

It may happen that a system has more than $n$
independent integrals of motion. In this case it is
called superintegrable and not all the integrals are in
involution. Supposing that


$$M_a = \{x \in M : H_i(x) = a_i, \; i = 1, \ldots, n + k\}$$


is compact and connected and that $H_1, \ldots, H_{n-k}$
commute with all the $n + k$ integrals, then $M_a$ is
diffeomorphic to the torus $\mathbb{T}^{n-k}$. In particular, if the
system is maximally superintegrable, that is,
$k = n - 1$, $M_a$ is diffeomorphic to $\mathbb{T}^1 = S^1$ and all
the trajectories are closed.

To prove that a system is completely integrable, we 
have to find a sufficient number of integrals of the 
system in involution. The Lax pair is an extremely 
powerful tool in this task, although it does not 
guarantee the involution of the integrals found. 

A Lax pair of a vector field $X$ on a smooth
manifold $M$ is a pair of operators $(L, M)$ such that


$$\dot{L} = [M, L] = ML - LM$$ [5]

This equation is equivalent to

$$U^{-1} L U = L_0$$ [6]


where $U$ is the solution operator of the Cauchy
problem


$$\dot{U} = MU, \quad U(0) = I$$ [7]


So, the eigenvalues of $L$ are integrals of $X$. Notice
that all the pairs $(L^k, M)$, $k \in \mathbb{N}$, are Lax pairs of the
system and we may conclude that the functions
$\operatorname{tr} L^k$, $k \in \mathbb{N}$, are integrals of $X$.
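
The equivalence between the Lax equation [5] and the conjugation formula [6] can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the operator $M$ is taken to be the constant $2 \times 2$ matrix generating rotations, so that the solution of [7] is the rotation matrix, and $L_0$ is an arbitrary matrix chosen for the test. None of these choices comes from the article.

```python
import math


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_power(L, k):
    """tr L^k for an integer k >= 1."""
    P = L
    for _ in range(k - 1):
        P = matmul(P, L)
    return sum(P[i][i] for i in range(len(L)))


M = [[0.0, -1.0], [1.0, 0.0]]   # assumed constant generator
L0 = [[1.0, 2.0], [0.5, -3.0]]  # arbitrary initial value of L


def U(t):
    """Solution of dU/dt = M U, U(0) = I, for the generator M above."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]


def L(t):
    """L(t) = U L0 U^{-1}, so that U^{-1} L U = L0 as in eqn [6]."""
    return matmul(matmul(U(t), L0), U(-t))


def lax_residual(t, h=1e-5):
    """Largest entry of dL/dt - [M, L], with dL/dt from a central difference."""
    Lp, Lm, Lt = L(t + h), L(t - h), L(t)
    ML, LM = matmul(M, Lt), matmul(Lt, M)
    return max(abs((Lp[i][j] - Lm[i][j]) / (2 * h) - (ML[i][j] - LM[i][j]))
               for i in range(2) for j in range(2))


print(lax_residual(0.7))
print([trace_of_power(L(0.7), k) - trace_of_power(L0, k) for k in (1, 2, 3)])
```

Both printed quantities are zero up to discretization and rounding errors, which reflects that conjugation by $U$ leaves the spectrum of $L$, and hence every $\operatorname{tr} L^k$, unchanged.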

The first goal of this article is to relate 
integrable Hamiltonian systems and recursion 
operators, where some of the most important 
properties of the latter are exhibited. Very natu- 
rally, the Poisson-Nijenhuis manifolds appear in 
this context and the Toda lattice is the example 
chosen in order to show the whole theory working 
in practice. Also, we see how recursion operators 
can help in the construction of quadratic algebras 
of integrals of motion and, in the last section, we 
present the generalization to Jacobi manifolds of 
the Nijenhuis structures defined for Poisson 
manifolds. 


Integrable Systems on Poisson-Nijenhuis 
Manifolds 


Let $X$ be a vector field on a smooth manifold $M$.
A recursion operator of $X$ is a (1,1)-tensor $R$
invariant under $X$:


$$\mathcal{L}_X R = 0$$ [8]


The (1,1)-tensors, and in particular the recursion
operators, may be regarded as fiber endomorphisms
of $TM$. So, given a (1,1)-tensor $R$, we denote by
${}^tR : T^*M \to T^*M$ the transpose of $R : TM \to TM$,
that is,


$$\langle {}^tR(\alpha), X \rangle = \langle \alpha, R(X) \rangle, \quad \alpha \in T^*M, \; X \in TM$$ [9]


where $\langle \cdot, \cdot \rangle$ denotes the canonical pairing between
$T^*M$ and $TM$.

Recursion operators also generate symmetries. If $R$
is a recursion operator and $Y$ is a symmetry of $X$, that
is, $[X, Y] = 0$, then $RY$ is also a symmetry of $X$. So,
given a recursion operator $R$ of $X$, we may construct a
sequence of symmetries of $X$, $R^k Y$, $k \in \mathbb{N}$.

The Nijenhuis torsion of a (1,1)-tensor $R$ is the
(1,2)-tensor $T(R)$ defined by


$$T(R)(X, Y) = [RX, RY] - R\left([X, RY] + [RX, Y] - R[X, Y]\right), \quad X, Y \in \mathfrak{X}(M)$$ [10]


A Nijenhuis operator is a (1,1)-tensor $R$ with
vanishing Nijenhuis torsion, that is,


$$\mathcal{L}_{RX} R = R\, \mathcal{L}_X R$$ [11]


These operators can generate sequences of closed
1-forms. If $R$ is a Nijenhuis operator and $\alpha$ is a
closed 1-form such that $d\,{}^tR(\alpha) = 0$, then
$d\,{}^tR^k(\alpha) = 0$, $k \in \mathbb{N}$. In the particular case of $\alpha$
being exact, that is, $\alpha = df$, and the first cohomology
group being trivial, we have a sequence of
local integrals of motion $df_k = {}^tR^k(df)$.

A Nijenhuis recursion operator $R$ and a symmetry
$Y$ of a vector field $X$ lead to a sequence of
commuting symmetries $R^k Y$, $k \in \mathbb{N}$,


$$[R^i Y, R^j Y] = 0, \quad i, j \in \mathbb{N}$$ [12]


To define the integrability in terms of a (1,1)- 
tensor is of special relevance when we try to extend 
everything to the infinite-dimensional case. 

Notice that in coordinates $(q_1, \ldots, q_n)$, the condition
[8] is equivalent to


$$\dot{R} = [A, R]$$ [13]


where $A$ is the $n \times n$ matrix defined by

$$A^i_j = \frac{\partial X^i}{\partial q_j}$$

and $X^j = X(q_j) = \dot{q}_j$, $j = 1, \ldots, n$. So, the pair
$(R, A)$ is a local Lax pair of the system and the
eigenvalues of $R$ are integrals of $X$.

If a recursion operator $R$ of a vector field $X$ on a
manifold $M$ has vanishing Nijenhuis torsion and $n$
doubly degenerate eigenvalues $\lambda_j$ with nowhere-vanishing
differentials, $(d\lambda_j)_x \neq 0$, then $X$ defines a
completely integrable Hamiltonian system.

Now suppose $X$ defines a completely integrable
Hamiltonian system with Hamiltonian $H$ on a
symplectic manifold $(M, \omega)$. Let
$(I_1, \ldots, I_n, \varphi_1, \ldots, \varphi_n)$ be the action-angle variables in a neighborhood
of an invariant torus. Two cases may happen:


1. The Hamiltonian $H$ is separable in the action
variables, that is,


$$H = \sum_k H_k(I_k)$$ [14]

In this case, the (1,1)-tensor


$$R = \sum_k \lambda_k(I_k) \left( dI_k \otimes \frac{\partial}{\partial I_k} + d\varphi_k \otimes \frac{\partial}{\partial \varphi_k} \right)$$ [15]


where $\lambda_k$ are functions with nowhere-vanishing
differentials, is a recursion operator of $X$, and has
vanishing Nijenhuis torsion and doubly degenerate
eigenvalues.

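2. The Hamiltonian has nonvanishing Hessian


$$\det\left( \frac{\partial^2 H}{\partial I_i\, \partial I_k} \right) \neq 0$$ [16]


In this case we may define new coordinates

$$\nu_k = \frac{\partial H}{\partial I_k}, \quad k = 1, \ldots, n$$ [17]

and a new symplectic structure


$$\omega_1 = \sum_k d\nu_k \wedge d\varphi_k = \sum_{k,j} \frac{\partial^2 H}{\partial I_k\, \partial I_j}\, dI_j \wedge d\varphi_k$$ [18]


The vector field $X$ is Hamiltonian with respect to
$\omega_1$, with Hamiltonian


$$\frac{1}{2} \sum_k \nu_k^2$$ [19]


and the (1,1)-tensor


$$R = \sum_k \lambda_k(\nu_k) \left( d\nu_k \otimes \frac{\partial}{\partial \nu_k} + d\varphi_k \otimes \frac{\partial}{\partial \varphi_k} \right)$$ [20]


is a recursion operator of $X$.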


Nijenhuis operators also allow the construction of 
master symmetries from conformal ones. 

A conformal symmetry of a tensor field $T$ is a
vector field $Z$ such that


$$\mathcal{L}_Z T = \lambda T, \quad \text{for some constant } \lambda$$


A master symmetry of a vector field $X$ is a vector
field $Y$ such that


$$[X, [X, Y]] = 0, \quad \text{but} \quad [X, Y] \neq 0$$


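Let $R$ be a recursion operator of $X_0$ and $Z_0$ be a
conformal symmetry of $X_0$ and $R$ such that


$$\mathcal{L}_{Z_0} X_0 = \lambda X_0 \quad \text{and} \quad \mathcal{L}_{Z_0} R = \mu R$$ [21]


for some constants $\lambda$, $\mu$.

If $R$ is also a Nijenhuis operator, then defining the
sequences of commuting symmetries $X_k = R^k X_0$
and of conformal symmetries $Z_k = R^k Z_0$, $k \in \mathbb{N}$,
we have, for all $k, j \in \mathbb{N}_0$,


$$\mathcal{L}_{Z_k} R = \mu R^{k+1}$$ [22]
$$[Z_k, Z_j] = \mu (j - k) Z_{j+k}$$ [23]
$$[Z_k, X_j] = (\lambda + j\mu) X_{k+j}$$ [24]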

A bi-Hamiltonian manifold is a smooth manifold
$M$ endowed with two linearly independent Poisson
tensors $\Lambda_0$, $\Lambda_1$, compatible in the sense that their
Schouten bracket vanishes, $[\Lambda_0, \Lambda_1] = 0$.

A vector field is said to be bi-Hamiltonian if it is 
Hamiltonian with respect to both Poisson structures. 
The equation that rules the flow of this vector field 
is said to be a bi-Hamiltonian system. 

When one of the Poisson structures is obtained
from the other by means of a Nijenhuis operator, we
obtain a Poisson-Nijenhuis manifold. Hence, a
Poisson-Nijenhuis manifold is a differentiable manifold
$M$ endowed with a Poisson tensor $\Lambda$ and a
(1,1)-tensor $R$ such that

$$R \Lambda^{\#} = \Lambda^{\#}\, {}^tR, \quad [R\Lambda, \Lambda] = 0 \quad \text{and} \quad [R\Lambda, R\Lambda] = 0$$

A classical example is the one of a bi-Hamiltonian
manifold $(M, \Lambda_0, \Lambda_1)$ where $\Lambda_0$ is nondegenerate. In
this case we may define the Nijenhuis operator
$R = \Lambda_1^{\#} (\Lambda_0^{\#})^{-1}$ and the manifold $M$ is a
Poisson-Nijenhuis one.

The characteristics of the Poisson-Nijenhuis
manifold guarantee that all the bivectors $\Lambda_k = R^k \Lambda$
are compatible Poisson tensors and the manifold is
not just bi-Hamiltonian but multi-Hamiltonian.

From what we saw, a Hamiltonian system is 
completely integrable if and only if it is bi-Hamiltonian
in a neighborhood of an invariant torus with the
eigenvalues of the existing recursion operator provid- 
ing its complete integrability. These Poisson-Nijenhuis 
manifolds appear quite frequently in dynamics and 
allow us to obtain some interesting properties easily. 
We finish this section with the Toda lattice. This 
system is a good illustration of what has been said until 
now. 

Consider $\mathbb{R}^{2n-1}$ with coordinates
$(a_1, \ldots, a_{n-1}, b_1, \ldots, b_n)$ equipped with the following compatible
Poisson tensors:


i 3 ð ð 
dei A nia ea x) 2a 


ð 7 
D 3) [26] 


Not only are these two Poisson tensors degenerate,
but there is also no Nijenhuis operator that
transforms $\Lambda_0$ into $\Lambda_1$. This can be seen by considering
the 1-form $\sum_{i=1}^{n} db_i$. This 1-form belongs to the
kernel of $\Lambda_0$ but not to the kernel of $\Lambda_1$. So, the
bi-Hamiltonian manifold $(\mathbb{R}^{2n-1}, \Lambda_0, \Lambda_1)$ is not a
Poisson-Nijenhuis one.

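The Toda lattice is the bi-Hamiltonian system in
$\mathbb{R}^{2n-1}$


$$X_1 = \Lambda_0^{\#}(dH_1) = \Lambda_1^{\#}(dH_0)$$ [27]

defined by the Hamiltonians

$$H_0 = -2 \sum_{i=1}^{n} b_i, \qquad H_1 = 4 \sum_{i=1}^{n-1} a_i^2 + 2 \sum_{i=1}^{n} b_i^2$$ [28]


that is,

$$\dot{a}_i = a_i (b_{i+1} - b_i), \quad \text{if } 1 \leq i \leq n - 1$$
$$\dot{b}_1 = 2 a_1^2$$
$$\dot{b}_i = 2 (a_i^2 - a_{i-1}^2), \quad \text{if } 2 \leq i \leq n - 1$$
$$\dot{b}_n = -2 a_{n-1}^2$$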
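
The conservation laws of these equations can be observed numerically. The following sketch integrates the system for $n = 3$ with a classical Runge-Kutta scheme and monitors $H_0$, $H_1$ and $\operatorname{tr} L^3$. It rests on an assumption that is not stated in the text above: $L$ is taken to be the usual symmetric tridiagonal Lax matrix with the $b_i$ on the diagonal and the $a_i$ next to it. The initial data, step size and integration time are arbitrary.

```python
def toda_rhs(a, b):
    """Right-hand side of the Toda equations for a_1..a_{n-1}, b_1..b_n."""
    n = len(b)
    da = [a[i] * (b[i + 1] - b[i]) for i in range(n - 1)]
    db = []
    for i in range(n):
        right = a[i] ** 2 if i < n - 1 else 0.0
        left = a[i - 1] ** 2 if i > 0 else 0.0
        db.append(2.0 * (right - left))
    return da, db


def rk4_step(a, b, h):
    """One classical Runge-Kutta step of size h."""
    def shift(u, du, s):
        return [ui + s * dui for ui, dui in zip(u, du)]

    k1 = toda_rhs(a, b)
    k2 = toda_rhs(shift(a, k1[0], h / 2), shift(b, k1[1], h / 2))
    k3 = toda_rhs(shift(a, k2[0], h / 2), shift(b, k2[1], h / 2))
    k4 = toda_rhs(shift(a, k3[0], h), shift(b, k3[1], h))
    new = []
    for idx, u in enumerate((a, b)):
        new.append([ui + h / 6 * (p + 2 * q + 2 * r + s)
                    for ui, p, q, r, s
                    in zip(u, k1[idx], k2[idx], k3[idx], k4[idx])])
    return new[0], new[1]


def invariants(a, b):
    """H0, H1 of eqn [28] and tr L^3 for the assumed tridiagonal Lax matrix."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = b[i]
    for i in range(n - 1):
        L[i][i + 1] = L[i + 1][i] = a[i]
    L2 = [[sum(L[i][k] * L[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    trace_l3 = sum(L2[i][k] * L[k][i] for i in range(n) for k in range(n))
    h0 = -2.0 * sum(b)
    h1 = 4.0 * sum(v * v for v in a) + 2.0 * sum(v * v for v in b)
    return h0, h1, trace_l3


a, b = [0.5, 1.0], [0.3, -0.2, 0.1]
start = invariants(a, b)
for _ in range(1000):
    a, b = rk4_step(a, b, 0.001)
end = invariants(a, b)
drift = max(abs(s - e) for s, e in zip(start, end))
print(drift)
```

The drift stays at the level of the integration error, which is consistent with $H_0$, $H_1$ and the traces of powers of $L$ being integrals of the flow.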


Since we do not have a Nijenhuis operator in this
setting, we are going to consider a new system in
$\mathbb{R}^{2n}$ that reduces to the Toda lattice, derive a
hierarchy of Hamiltonians, symmetries, Poisson
tensors, conformal symmetries and the associated
relations and then transport everything to $\mathbb{R}^{2n-1}$ by
reduction.


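Consider the Flaschka transformation


$$\mathcal{T} : \mathbb{R}^{2n} \to \mathbb{R}^{2n-1}, \qquad (q_1, \ldots, q_n, p_1, \ldots, p_n) \mapsto (a_1, \ldots, a_{n-1}, b_1, \ldots, b_n)$$

where

$$a_i = \frac{1}{2} \exp\left( \frac{q_i - q_{i+1}}{2} \right), \quad b_j = -\frac{1}{2} p_j, \qquad i = 1, \ldots, n - 1, \; j = 1, \ldots, n$$ [29]


This map is a Poisson morphism between


$(\mathbb{R}^{2n}, \tilde{\Lambda}_0, \tilde{\Lambda}_1)$ and $(\mathbb{R}^{2n-1}, \Lambda_0, \Lambda_1)$, where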


n ð O 
ho = — A [30] 
5 Opi ðq; 


-1 
le Tdi x9 
i=] an ^ p; 


+S C- Op; 


2» TA [31] 


i<j 


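The Poisson tensor $\tilde{\Lambda}_0$ is nondegenerate and we
may define the Nijenhuis operator $R = \tilde{\Lambda}_1^{\#} (\tilde{\Lambda}_0^{\#})^{-1}$. So,
$(\mathbb{R}^{2n}, \tilde{\Lambda}_0, R)$ is a Poisson-Nijenhuis manifold and
the bivectors of the sequence $(\tilde{\Lambda}_k = R^k \tilde{\Lambda}_0)$, $k \in \mathbb{N}$,
are compatible Poisson tensors.

The Toda lattice is the reduced bi-Hamiltonian
system, by means of the Flaschka transformation, of
the bi-Hamiltonian system


$$\tilde{X}_1 = \tilde{\Lambda}_0^{\#}(d\tilde{H}_1) = \tilde{\Lambda}_1^{\#}(d\tilde{H}_0)$$ [32]


where

$$\tilde{H}_0 = \sum_{i=1}^{n} p_i, \qquad \tilde{H}_1 = \frac{1}{2} \sum_{i=1}^{n} p_i^2 + \sum_{i=1}^{n-1} e^{q_i - q_{i+1}}$$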
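
The reduction by the Flaschka transformation can be verified pointwise. The sketch below is an illustration under stated assumptions, not the article's computation: it takes the Hamiltonian on $\mathbb{R}^{2n}$ to be $\frac{1}{2} \sum p_i^2 + \sum e^{q_i - q_{i+1}}$ with Hamilton's equations in the usual form $\dot{q}_i = \partial H / \partial p_i$, $\dot{p}_i = -\partial H / \partial q_i$, pushes the resulting vector field forward through the map $a_i = \frac{1}{2} e^{(q_i - q_{i+1})/2}$, $b_j = -\frac{1}{2} p_j$ by the chain rule, and compares it with the Toda equations in the variables $(a, b)$. The sample point is arbitrary.

```python
import math


def flaschka(q, p):
    """Flaschka map (q, p) -> (a, b)."""
    a = [0.5 * math.exp(0.5 * (q[i] - q[i + 1])) for i in range(len(q) - 1)]
    b = [-0.5 * pj for pj in p]
    return a, b


def canonical_rhs(q, p):
    """dq/dt = dH/dp, dp/dt = -dH/dq for the assumed Hamiltonian."""
    n = len(q)
    dq = list(p)
    dp = []
    for i in range(n):
        force = 0.0
        if i > 0:
            force += math.exp(q[i - 1] - q[i])
        if i < n - 1:
            force -= math.exp(q[i] - q[i + 1])
        dp.append(force)
    return dq, dp


def reduced_rhs(a, b):
    """Toda equations in the variables (a, b)."""
    n = len(b)
    da = [a[i] * (b[i + 1] - b[i]) for i in range(n - 1)]
    db = []
    for i in range(n):
        right = a[i] ** 2 if i < n - 1 else 0.0
        left = a[i - 1] ** 2 if i > 0 else 0.0
        db.append(2.0 * (right - left))
    return da, db


def pushed_forward_rhs(q, p):
    """Time derivative of (a, b) = flaschka(q, p), by the chain rule."""
    a, _ = flaschka(q, p)
    dq, dp = canonical_rhs(q, p)
    da = [0.5 * a[i] * (dq[i] - dq[i + 1]) for i in range(len(a))]
    db = [-0.5 * v for v in dp]
    return da, db


q, p = [0.2, -0.4, 1.1, 0.3], [0.7, -1.3, 0.5, 0.1]
a, b = flaschka(q, p)
lhs = pushed_forward_rhs(q, p)
rhs = reduced_rhs(a, b)
mismatch = max(abs(u - v) for u, v in zip(lhs[0] + lhs[1], rhs[0] + rhs[1]))
print(mismatch)
```

The two vector fields agree up to rounding, so trajectories of the canonical system project onto trajectories of the Toda lattice in the $(a, b)$ variables.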


We may define the sequence of commuting
vector fields $\tilde{X}_k = R^{k-1} \tilde{X}_1$, $k \in \mathbb{N}$, and the sequence
of Hamiltonians $d\tilde{H}_k = {}^tR^k(d\tilde{H}_0)$, $k \in \mathbb{N}$, first integrals
of all the vector fields $\tilde{X}_i$ and in involution
with respect to all the Poisson structures $\tilde{\Lambda}_j$.

Moreover, considering the conformal symmetry of
$\tilde{\Lambda}_0$, $\tilde{\Lambda}_1$, and $\tilde{H}_0$ defined by


[33] 


= Q -0 &. 0 
iy =) (e+ - Big + > y [34] 


i=] 
, " 2 
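we have the following relations on $\mathbb{R}^{2n}$:


$$\mathcal{L}_{Z_k} R = R^{k+1}$$ [35]
$$[Z_m, Z_k] = (k - m) Z_{k+m}$$ [36]
$$[Z_k, \tilde{X}_{m+1}] = m \tilde{X}_{k+m+1}$$ [37]
$$\mathcal{L}_{Z_k} \tilde{\Lambda}_m = (m - k - 1) \tilde{\Lambda}_{k+m}$$ [38]
$$Z_k \cdot \tilde{H}_m = (m + k + 1) \tilde{H}_{k+m}$$ [39]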


Although we do not have a Nijenhuis operator on
$(\mathbb{R}^{2n-1}, \Lambda_0, \Lambda_1)$, the deformation relations [35]-[39],
obtained for the Poisson-Nijenhuis manifold
$(\mathbb{R}^{2n}, \tilde{\Lambda}_0, \tilde{\Lambda}_1)$, may be reduced to the bi-Hamiltonian
manifold $(\mathbb{R}^{2n-1}, \Lambda_0, \Lambda_1)$ by means of the Flaschka
transformation $\mathcal{T}$.


Recursion Operators and Algebras 
of Integrals of Motion 


A master integral of a vector field $X$ is a differentiable
function $g$ such that


$$\mathcal{L}_X \mathcal{L}_X g = 0 \quad \text{and} \quad \mathcal{L}_X g \neq 0$$ [40]


So, a master integral $g$ generates an integral of
motion $\mathcal{L}_X g$ of the system $X$. It is worth noticing that
if $f$ and $g$ are master integrals, then not only $\mathcal{L}_X f$ and
$\mathcal{L}_X g$ are integrals but also $(\mathcal{L}_X f) g - f (\mathcal{L}_X g)$ is an
integral of the system. This means that several master
integrals may lead to extra integrals of motion. This 
procedure often leads to the construction of the 
integrals which provide the superintegrability of the 
system in consideration. This is the case of, for 
instance, the generalized rational Calogero-Moser 
system or the geodesic flow on the sphere. 
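
A very simple instance of this mechanism, chosen here only for illustration and not taken from the article, is the free particle in the plane, $X = p_1\, \partial/\partial q_1 + p_2\, \partial/\partial q_2$. The coordinates $f = q_1$ and $g = q_2$ are master integrals, they generate the momenta, and the combination $(\mathcal{L}_X f) g - f (\mathcal{L}_X g) = p_1 q_2 - q_1 p_2$ is the angular momentum up to sign. The sketch below checks these statements numerically, with Lie derivatives approximated by central differences along the exact flow.

```python
def flow(state, t):
    """Exact flow of X = p1 d/dq1 + p2 d/dq2 (free particle in the plane)."""
    q1, q2, p1, p2 = state
    return (q1 + p1 * t, q2 + p2 * t, p1, p2)


def lie_derivative(func, h=1e-3):
    """Return the function L_X func, approximated by a central difference."""
    def derived(state):
        return (func(flow(state, h)) - func(flow(state, -h))) / (2 * h)
    return derived


def f(state):
    return state[0]  # f = q1


def g(state):
    return state[1]  # g = q2


Lf, Lg = lie_derivative(f), lie_derivative(g)
LLg = lie_derivative(Lg)


def combined(state):
    """The extra integral (L_X f) g - f (L_X g)."""
    return Lf(state) * g(state) - f(state) * Lg(state)


start = (1.0, -2.0, 0.5, 1.5)
later = flow(start, 2.5)
print(Lg(start), LLg(start))              # L_X g is nonzero, L_X L_X g vanishes
print(combined(start), combined(later))   # same value at both times
```

The last line prints the same number twice, which shows that the combination of the two master integrals is conserved although neither $f$ nor $g$ is.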

Recursion operators are often used to construct 
sequences of master symmetries of vector fields. The 
obvious connection between master symmetries and 
master integrals carries the recursion operators to 
this level. In many cases, the integrals of motion 
generated by the master integrals constructed on the 
basis of the existence of a recursion operator close in 
a quadratic algebra with respect to the Poisson 
structure we are considering (by quadratic algebra 
we mean that the brackets between the generators 
are polynomials of degree 2 in the generators).

Let $X$ be a vector field on a manifold $M$, $R$ a
Nijenhuis operator which is also a recursion
operator of $X$, and $P$ a (1, 1)-tensor such that


$\mathcal{L}_X P = a(R)$
and


$\mathcal{L}_{PX} R = b(R)$


where $a$ and $b$ are polynomials with constant
coefficients. The sequences $X_i = R^i X$, $Y_i = R^i(PX)$,
$i \in \mathbb{N}_0$, $X_{-1} = Y_{-1} = 0$ satisfy


$[X_i, X_j] = 0$  [41]
$[X_i, Y_j] = a(R) X_{i+j} - i\, b(R) X_{i+j-1}$  [42]
$[Y_i, Y_j] = (j - i)\, b(R) Y_{i+j-1}$  [43]


If $(M, \Lambda)$ is a nondegenerate Poisson manifold
with trivial first cohomology group, $R\Lambda$ is a bivector
and $X$ and $Y$ are Hamiltonian vector fields with
respect to $\Lambda$ and $R\Lambda$, that is, there exist functions
$H_0$, $H_1$, $G_0$, and $G_1$ satisfying


$X = \Lambda^\sharp(dH_1) = R\Lambda^\sharp(dH_0)$
$Y = \Lambda^\sharp(dG_1) = R\Lambda^\sharp(dG_0)$


then the sequences of exact differentials


${}^tR^i(dH_1) = dH_{i+1}$ and ${}^tR^i(dG_1) = dG_{i+1}$

may be constructed. In this case, the functions $G_j$ are
master integrals of all the vector fields $X_i$ and the
integrals $X_i(G_j)$ and $L^i_{jk} = X_i(G_j)G_k - G_j X_i(G_k)$,
$i, j, k \in \mathbb{N}_0$, close in a quadratic algebra with respect
to the Poisson bracket associated with $\Lambda$.

If $M$ is not a Poisson manifold but we can find a
master integral $G$ of all the vector fields $X_i$ of the
sequence, then the functions $G_j = Y_j(G)$ are also
master integrals of the same vector fields and the
functions $X_i(G_j)$ and $L^i_{jk} = X_i(G_j)G_k - G_j X_i(G_k)$
are integrals of $X_i$.

Now let us consider the completely integrable 
bi-Hamiltonian system case. In a neighborhood of 
an invariant torus, a completely integrable 
bi-Hamiltonian system may be written in the form 

A(y1,--+5¥n) 


=nt ty M 


with 


T. 0 o 
: Mw 
= ng ^ a 


the compatible Poisson tensors that provide the 
complete integrability of the bi-Hamiltonian system. 
In this case, we may define the recursion operator 


R= Ys. -Q dy; + EI 
for which $\Lambda_1 = R\Lambda_0$, and the bi-Hamiltonian vector field

X = AL (dH) EE (Sm ») 
The (1, 1)-tensor 

P = D (ogg odo n; : Z 94i) 


satisfies $\mathcal{L}_X P = \mathrm{Id}$ and $\mathcal{L}_{PX} R = 0$. So, the vector fields
O 
Y, = R'(PX) =) yf; — 


and the function $G = \sum_{j=1}^{n} y_j \varphi_j$ help to define the
functions $G_i = Y_i(G)$, $i \in \mathbb{N}_0$.
The integrals of $X_i$


$X_i(G_j)$ and $L^i_{jk} = X_i(G_j)G_k - G_j X_i(G_k)$  [45]


happen to close in a quadratic algebra with respect
to the bracket defined by $\Lambda_0$.


Recursion Operators on Jacobi 
Manifolds 


In this section we extend the notion of Poisson- 
Nijenhuis manifold to the Jacobi setting. 

Let $M$ be a smooth manifold with a bivector field
$\Lambda$ and a vector field $E$. We equip the space $C^\infty(M)$
with the bracket


$\{f, g\} = \Lambda(df, dg) + fE(g) - gE(f)$


which is bilinear and skew-symmetric, and satisfies
the Jacobi identity if and only if


$[\Lambda, \Lambda] = -2E \wedge \Lambda$ and $[E, \Lambda] = 0$  [46]


When these conditions are satisfied, $(M, \Lambda, E)$ is
called a Jacobi manifold with Jacobi bracket $\{\,,\}$.
The pair $(C^\infty(M), \{\,,\})$ is a local Lie algebra in the
sense of Kirillov. If the vector field $E$ identically
vanishes on $M$, eqns [46] reduce to $[\Lambda, \Lambda] = 0$ and
$(M, \Lambda)$ is just a Poisson manifold. But there are other
examples of Jacobi manifolds that are not Poisson,
for example, contact manifolds.

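We denote by $(\Lambda, E)^\sharp : T^*M \times \mathbb{R} \to TM \times \mathbb{R}$ the
vector bundle map associated with $(\Lambda, E)$, that is, for
all sections $\alpha$ of $T^*M$ and $f \in C^\infty(M)$,


$(\Lambda, E)^\sharp(\alpha, f) = (\Lambda^\sharp(\alpha) + fE, -i_E\alpha)$

Let $R : \mathfrak{X}(M) \times C^\infty(M) \to \mathfrak{X}(M) \times C^\infty(M)$ be a
$C^\infty(M)$-linear map defined by

$R(X, f) = (NX + fY, i_X\gamma + gf)$  [47]


where $N$ is a tensor field of type (1,1) on $M$, $Y \in \mathfrak{X}(M)$,
$\gamma \in \Omega^1(M)$ and $g \in C^\infty(M)$. Let us denote by
$T(R)$ the Nijenhuis torsion of $R$ with respect to the
Lie bracket on $\mathfrak{X}(M) \times C^\infty(M)$ given by

$[(X_1, f_1), (X_2, f_2)] = ([X_1, X_2], X_1(f_2) - X_2(f_1))$

As in the case of Poisson manifolds, if $R$ has a
vanishing Nijenhuis torsion, we call $R$ a Nijenhuis
operator.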

Suppose now that $M$ is equipped with a Jacobi
structure $(\Lambda_0, E_0)$ and a Nijenhuis operator $R$. Then,
we may define a bivector field $\Lambda_1$ and a vector field
$E_1$ on $M$, by setting


$(\Lambda_1, E_1)^\sharp = R \circ (\Lambda_0, E_0)^\sharp$


If one looks for the conditions that imply that the
pair $(\Lambda_1, E_1)$ defines a new Jacobi structure on $M$
compatible with $(\Lambda_0, E_0)$, in the sense that
$(\Lambda_0 + \Lambda_1, E_0 + E_1)$ is again a Jacobi structure, one
finds that $\Lambda_1$ is skew-symmetric if and only if
$R \circ (\Lambda_0, E_0)^\sharp = (\Lambda_0, E_0)^\sharp \circ {}^tR$. When $\Lambda_1$ is
skew-symmetric, $(\Lambda_1, E_1)$ defines a Jacobi structure on
$M$ if and only if, for all $(\alpha, f), (\beta, h) \in \Omega^1(M) \times C^\infty(M)$,


$T(R)\bigl((\Lambda_0, E_0)^\sharp(\alpha, f), (\Lambda_0, E_0)^\sharp(\beta, h)\bigr)$
$= R \circ (\Lambda_0, E_0)^\sharp\bigl(C((\Lambda_0, E_0), R)((\alpha, f), (\beta, h))\bigr)$


where $C((\Lambda_0, E_0), R)$ is the Magri concomitant of
$(\Lambda_0, E_0)$ and $R$. In the case where $(\Lambda_1, E_1)$ is a Jacobi
structure, it is compatible with $(\Lambda_0, E_0)$ if and only
if, for all $(\alpha, f), (\beta, h) \in \Omega^1(M) \times C^\infty(M)$,


$(\Lambda_0, E_0)^\sharp\bigl(C((\Lambda_0, E_0), R)((\alpha, f), (\beta, h))\bigr) = 0$


A Jacobi-Nijenhuis manifold $(M, (\Lambda_0, E_0), R)$ is a
Jacobi manifold $(M, \Lambda_0, E_0)$ with a Nijenhuis operator
$R$ such that: (1) $R \circ (\Lambda_0, E_0)^\sharp = (\Lambda_0, E_0)^\sharp \circ {}^tR$
and (2) the map $(\Lambda_0, E_0)^\sharp \circ C((\Lambda_0, E_0), R)$ identically
vanishes. $R$ is called the recursion operator of
$(M, (\Lambda_0, E_0), R)$.

A recursion operator on a Jacobi-Nijenhuis manifold
gives rise to a hierarchy of Jacobi-Nijenhuis structures
on the manifold. In fact, if $((\Lambda_0, E_0), R)$ is a
Jacobi-Nijenhuis structure on $M$, there exists a hierarchy
$((\Lambda_k, E_k), k \in \mathbb{N})$ of Jacobi structures on $M$, which are
pairwise compatible. For all $k \in \mathbb{N}$, $(\Lambda_k, E_k)$ is the
Jacobi structure associated with the vector bundle map
$(\Lambda_k, E_k)^\sharp$ given by $(\Lambda_k, E_k)^\sharp = R^k \circ (\Lambda_0, E_0)^\sharp$.
Moreover, for all $k, l \in \mathbb{N}$, the pair $((\Lambda_k, E_k), R^l)$ defines a
Jacobi-Nijenhuis structure on $M$.


See also: Bi-Hamiltonian Methods in Soliton Theory; 
Classical r-Matrices, Lie Bialgebras, and Poisson Lie 
Groups; Contact Manifolds; Integrable Systems and 
Algebraic Geometry; Integrable Systems: Overview; 
Multi-Hamiltonian Systems; Recursion Operators in
Classical Mechanics. 
Introduction


A British experimentalist, JS Russell, first observed
a soliton in 1834 while riding on horseback beside a
narrow barge channel. He challenged the theoreticians
of the day “to predict the discovery after it
happened, that is to give an a priori demonstration
a posteriori.” This work created a controversy
which, in fact, lasted almost 50 years, and which
involved such distinguished scientists as Stokes and
Airy. It was resolved by Korteweg and de Vries in
1895, who derived the KdV equation as an
approximation to water waves,


$\frac{\partial q}{\partial t} + 6q\frac{\partial q}{\partial x} + \frac{\partial^3 q}{\partial x^3} = 0$  [1]


This equation is a nonlinear partial differential
equation (PDE) of the evolution type, where $t$ and
$x$ are related to time and space respectively, and
$q(x, t)$ is related to the height of the wave above the
mean water level. Korteweg and de Vries were able
to show that equation [1] supports a particular
solution that exhibits the behavior described by
Russell. This solution, which was later called the
1-soliton solution, is given by


$q_1(x, t) = \frac{p^2/2}{\cosh^2\left(\frac{1}{2}p(x - p^2 t) + c\right)}$  [2]

where $p$, $c$ are constants. The location of this soliton
at time $t$, that is, its maximum position, is given by
$p^2 t - 2c/p$, its velocity is given by $p^2$, and its
amplitude by $p^2/2$. Thus, faster solitons are higher
and narrower. It should be noted that $q_1$ is a
traveling-wave solution, that is, $q_1$ depends only on
the variable $X = x - p^2 t$, thus in this case the PDE [1]
reduces (after integration) to the second-order
ordinary differential equation (ODE)

$-p^2 q_1(X) + 3q_1^2(X) + \frac{d^2 q_1}{dX^2}(X) = 0$


Under the assumption that $q_1$ and $dq_1/dX$ tend to
zero as $|X| \to \infty$, this ODE yields the 1-soliton
solution [2].
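
The 1-soliton formula can be checked numerically. The sketch below assumes the KdV equation in the form $q_t + 6qq_x + q_{xxx} = 0$ and the profile $q_1 = (p^2/2)/\cosh^2(\frac{1}{2}p(x - p^2t) + c)$; the function names, step sizes, and sample grid are illustrative choices, and the derivatives are estimated with central finite differences rather than computed exactly.

```python
import math


def q1(x, t, p=1.0, c=0.0):
    """1-soliton profile: (p^2/2) / cosh^2(p (x - p^2 t)/2 + c)."""
    return 0.5 * p * p / math.cosh(0.5 * p * (x - p * p * t) + c) ** 2


def kdv_residual(q, x, t, hx=1e-3, ht=1e-4):
    """Central-difference estimate of q_t + 6 q q_x + q_xxx at (x, t)."""
    q_t = (q(x, t + ht) - q(x, t - ht)) / (2 * ht)
    q_x = (q(x + hx, t) - q(x - hx, t)) / (2 * hx)
    q_xxx = (q(x + 2 * hx, t) - 2 * q(x + hx, t)
             + 2 * q(x - hx, t) - q(x - 2 * hx, t)) / (2 * hx ** 3)
    return q_t + 6 * q(x, t) * q_x + q_xxx


def max_residual(p, c, t=0.3):
    """Largest |residual| over a sample grid of x values at fixed time t."""
    def soliton(x, s):
        return q1(x, s, p, c)
    xs = [-8 + 0.25 * i for i in range(65)]
    return max(abs(kdv_residual(soliton, x, t)) for x in xs)


def peak_position(p, c, t):
    """Location of the maximum: p^2 t - 2c/p."""
    return p * p * t - 2 * c / p


if __name__ == "__main__":
    for p, c in [(1.0, 0.0), (1.5, 0.4)]:
        print(p, c, max_residual(p, c), q1(peak_position(p, c, 0.3), 0.3, p, c))
```

The residual stays at the level of the discretization error, and the value at the predicted peak position equals the amplitude $p^2/2$.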

The problem of finding a solution describing the 
interaction of two 1-soliton solutions is much more 
difficult and was not addressed by Korteweg and 
de Vries. This question was studied by M Kruskal
and N Zabusky in 1965. Studying numerically the 
interaction of two solutions of the form [2] (i.e., two 
solutions corresponding to two different $p_1$ and $p_2$),
Kruskal and Zabusky discovered the defining prop- 
erty of solitons: after interaction, these waves 
regained exactly the shapes they had before. This 
posed a new challenge to mathematicians, namely to 
explain analytically the interaction properties of 
such coherent waves. In order to resolve this 
challenge one needs to develop a larger class of 
solutions than the 1-soliton solution. We note that 
eqn [1] is nonlinear and no effective method to solve 
such nonlinear equations existed at that time. 

Gardner et al. (1967) not only derived an explicit
solution describing the interaction of an arbitrary
number of solitons, but also discovered what was to
evolve into a new method of mathematical physics.
The 2-soliton solution is given by


$q_2(x, t) = \frac{2(p_1^2 e^{\eta_1} + p_2^2 e^{\eta_2}) + 4e^{\eta_1 + \eta_2}(p_1 - p_2)^2 + 2A_{12}(p_2^2 e^{2\eta_1 + \eta_2} + p_1^2 e^{\eta_1 + 2\eta_2})}{(1 + e^{\eta_1} + e^{\eta_2} + A_{12} e^{\eta_1 + \eta_2})^2}$  [3]

where

$\eta_j = p_j x - p_j^3 t + \eta_j^0$, $j = 1, 2$, $A_{12} = \frac{(p_1 - p_2)^2}{(p_1 + p_2)^2}$


and $p_j$, $\eta_j^0$ are constants. A snapshot of this solution
with $p_1 = 1$, $p_2 = 2$ is given in Figure 1. After some
time the taller soliton will overtake the shorter one
and the only effect of the interaction will be a “phase
shift,” that is, a change in the position the two
solitons would have reached without interaction.
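
The 2-soliton formula can be cross-checked in two ways. The sketch below assumes the form $q_2 = 2\,\partial_x^2 \ln F$ with $F = 1 + e^{\eta_1} + e^{\eta_2} + A_{12}e^{\eta_1+\eta_2}$, which is the standard bilinear (Hirota) way of writing it and is not stated in the text above. It compares that form with the explicit quotient [3] and also estimates the KdV residual $q_t + 6qq_x + q_{xxx}$ by finite differences. All function names, step sizes, and the sample grid are illustrative.

```python
import math


def tau(x, t, p1, p2, eta1=0.0, eta2=0.0):
    """F = 1 + e^{eta_1} + e^{eta_2} + A12 e^{eta_1 + eta_2} (denominator base of [3])."""
    e1 = math.exp(p1 * x - p1 ** 3 * t + eta1)
    e2 = math.exp(p2 * x - p2 ** 3 * t + eta2)
    a12 = ((p1 - p2) / (p1 + p2)) ** 2
    return 1.0 + e1 + e2 + a12 * e1 * e2


def q2(x, t, p1, p2, eta1=0.0, eta2=0.0):
    """Explicit 2-soliton quotient [3]."""
    e1 = math.exp(p1 * x - p1 ** 3 * t + eta1)
    e2 = math.exp(p2 * x - p2 ** 3 * t + eta2)
    a12 = ((p1 - p2) / (p1 + p2)) ** 2
    num = (2 * (p1 ** 2 * e1 + p2 ** 2 * e2)
           + 4 * (p1 - p2) ** 2 * e1 * e2
           + 2 * a12 * (p2 ** 2 * e1 ** 2 * e2 + p1 ** 2 * e1 * e2 ** 2))
    return num / tau(x, t, p1, p2, eta1, eta2) ** 2


def hirota_form(x, t, p1, p2, h=1e-3):
    """2 * d^2/dx^2 ln F, with the second derivative taken by central differences."""
    f = lambda y: math.log(tau(y, t, p1, p2))
    return 2 * (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2


def kdv_residual(q, x, t, hx=1e-3, ht=1e-4):
    """Central-difference estimate of q_t + 6 q q_x + q_xxx at (x, t)."""
    q_t = (q(x, t + ht) - q(x, t - ht)) / (2 * ht)
    q_x = (q(x + hx, t) - q(x - hx, t)) / (2 * hx)
    q_xxx = (q(x + 2 * hx, t) - 2 * q(x + hx, t)
             + 2 * q(x - hx, t) - q(x - 2 * hx, t)) / (2 * hx ** 3)
    return q_t + 6 * q(x, t) * q_x + q_xxx


def check(p1=1.0, p2=2.0):
    """Return (max KdV residual, max mismatch with the Hirota form) on a grid."""
    sol = lambda x, t: q2(x, t, p1, p2)
    pts = [(0.5 * i, t) for i in range(-12, 13) for t in (-1.0, 0.0, 1.0)]
    res = max(abs(kdv_residual(sol, x, t)) for x, t in pts)
    gap = max(abs(sol(x, t) - hirota_form(x, t, p1, p2)) for x, t in pts)
    return res, gap


if __name__ == "__main__":
    print(check())
```

Both numbers returned by `check` are limited only by the finite-difference error, which supports reading [3] as the expanded form of $2\,\partial_x^2 \ln F$.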
Figure 1 A snapshot of the 2-soliton solution of the KdV equation.

Regarding the general method introduced in
Gardner et al. (1967), we note that if eqn [1] is
formulated on the infinite line, then the most interesting
problem is the solution of the initial-value
problem: given initial data $q(x, 0) = q_0(x)$ which
decay as $|x| \to \infty$, find $q(x, t)$. If $q_0$ is small and $qq_x$
can be neglected, then eqn [1] becomes linear and
$q(x, t)$ can be found using the Fourier transform,


$q(x, t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx + ik^3 t}\,\hat{q}_0(k)\,dk$  [4a]

where

$\hat{q}_0(k) = \int_{-\infty}^{\infty} e^{-ikx} q_0(x)\,dx$  [4b]
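
The time factor in the Fourier representation can be recovered in one line. The sketch below assumes that dropping $qq_x$ from eqn [1] leaves the linear equation $q_t + q_{xxx} = 0$; the symbol $\omega$ is introduced here only for the computation.

```latex
% Insert a single Fourier mode q = e^{ikx + i\omega t} into q_t + q_{xxx} = 0:
i\omega + (ik)^3 = 0
\quad\Longrightarrow\quad
\omega(k) = k^3 ,
\qquad
e^{ikx + ik^3 t} = e^{ik(x + k^2 t)} .
```

Each mode therefore travels with its own velocity $-k^2$, which is the dispersion responsible for the decay of the linear solution at large times.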


The remarkable discovery of Gardner et al. (1967)
is that for eqn [1] there exists a “nonlinear analog” of
the Fourier transform capable of solving the initial-
value problem even if $q_0$ is not small. Although this
nonlinear Fourier transform cannot in general be
written in closed form, $q(x, t)$ can be expressed
through the solution of a linear integral equation, or
more precisely through the solution of a linear 2 x 2
matrix Riemann-Hilbert (RH) problem (see the
section “A nonlinear Fourier transform”). This linear
integral equation is uniquely specified in terms of
$q_0(x)$. For particular initial data, $q(x, t)$ can be written
explicitly. For example, if $q_0(x) = q_1(x)$, where $q_1(x)$ is
obtained by evaluating eqn [2] at $t = 0$, then
$q(x, t) = q_1(x - p^2 t)$. Similarly, if $q_0(x) = q_2(x, 0)$,
where $q_2(x, 0)$ is obtained by evaluating eqn [3] at
$t = 0$, then $q(x, t) = q_2(x, t)$.

The most important question, both physically and
mathematically, is the description of the long-time
behavior of the solution of the initial-value problem
mentioned above. If the nonlinear term of eqn [1] can
be neglected, one finds a linear dispersive equation. In
this case different waves travel with different wave
speeds, these waves cancel each other out and the
solution decays to zero as $t \to \infty$. Indeed, using
the stationary-phase method to compute the large
$t$ behavior of the integral appearing in eqn [4a],
it can be shown that $q(x, t)$ decays like $O(1/\sqrt{t})$
as $t \to \infty$, $x/t = O(1)$. The situation with the KdV
equation is more interesting: dispersion is balanced by
nonlinearity and $q(x, t)$ has a “nontrivial” asymptotic
behavior as $t \to \infty$. Indeed, using a nonlinear analog
of the steepest descent method discovered by Deift and
Zhou (1993) to analyze the RH problem mentioned
earlier, it can be shown that $q(x, t)$ asymptotes to
$q_N(x, t)$, where $q_N(x, t)$ is the exact N-soliton solution.
This underlines the physical and mathematical
significance of solitons: they are the coherent structures
emerging from any initial data as $t \to \infty$. This
implies that if a nonlinear phenomenon is modeled
by the KdV equation on the infinite line, then one
can immediately predict the structure of the solution
as $t \to \infty$, $x/t = O(1)$: it will consist of $N$ ordered
single solitons, where the highest soliton occurs to
the right; the number $N$ and the parameters $p_j$ and $\eta_j^0$
depend on the particular initial data $q_0(x)$. It should
be noted that this result can be obtained only using
the machinery of the theory of integrability, and
so far cannot be obtained using standard PDE
techniques.

So far we have concentrated on the KdV equation. 
However, there exist numerous other equations 
which exhibit similar behavior. Such equations are 
called “integrable” and the method of solving their 
initial-value problem is called the “inverse-scattering” 
or “inverse-spectral” method. 

The following section presents a brief historical 
review of some of the important developments of 
soliton theory. Next, typical solitons, lumps, and 
dromions are given. The inverse-spectral method is 
discussed in the penultimate section. Finally, the 
extension of this method to boundary-value prob- 
lems is briefly discussed. 


Important Analytical Developments in 
Soliton Theory 


Lax (1968) introduced the so-called Lax pair
formulation of the KdV. In an example, he showed
that eqn [1] can be written as the compatibility
condition of the following pair of linear eigenvalue
equations for the eigenfunction $\psi(x, t, k)$:


$\psi_{xx} + (q + k^2)\psi = 0$  [5a]


$\psi_t + (2q - 4k^2)\psi_x - (q_x + \nu)\psi = 0$, $k \in \mathbb{C}$  [5b]


where $\nu$ is an arbitrary constant. The nonlinear
Fourier transform mentioned earlier can be obtained
by performing the spectral analysis of eqn [5a]. The
time evolution of the associated nonlinear Fourier
data, which are now called spectral data, is linear
and can be determined using eqn [5b]. Following
Lax's formulation, Zakharov and Shabat (1972)
solved the nonlinear Schrödinger (NLS) equation


$iq_t + q_{xx} - 2\lambda|q|^2 q = 0$, $\lambda = \pm 1$  [6]


which has ubiquitous physical applications including 
nonlinear optics. Soon thereafter the sine-Gordon 
equation 


$q_{xx} - q_{tt} = \sin q$  [7]
and the modified KdV equation


$q_t + 6q^2 q_x + q_{xxx} = 0$  [8]


were solved. Since then, numerous nonlinear equations 
have been solved. Thus, the mathematical technique 
introduced by Gardner et al. (1967) for the solution 
of a particular physical equation gave rise to a new 
method in mathematical physics, the so-called inverse- 
scattering (spectral) method. Among the most 


important equations solved by this method are a 
particular two-dimensional reduction of Einstein’s 
equation and the self-dual Yang—Mills equations. 
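
The claim that the KdV equation is the compatibility condition of the Lax pair can be verified by hand. The sketch below assumes the pair in the form $\psi_{xx} + (q + k^2)\psi = 0$, $\psi_t + (2q - 4k^2)\psi_x - (q_x + \nu)\psi = 0$, with $k$ and $\nu$ independent of $x$ and $t$; the abbreviations $A$ and $B$ are introduced here only for the computation.

```latex
% Write the pair as \psi_{xx} = -(q + k^2)\psi and \psi_t = A\psi_x + B\psi,
% with A = 4k^2 - 2q and B = q_x + \nu.
\begin{aligned}
(\psi_{xx})_t &= -q_t\,\psi - (q + k^2)\,(A\psi_x + B\psi), \\
(\psi_t)_{xx} &= \bigl(A_{xx} + 2B_x - (q + k^2)A\bigr)\psi_x
   + \bigl(B_{xx} - 2(q + k^2)A_x - q_x A - (q + k^2)B\bigr)\psi .
\end{aligned}
% Coefficients of \psi_x: A_{xx} + 2B_x = -2q_{xx} + 2q_{xx} = 0, so they agree identically.
% Coefficients of \psi:
-q_t = B_{xx} - 2(q + k^2)A_x - q_x A
     = q_{xxx} + 4(q + k^2)q_x - (4k^2 - 2q)q_x
     = q_{xxx} + 6qq_x .
```

The $k$-dependent terms cancel, and $\nu$ drops out entirely, leaving $q_t + 6qq_x + q_{xxx} = 0$.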

The next important development in the analysis of 
integrable equations was the study of the KdV with 
space-periodic initial data. This occurred in the 
mid-1970s in the USA and in the USSR. This method 
involves algebraic-geometric techniques; in particular 
there exists a periodic analog of the N-soliton 
solution which can be expressed in terms of a certain 
Riemann-theta function of genus N. 

In the mid-1970s, it was also realized that there 
exist integrable ODEs. For example, a stationary 
reduction of some of the equations introduced in 
connection with the space-periodic problem men- 
tioned above led to the integration of some classical 
tops. Furthermore, the similarity reduction of some
of the integrable PDEs led to the classical Painlevé
equations. For example, letting $q = t^{-1/3}u(\xi)$,
$\xi = xt^{-1/3}$ in the modified KdV equation [8], and
integrating we find

$\frac{d^2 u}{d\xi^2} - \frac{1}{3}\xi u + 2u^3 + \alpha = 0$  [9]
where $\alpha$ is a constant. This is Painlevé II, that is, the
second equation in the list of six classical ODEs
introduced by Painlevé and his school around 1900.
These equations are nonlinear analogs of the linear 
special functions such as Airy, Bessel, etc. The connec- 
tion between integrable PDEs and ODEs of the Painlevé 
type was established by Ablowitz and Segur (1977). 
Their work marked a new era in the theory of these 
equations. Indeed, soon thereafter Flaschka and Newell 
(1980) introduced an extension of the inverse-spectral 
method, the so-called isomonodromy method, capable 
of integrating these equations. The most remarkable 
achievement of this new development is the construction 
of nonlinear analogs of the classical connection formulas 
that exist for the linear special functions. These 
formulas, although rather complicated, are as explicit 
as the corresponding linear ones (Fokas et al. 2005). 
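
The similarity reduction of the modified KdV equation mentioned above can be checked term by term. The sketch below assumes the substitution $q = t^{-1/3}u(\xi)$, $\xi = xt^{-1/3}$ in $q_t + 6q^2q_x + q_{xxx} = 0$; primes denote derivatives with respect to $\xi$.

```latex
\begin{aligned}
q_t &= -\tfrac{1}{3}\,t^{-4/3}\,(u + \xi u'), &
q_x &= t^{-2/3}\,u', \\
6q^2 q_x &= 6\,t^{-4/3}\,u^2 u', &
q_{xxx} &= t^{-4/3}\,u''' ,
\end{aligned}
\qquad\Longrightarrow\qquad
u''' + 6u^2 u' - \tfrac{1}{3}(\xi u)' = 0 .
```

Every term carries the same factor $t^{-4/3}$, so the PDE collapses to an ODE in $\xi$. The left-hand side is an exact derivative, and one integration gives $u'' + 2u^3 - \frac{1}{3}\xi u = \text{const}$, which is the Painlevé II form with the integration constant expressed through $\alpha$.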

It was mentioned earlier that the inverse-spectral 
method gives rise to a matrix RH problem. An RH 
problem involves the determination of a function 
analytic in given sectors of the complex plane, from 
the knowledge of the jumps of this function across the 
boundaries of these sectors. The algebraic-geometric 
method for solving the space-periodic initial-value 
problem can be interpreted as formulating an RH 
problem which can be analyzed using functions defined 
on a Riemann surface. Also, it was noted by Fokas and 
Ablowitz (1983a) and later rigorously established by 
Fokas and Zhou (1992) that the isomonodromy 
method also gives rise to a novel RH problem. This 
implies the following interesting unification: self-similar,
decaying, and periodic initial-value problems
for integrable evolution equations in one space variable 
lead to the study of the same mathematical object, 
namely to the RH problem. 

Every integrable nonlinear evolution equation in 
one spatial dimension has several integrable versions in 
two spatial dimensions. Two such integrable physical 
generalizations of the Korteweg-deVries equation are 
the so-called Kadomtsev-Petviashvili I (KPI) and II 
(KPII) equations. In the context of water waves, they 
arise in the weakly nonlinear, weakly dispersive, weakly 
two-dimensional limit, and in the case of KPI when 
the surface tension is dominant. The NLS equation also 
has two physical integrable versions known as the 
Davey-Stewartson I (DSI), and II (DSII) equations. They 
can be derived from the classical water-wave problem in 
the shallow-water limit and govern the time evolution of 
the free surface envelope in the weakly nonlinear, 
weakly two-dimensional, nearly monochromatic limit. 
The KP and DS equations have several other physical 
applications. 

A method for solving the Cauchy problem for
decaying initial data for integrable evolution equations
in two spatial dimensions emerged in the early 1980s.
This method is sometimes referred to as the $\bar\partial$ (d-bar)
method. We recall that the inverse-spectral method
for solving nonlinear evolution equations on the line
is based on a matrix RH problem. This problem
expresses the fact that there exist solutions of the
associated x-part of the Lax pair which are sectionally
analytic. Analyticity survives in some multidimensional
problems: it was shown formally by Fokas and
Ablowitz (1983b) that KPI gives rise to a nonlocal RH
problem. However, for other multidimensional problems,
such as the KPII, the underlying eigenfunctions
are nowhere analytic and the RH problem must be
replaced by the $\bar\partial$ problem. Actually, a $\bar\partial$ problem had
already appeared in the work of Beals and Coifman
(1982) where the RH problem appearing in the analysis
of one-dimensional systems was considered as a special
case of a $\bar\partial$ problem. Soon thereafter, it was shown in
Ablowitz et al. (1983) that KPII required the essential
use of the $\bar\partial$ problem. The situation for the DS equations
is analogous to that of the KP equations.

Multidimensional integrable PDEs can support
localized solutions. Actually there exist two types 
of localized coherent structures associated with 
integrable evolution equations in two spatial vari- 
ables: the “lumps” and the “dromions.” The spectral 
meaning, and therefore the genericity of these 
solutions was established by Fokas and Ablowitz 
(1983b) and Fokas and Santini (1990). 

The analysis of integrable singular integro-differential 
equations and of integrable discrete equations, although 


conceptually similar to the analysis reviewed above, has 
certain novel features. 

The fact that integrable nonlinear equations 
appear in a wide range of physical applications is 
not an accident but a consequence of the fact that 
these equations express a certain physical coherence 
which is natural, at least asymptotically, to a variety 
of nonlinear phenomena. Indeed, Calogero (1991) 
has emphasized that large classes of nonlinear 
evolution PDEs, characterized by a dispersive linear 
part and a largely arbitrary nonlinear part, after 
rescaling yield asymptotically equations (for the 
amplitude modulation) having a universal character. 
These “universal” equations are, therefore, likely to 
appear in many physical applications. Many integr- 
able equations are precisely these “universal” models. 


Solitons, Lumps, and Dromions 


Solitons, lumps, and dromions, are important not 
because they are exact solutions, but because they 
characterize the long-time behavior of integrable 
evolution equations in one and two space dimen- 
sions. The question of solving the initial-value 
problem of a given integrable PDE, and then 
extracting the long-time behavior of the solution is 
quite complicated. It involves spectral analysis, the 
formulation of either an RH problem or of a $\bar\partial$
problem, and rigorous asymptotic techniques. On
the other hand, having established the importance of 
solitons, lumps, and dromions, it is natural to 
develop methods for obtaining these particular 
solutions directly, avoiding the difficult approaches 
of spectral theory. There exist several such direct 
methods, including the so-called Bäcklund transformations,
the dressing method of Zakharov-Shabat,
the direct linearizing method of Fokas-Ablowitz, 
and the bilinear approach of Hirota. 


Solitons 


Using the bilinear approach, multisoliton solutions
for a large class of integrable nonlinear PDEs in
one space dimension are given in Hietarinta
(2002). Here we only note that the 1-soliton
solutions of the NLS [6] (with $\lambda = -1$), of the sine-Gordon [7],
and of the modified KdV equation [8] are given,
respectively, by


$q(x, t) = \frac{p_R\, e^{i[p_I x + (p_R^2 - p_I^2)t + \eta]}}{\cosh[p_R(x - 2p_I t) + \eta]}$  [10]


$q(x, t) = 4\arctan[e^{px + qt + \eta}]$, $p^2 = 1 + q^2$  [11]

$q(x, t) = \frac{p}{\cosh[px - p^3 t + \eta]}$  [12]

where $p_R$, $p_I$, $\eta$, $p$, $q$ are real constants.
Lumps 
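The KPI equation is

$\partial_x[q_t + 6qq_x + q_{xxx}] = 3q_{yy}$  [13]
The 1-lump solution of this equation is given by

$q(x, y, t) = 2\partial_x^2 \ln\left[|L(x, y, t)|^2 + \frac{1}{4\lambda_I^2}\right]$  [14]


$L = x - 2\lambda y + 12\lambda^2 t + a$
$\lambda = \lambda_R + i\lambda_I$, $\lambda_I > 0$


where $\lambda$ and $a$ are complex constants.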
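The focusing DSII equation is


$iq_t + q_{zz} + q_{\bar z \bar z} - 2q\left(\partial_{\bar z}^{-1}\partial_z |q|^2 + \partial_z^{-1}\partial_{\bar z}|q|^2\right) = 0$  [15]


where $z = x + iy$, and the operator $\partial_{\bar z}^{-1}$ is defined by


$(\partial_{\bar z}^{-1} f)(z, \bar z) = \frac{1}{2i\pi}\iint_{\mathbb{R}^2} \frac{f(\zeta, \bar\zeta)}{\zeta - z}\, d\zeta \wedge d\bar\zeta$


The 1-lump solution of this equation is given by


$q(z, \bar z, t) = \frac{\beta\, e^{i(p^2 + \bar p^2)t + pz - \bar p \bar z}}{|z + \alpha + 2ipt|^2 + |\beta|^2}$  [16]


where $\alpha$, $\beta$, $p$ are complex constants. A typical
1-lump solution is depicted in Figure 2.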


Figure 2 A typical 1-lump solution.


Dromions 


The DSI equation is 
iq; + (a +&)q+qu = 0 
us, = 2( 8} +03 |a 


The 1-dromion solution of this equation is given by 


[17] 


Pe 
X. N 3 o 
i 18] 
X-—px-rip't, Y=qy+iqt 


lo^ = 4prqr(aB — 76) 


where $p$, $q$ are complex constants and $\alpha$, $\beta$, $\gamma$, $\delta$ are
positive constants.


A Nonlinear Fourier Transform 


The solution of the initial-value problem of an 
integrable nonlinear evolution equation on the 
infinite line is based on the spectral analysis of the 
x-part of the Lax pair. Thus, for the KdV equation 
one must analyze eqn [5a]. This equation is the 
famous time-independent Schrödinger equation. We
now give a physical interpretation of the relevant 
spectral analysis. Let KdV describe the propagation 
of a water wave and suppose that this wave is frozen 
at a given instant of time. By bombarding this water 
wave with quantum particles, one can reconstruct its 
shape from knowledge of how these particles 
scatter. In other words, the scattering data provide 
an alternative description of the wave at fixed time. 


The mathematical expression of this description
takes the form of a linear integral equation found
by Faddeev (the so-called Gel'fand-Levitan-Marchenko
equation) or equivalently the form of a 2 x 2
matrix RH problem uniquely specified by the 
scattering data. This alternative description of the 
shape of the wave will be useful if the evolution of 
the scattering data is simple. This is indeed the case, 
namely using eqn [5b], it can be shown that the 
scattering data evolve linearly. Thus, this highly 
nontrivial change of variables from the physical to 
scattering space provides a linearization of the KdV 
equation. 

In what follows we will describe some of the 
relevant mathematical formulas. We first 
“assume” that there exists a real solution q(x, t) 
of the initial-value problem which has sufficient 
smoothness and which decays for all t as |x| — oo. 
We then discuss how this assumption can be 
eliminated. 

As mentioned earlier, most of the analysis
of the inverse-scattering transform is carried out
on the x-part of the Lax pair, that is, on eqn [5a].
Hence, we first concentrate on eqn [5a] and for
convenience of notation we suppress the time
dependence.


The Direct Problem 


As $|x| \to \infty$, $q \to 0$, thus there exist solutions of eqn
[5a] which tend to $\exp[\pm ikx]$ as $|x| \to \infty$. Let
$\psi(k, x)$ and $\bar\psi(k, x)$ denote solutions of eqn [5a]
with the following asymptotic property:

$\psi \sim e^{ikx}$, $\bar\psi \sim e^{-ikx}$ as $x \to \infty$, $k \in \mathbb{R}$  [19]
Under the transformation $k \to -k$, eqn [5a] remains
invariant and the boundary condition for $\psi$ is mapped
to the boundary condition for $\bar\psi$. Hence


$\bar\psi(k, x) = \psi(-k, x)$  [20]


We denote by $\phi(k, x)$ the solution of eqn [5a] which
tends to $\exp[-ikx]$ as $x \to -\infty$,

$\phi \sim e^{-ikx}$ as $x \to -\infty$, $k \in \mathbb{R}$  [21]
It is more convenient to work with eigenfunctions
(i.e., solutions of [5a]) normalized to unity as $|x| \to \infty$,
thus we introduce $M(k, x)$ and $N(k, x)$ as follows:

$M = \phi e^{ikx}$, $N = \psi e^{-ikx}$  [22]

The functions $M$ and $N$ can be expressed in terms of
$q$ through the solution of linear Volterra integral
equations. Indeed, $M$ satisfies


$M_{xx} - 2ikM_x = -qM$, $k \in \mathbb{R}$
$M \to 1$, $x \to -\infty$  [23]


The homogeneous version of [23] has solutions 1
and $e^{2ikx}$. Thus,


$M = c_1 + c_2 e^{2ikx} + M_p$  [24]
where $c_1$, $c_2$ are constants and $M_p$ is given by
$M_p = u_1(x) + u_2(x)e^{2ikx}$  [25]


The functions $u_1$, $u_2$ satisfy


$u_1' + e^{2ikx}u_2' = 0$, $2ik\,e^{2ikx}u_2' = -qM$


Thus,


$u_1(x) = \frac{1}{2ik}\int_{-\infty}^{x} d\xi\, q(\xi)M(k, \xi)$,
$u_2(x) = -\frac{1}{2ik}\int_{-\infty}^{x} d\xi\, e^{-2ik\xi}q(\xi)M(k, \xi)$  [26]

Substituting [25] and [26] into [24] and using the
boundary condition [23], we find


$M(k, x) = 1 + \frac{1}{2ik}\int_{-\infty}^{x} d\xi\,\left(1 - e^{2ik(x - \xi)}\right)q(\xi)M(k, \xi)$  [27]


Similarly, one may establish that $N$ satisfies

$N(k, x) = 1 + \frac{1}{2ik}\int_{x}^{\infty} d\xi\,\left(1 - e^{2ik(\xi - x)}\right)q(\xi)N(k, \xi)$  [28]


The kernel of eqn [27], as a function of $k$, is
bounded and analytic for $\mathrm{Im}\,k > 0$. Thus, if $q \in L_1$,
$M(k, x)$ as a function of $k$ is holomorphic for
$\mathrm{Im}\,k > 0$. Similarly, $N(k, x)$ as a function of $k$ is
holomorphic for $\mathrm{Im}\,k > 0$.

Thus, we have found particular solutions of eqn
[5a] which are holomorphic for $\mathrm{Im}\,k > 0$. Furthermore,
these solutions are simply related for $k$ real.
Indeed, the linear independence of solutions of the
second-order ODE [5a] implies

$\phi(k, x) = a(k)\bar\psi(k, x) + b(k)\psi(k, x)$, $k \in \mathbb{R}$
Using [20] and replacing $\phi$ and $\psi$ in terms of $M$ and
$N$, we find


$\frac{M(k, x)}{a(k)} = N(-k, x) + \rho(k)e^{2ikx}N(k, x)$, $\rho(k) = \frac{b(k)}{a(k)}$, $k \in \mathbb{R}$  [29]
The functions $a(k)$ and $b(k)$ are given by


$a(k) = 1 + \frac{1}{2ik}\int_{-\infty}^{\infty} d\xi\, q(\xi)M(k, \xi)$, $k \in \mathbb{R}$
$b(k) = -\frac{1}{2ik}\int_{-\infty}^{\infty} d\xi\, q(\xi)M(k, \xi)e^{-2ik\xi}$, $k \in \mathbb{R}$  [30]


Indeed as $x \to \infty$, $N \to 1$, thus, eqn [29] implies


$M \to a(k) + b(k)e^{2ikx}$ as $x \to \infty$  [31]


On the other hand, eqn [27] implies that


$M \to 1 + \frac{1}{2ik}\int_{-\infty}^{\infty} d\xi\,\left(1 - e^{2ik(x - \xi)}\right)q(\xi)M(k, \xi)$, $x \to \infty$  [32]


Comparing eqns [31] and [32], we find eqns [30].
The expression for $a(k)$ implies that this function
is also holomorphic for $\mathrm{Im}\,k > 0$.
In summary, in the “direct problem,” we have
found particular solutions of eqn [5a] which are
sectionally holomorphic:


$\frac{M(k, x)}{a(k)}$ and $N(-k, x)$

are holomorphic for $\mathrm{Im}\,k > 0$ and $\mathrm{Im}\,k < 0$, respectively.
These solutions, which are characterized in
terms of $q$ by eqns [27] and [28], are simply related
by eqn [29].


The Inverse Problem 


Equation [28] expresses $N$ in terms of $q$. Is it possible
to find an alternative expression for $N$ in terms of
some appropriate “spectral data”? The answer is
positive and is a direct consequence of the fact that
eqn [29] defines the “jump condition” of an RH
problem. Indeed, it can be shown that $a(k)$ may have
simple zeros $k_1, \ldots, k_n$ on the positive imaginary axis
of the complex $k$-plane, $k_j = ip_j$. Hence, in general, $M/a$ can
be expressed in the form


$\frac{M(k, x)}{a(k)} = \mathcal{M}(k, x) + \sum_{j=1}^{n}\frac{A_j(x)}{k - ip_j}$, $p_j > 0$


where $\mathcal{M}(k, x)$ as a function of $k$ is holomorphic for
$\mathrm{Im}\,k > 0$. It can also be shown that
$A_j(x) = C_j \exp[-2p_j x]N(ip_j, x)$. Hence eqn [29] becomes


$\mathcal{M}(k, x) - N(-k, x) = -\sum_{j=1}^{n}\frac{C_j e^{-2p_j x}N(ip_j, x)}{k - ip_j} + \rho(k)e^{2ikx}N(k, x)$, $k \in \mathbb{R}$


Taking the (−) projection of this equation, and
using the fact that both $\mathcal{M}$ and $N$ tend to 1 as $k \to \infty$,
we find


$N(k, x) - \frac{1}{2i\pi}\int_{-\infty}^{\infty} \frac{dl\,\rho(l)e^{2ilx}N(l, x)}{l + k + i0} = 1 - \sum_{j=1}^{n}\frac{C_j e^{-2p_j x}N(ip_j, x)}{k + ip_j}$  [33]


In summary, this equation expresses $N(k, x)$ in
terms of the scattering data $(\rho(k), \{C_j, p_j\}_1^n)$.

Since both eqns [28] and [33] are associated with
the same $q$, these equations can be used to obtain
the following expression for $q$:


$q(x) = -\frac{\partial}{\partial x}\left[\frac{1}{\pi}\int_{-\infty}^{\infty} dl\,\rho(l)e^{2ilx}N(l, x) - 2i\sum_{j=1}^{n} C_j e^{-2p_j x}N(ip_j, x)\right]$  [34]


Indeed, eqn [28] implies


$\lim_{k \to \infty} k\,\bigl(N(k, x) - 1\bigr) = \frac{1}{2i}\int_{x}^{\infty} d\xi\, q(\xi)$


Comparing this expression with the large-$k$ behavior
of eqn [33], we find [34].
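
The simplest illustration of the inverse problem is the reflectionless case. The sketch below assumes $\rho(k) = 0$, a single zero $k_1 = ip$ with $p > 0$, and a norming constant written as $C_1 = 2ip\gamma$ with $\gamma > 0$ (the parametrization by $\gamma$ and the shift $\delta$ are introduced here for convenience). It also assumes eqns [33] and [34] in the form $N(k,x) = 1 - C_1 e^{-2px}N(ip,x)/(k + ip)$ and $q = 2i\,\partial_x[C_1 e^{-2px}N(ip,x)]$, which is what they reduce to under these hypotheses.

```latex
% Evaluate the reduced [33] at k = ip and solve for N(ip, x):
N(ip, x) = \frac{1}{1 + \gamma e^{-2px}} .
% Insert into the reduced [34]:
q(x) = -4p\,\frac{\partial}{\partial x}\,
       \frac{\gamma e^{-2px}}{1 + \gamma e^{-2px}}
     = \frac{8p^2\,\gamma e^{-2px}}{\bigl(1 + \gamma e^{-2px}\bigr)^2}
     = \frac{2p^2}{\cosh^2(px - \delta)} ,
\qquad \gamma = e^{2\delta} .
```

This is a soliton profile of amplitude $2p^2$, that is, the 1-soliton solution with its parameter replaced by $2p$: a single zero of $a(k)$ on the positive imaginary axis corresponds to exactly one soliton.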


Time Dependence of the Scattering Data 


We now use eqn [5b] to compute the time
dependence of the scattering data. Evaluating
eqn [5b] as $x \to -\infty$ we find $\nu = 4ik^3$. Then,
evaluating it as $x \to \infty$ and using


$\phi \sim a\,e^{-ikx} + b\,e^{ikx}$, $x \to \infty$


we find
$a_t = 0$, $b_t = 8ik^3 b$


Hence,


$a(t, k) = a(0, k)$, $\rho(t, k) = \rho(0, k)e^{8ik^3 t}$  [35]


Thus,


$p_j(t) = p_j(0)$, $C_j(t) = C_j(0)e^{8p_j^3 t}$  [36]
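
The evolution of the scattering data can be traced through the two limits explicitly. The sketch below assumes the $t$-part of the Lax pair in the form $\phi_t + (2q - 4k^2)\phi_x - (q_x + \nu)\phi = 0$, the normalization $\phi \sim e^{-ikx}$ as $x \to -\infty$ for all $t$, and $q, q_x \to 0$ as $|x| \to \infty$.

```latex
% x -> -infinity: \phi = e^{-ikx}, \phi_t = 0, q = q_x = 0
-4k^2(-ik) - \nu = 0
\quad\Longrightarrow\quad \nu = 4ik^3 .
% x -> +infinity: \phi = a e^{-ikx} + b e^{ikx}, q = q_x = 0
\bigl(a_t + 4ik^3 a - \nu a\bigr)e^{-ikx}
 + \bigl(b_t - 4ik^3 b - \nu b\bigr)e^{ikx} = 0
\quad\Longrightarrow\quad
a_t = 0, \qquad b_t = 8ik^3 b .
```

Setting $k = ip_j$ in the exponent $8ik^3 t$ gives $8p_j^3 t$, which is the growth rate of the norming constants $C_j$.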


The above formal results motivate the following
definitions (for simplicity, we assume that $a(k)$
has no zeros). Given a decaying real function
$q_0(x)$, $x \in \mathbb{R}$, define $M_0(k, x)$ as the solution of the
linear Volterra integral equation


$M_0(k, x) = 1 + \frac{1}{2ik}\int_{-\infty}^{x} d\xi\,\left(1 - e^{2ik(x - \xi)}\right)q_0(\xi)M_0(k, \xi)$, $\mathrm{Im}\,k \geq 0$


Given $M_0(k, x)$, define $a_0(k)$ and $b_0(k)$ by


$M_0(k, x) \to a_0(k) + b_0(k)e^{2ikx}$, $x \to \infty$, $k \in \mathbb{R}$

Given $a_0$ and $b_0$, define $N(k, x, t)$ by the solution of
the linear integral equation


$N(k, x, t) - \frac{1}{2i\pi}\int_{-\infty}^{\infty} dl\,\frac{b_0(l)}{a_0(l)}\,\frac{e^{2ilx + 8il^3 t}N(l, x, t)}{l + k + i0} = 1$


A theorem of Gohberg and Krein implies that this
equation has a unique global solution. Given
$a_0$, $b_0$, $N$, define $q(x, t)$ by


$q(x, t) = -\frac{1}{\pi}\frac{\partial}{\partial x}\int_{-\infty}^{\infty} dk\,\frac{b_0(k)}{a_0(k)}\,e^{2ikx + 8ik^3 t}N(k, x, t)$


Then it can be shown that $q(x, t)$ satisfies the KdV
equation and $q(x, 0) = q_0(x)$.


A Unification 


After the emergence of a method for solving the 
initial-value problem for nonlinear integrable evolu- 
tion equations in one and two space variables, the 
most outstanding open problem in the analysis of 
these equations became the solution of initial 
boundary-value problems. A general approach for 
solving such problems for evolution equations in one 
space dimension was provided by Fokas (1997). 
This approach has already been used for the study of 
nonlinear integrable evolution PDEs on the half-line 
(Fokas 2002, 2005), on the interval, and in a time- 
dependent domain. An important advantage of this 
new method is that it yields the formulation of a
matrix RH problem (or a $\bar\partial$ problem in the case of a
convex time-dependent domain), which, although it has
more complicated jump matrices than the analogous
problem on the infinite line, still has an explicit
exponential $(x, t)$ dependence. This fact allows one to
describe effectively the asymptotic properties of the 
solution, using the powerful Deift-Zhou method 
(Deift and Zhou 1993). For example, the long-time 
asymptotics of boundary-value problems on the half 
line are discussed in Fokas and Its (1996). 

It is remarkable that the above results have motivated the discovery of
a new method for solving boundary-value problems, not only for linear
evolution PDEs, but also for linear elliptic PDEs in two dimensions.
This includes the Laplace, the biharmonic
and the Helmholtz equations in a convex polygon 
(Dassios and Fokas 2005). In a most recent develop- 
ment, this method has also been applied to certain 
classes of linear PDEs with variable coefficients. This 
highly unexpected development unifies and extends 
several classical branches of mathematics. In particu- 
lar, it unifies the classical transform methods for 
simple linear PDEs as well as the method of images, 
the treatment of linear PDEs via certain ingenious 
techniques such as the Wiener-Hopf technique, the 
formulation of Ehrenpreis type integral representa- 
tions, and the solution of integrable nonlinear PDEs 
via the inverse-scattering transform. Furthermore, it 
extends these results to arbitrary domains and to 
certain classes of PDEs with variable coefficients. 

Regarding linear equations we note the following: 

Almost as soon as linear two-dimensional PDEs 
made their appearance, d'Alembert and Euler discov- 
ered a general approach for constructing large classes 
of their solutions. This approach involved separating 
variables and superimposing solutions of the resulting 
ODEs. The method of separation of variables natu- 
rally led to the solution of PDEs by a transform pair. 
The prototypical such pair is the direct and the inverse 
Fourier transforms; variations of this fundamental 
transform include the Laplace, Mellin, sine, cosine 
transforms, and their discrete analogs. 
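
As a concrete instance of such a transform pair, consider the heat equation $q_t = q_{xx}$ on the whole line with a decaying initial datum $q_0(x)$. This equation and the notation $\hat q_0$ are chosen here purely for illustration. Separating variables gives the exponential solutions $e^{ikx - k^2 t}$, and superimposing them yields the direct and inverse Fourier transforms:

```latex
\begin{align*}
\hat q_0(k) &= \int_{-\infty}^{\infty} e^{-ikx}\, q_0(x)\, dx, \qquad k \in \mathbb{R}, \\
q(x,t) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ikx - k^2 t}\, \hat q_0(k)\, dk .
\end{align*}
```

Each integrand $e^{ikx - k^2 t}$ satisfies $q_t = q_{xx}$, and setting $t = 0$ in the second formula recovers $q_0(x)$ by Fourier inversion.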

The proper transform for a given boundary-value 
problem is specified by the PDE, by the domain, and 
by the given boundary conditions. For some simple 
boundary-value problems, there exists an algorithmic 
procedure for deriving the associated transform. This 
procedure involves constructing the Green's function 
of a single eigenvalue equation, and integrating this 
Green's function in the k-complex plane, where 
k denotes the eigenvalue. 

The transform method has been enormously 
successful for solving a great variety of initial- and 
boundary-value problems. However, for sufficiently 
complicated problems the classical transform method 
fails. For example, there does not exist a proper analog 
of the sine transform for solving a third-order evolution 
equation on the half-line. Similarly, there do not exist 
proper transforms for solving boundary-value pro- 
blems for elliptic equations even of second order and in 
simple domains. The failure of the transform method 
led to the development of several ingenious but 
ad hoc techniques, which include: conformal mappings 
for the Laplace and the biharmonic equations; the 
Jones method and the formulation of the Wiener-Hopf 
factorization problem; the use of some integral
representation, such as that of Sommerfeld; the
formulation of a difference equation, such as the
Malyuzhinets equation. The use of these techniques
has led to the solution of several classical problems in 
acoustics, diffraction, electromagnetism, fluid 
mechanics, etc. The Wiener-Hopf technique played a 
central role in the solution of many of these problems. 

A crucial role in the new method is played by the 
global equation satisfied by the boundary values of q 
and of its derivatives. For evolution equations and for 
elliptic equations with simple boundary conditions, this 
involves the solution of a system of algebraic equations, 
while for elliptic equations with arbitrary boundary 
conditions, it involves the solution of an RH problem. 
For simple polygons, this RH problem is formulated on 
the infinite line, thus it is equivalent to a Wiener-Hopf 
problem. This explains the central role played by the 
Wiener-Hopf technique in many earlier works. 

For linear PDEs, the explicit $x_1, x_2$ dependence of
$q(x_1, x_2)$ is consistent with the Ehrenpreis formulation
of the solution. Thus, this method provides the 
concrete implementation as well as the generalization 
to concave domains of this fundamental principle. For 
nonlinear equations, it provides the extension of the 
Ehrenpreis principle to integrable nonlinear PDEs. 


See also: Boundary Value Problems for Integrable
Equations; $\bar\partial$-Approach to Integrable Systems; Integrable
Systems and Algebraic Geometry; Integrable Discrete
Systems; Integrable Systems and Discrete Geometry;
Integrable Systems in Random Matrix Theory; Integrable
Systems: Overview; Korteweg-de Vries Equation and
Other Modulation Equations; Partial Differential
Equations: Some Examples; Riemann-Hilbert Methods in
Integrable Systems; Sine-Gordon Equation; Toda
Lattices; Twistor Theory: Some Applications [in Integrable
Systems, Complex Geometry and String Theory].


Random Matrix Models


A random matrix model is a probability space
$(\Omega, P, \mathcal{F})$ where the sample space $\Omega$ is a set of
matrices. There are three classic finite $N$ random
matrix models (see, e.g., Mehta (1991)):


1. Gaussian orthogonal ensemble ($\beta = 1$):

   (a) $\Omega$ = $N \times N$ real symmetric matrices;

   (b) $P$ = "unique" measure that is invariant under
   orthogonal transformations and such that the matrix
   elements are i.i.d. random variables; explicitly,
   the density is

   \[ c_N \exp(-\mathrm{tr}(A^2))\, dA \qquad [1] \]

   where $c_N$ is a normalization constant and
   $dA = \prod_i dA_{ii} \prod_{i<j} dA_{ij}$, the product Lebesgue
   measure on the independent matrix elements.

2. Gaussian unitary ensemble ($\beta = 2$):

   (a) $\Omega$ = $N \times N$ Hermitian matrices;

   (b) $P$ = "unique" measure that is invariant
   under unitary transformations and such that the
   (independent) real and imaginary matrix elements
   are i.i.d. random variables; and

3. Gaussian symplectic ensemble ($\beta = 4$) (see Mehta
   (1991) for a definition).


Generally speaking, the interest lies in the
$N \to \infty$ limit of these models. Here we concentrate
on one aspect of this limit. In all three models the
eigenvalues, which are random variables, are real
and with probability 1 they are distinct. If $\lambda_{\max}(A)$
denotes the largest eigenvalue of the random
matrix $A$, then for each of the three Gaussian
ensembles we introduce the corresponding distribution function

\[ F_{N,\beta}(t) := P_\beta(\lambda_{\max} < t), \qquad \beta = 1, 2, 4 \]


Integrable Systems in Random Matrix Theory 


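The basic limit laws (see Tracy and Widom
(1996) and references therein) state that

\[ F_\beta(s) := \lim_{N \to \infty} F_{N,\beta}\left(2\sigma\sqrt{N} + \frac{\sigma s}{N^{1/6}}\right), \qquad \beta = 1, 2, 4 \qquad [2] \]

exist and are given explicitly by

\[ F_2(s) = \det(I - K_{\mathrm{Airy}}) = \exp\left(-\int_s^\infty (x - s)\, q^2(x)\, dx\right) \]

where

\[ K_{\mathrm{Airy}} \doteq \frac{\mathrm{Ai}(x)\,\mathrm{Ai}'(y) - \mathrm{Ai}'(x)\,\mathrm{Ai}(y)}{x - y} \quad \text{acting on } L^2(s, \infty) \text{ (Airy kernel)} \]

and $q$ is the unique solution to the Painlevé II
equation

\[ q'' = s q + 2 q^3 \]

satisfying the condition

\[ q(s) \sim \mathrm{Ai}(s) \quad \text{as } s \to \infty \]

$\sigma$ in eqn [2] is the standard deviation of the
Gaussian distribution on the off-diagonal matrix
elements. For the normalization we have chosen
$\sigma = 1/\sqrt{2}$; however, for subsequent comparisons, the
normalization $\sigma = \sqrt{N}$ is perhaps more natural.

The orthogonal and symplectic distribution functions are

\[ F_1(s) = \exp\left(-\frac{1}{2}\int_s^\infty q(x)\, dx\right) (F_2(s))^{1/2} \]

\[ F_4(s/\sqrt{2}) = \cosh\left(\frac{1}{2}\int_s^\infty q(x)\, dx\right) (F_2(s))^{1/2} \]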

Graphs of the densities $dF_\beta/ds$ are in the adjacent
figure and some statistics of $F_\beta$ can be found in
Figure 1.

The Airy kernel is an example of an integrable 
integral operator and a general theory is developed in 
Tracy and Widom (1994). A vertex operator approach 
to these distributions (and many other closely related 
distribution functions in random matrix theory) was 
initiated by Adler, Shiota, and van Moerbeke (see the 
review article van Moerbeke (2001) for further
developments of this latter approach). 

Historically, the discovery of the connection
between Painlevé functions ($P_{III}$ in this case) and
Toeplitz/Fredholm determinants appears in work
of Wu et al. (1976) on the spin-spin correlation
functions of the two-dimensional Ising model. Painlevé
functions first appear in random matrix theory in
Jimbo et al. (1980) where they prove that the Fredholm
determinant of the sine kernel is expressible in terms of
$P_V$. Gaudin (using Mehta's then newly invented
method of orthogonal polynomials (Porter 1965))
was the first to discover the connection between
random matrix theory and Fredholm determinants.


[Figure: Probability densities]


$\beta$    $\mu_\beta$    $\sigma_\beta$    $S_\beta$    $K_\beta$

1    -1.20653    1.2680    0.293    0.165
2    -1.77109    0.9018    0.224    0.093
4    -2.30688    0.7195    0.166    0.050

Figure 1 The mean ($\mu_\beta$), standard deviation ($\sigma_\beta$), skewness
($S_\beta$), and kurtosis ($K_\beta$) of $F_\beta$.


Universality Theorems 


A natural question is whether the above limit
laws depend upon the underlying Gaussian assumption
on the probability measure. To investigate this for
unitarily invariant measures ($\beta = 2$), one replaces in [1]

\[ \exp(-\mathrm{tr}(A^2)) \to \exp(-\mathrm{tr}(V(A))) \]

Bleher and Its (1999) choose


\[ V(A) = g A^4 - t A^2, \qquad g > 0 \]


and subsequently a large class of potentials $V$ was
analyzed by Deift et al. (1999). These analyses
require proving new Plancherel-Rotach type formulas
for nonclassical orthogonal polynomials. The
proofs use Riemann-Hilbert methods. It was shown
that the generic behavior is GUE; hence, the limit
law for the largest eigenvalue is $F_2$. However, by
finely tuning the potential, new universality classes
will emerge at the edge of the spectrum. For $\beta = 1, 4$
a universality theorem was proved by Stojanovic
(2000) for the quartic potential.

In the case of noninvariant measures, Soshnikov
(1999) proved that for real symmetric Wigner matrices
(complex Hermitian Wigner matrices), the limiting
distribution of the largest eigenvalue is $F_1$ (respectively,
$F_2$). (A symmetric Wigner matrix is a random matrix
whose entries on and above the main diagonal are
independent and identically distributed random variables
with distribution function $F$. Soshnikov assumes
that $F$ is even and all moments are finite.) The
significance of this result is that non-Gaussian Wigner
measures lie outside the "integrable class" (e.g., there
are no Fredholm determinant representations for the
distribution functions) yet the limit laws are the same as
in the integrable cases.


Appearance of $F_\beta$ in Limit Theorems


In this section we briefly survey the appearances of
the limit laws $F_\beta$ in widely differing areas.


Combinatorics 


A major breakthrough occurred with the work of
Baik, Deift, and Johansson (see Baik et al. (2000) and
references therein) when they proved that the limiting
distribution of the length of the longest increasing
subsequence in a random permutation is $F_2$. Precisely,
if $\ell_N(\sigma)$ is the length of the longest increasing
subsequence in the permutation $\sigma \in S_N$, then

\[ P\left(\frac{\ell_N - 2\sqrt{N}}{N^{1/6}} \le s\right) \to F_2(s) \]

as $N \to \infty$. Here the probability measure on the
permutation group $S_N$ is the uniform measure.
Further discussion of this result can be found in
Johansson (2000b).
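
The scaled statistic in this limit law is straightforward to sample. The following is a minimal sketch, assuming the standard patience-sorting algorithm for the longest increasing subsequence; the function names, the permutation size and the seed are illustrative choices, not taken from the works cited.

```python
import bisect
import math
import random


def longest_increasing_subsequence_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    # pile_tops[i] is the smallest possible last element of an
    # increasing subsequence of length i + 1 seen so far.
    pile_tops = []
    for value in seq:
        i = bisect.bisect_left(pile_tops, value)
        if i == len(pile_tops):
            pile_tops.append(value)  # start a new pile
        else:
            pile_tops[i] = value     # leftmost pile whose top is >= value
    return len(pile_tops)


def scaled_lis_sample(n, rng):
    """Draw a uniform permutation of size n; return (l_N - 2 sqrt(N)) / N^(1/6)."""
    perm = list(range(n))
    rng.shuffle(perm)
    length = longest_increasing_subsequence_length(perm)
    return (length - 2.0 * math.sqrt(n)) / n ** (1.0 / 6.0)


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
    samples = [scaled_lis_sample(2000, rng) for _ in range(200)]
    print(sum(samples) / len(samples))  # sample mean of the scaled statistic
```

For large N the sample mean is expected to approach the mean of $F_2$, although finite-size corrections are noticeable at moderate N.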

Baik and Rains (2001) showed by restricting the set
of permutations (and these restrictions have natural
symmetry interpretations) that $F_1$ and $F_4$ also appear.
Even the distributions $F_1^2$ and $F_2^2$ (Tracy and Widom
1999) arise. By the Robinson-Schensted-Knuth correspondence,
the Baik-Deift-Johansson result is equivalent
to the limiting distribution on the number of boxes
in the first row of random standard Young tableaux.
(The measure is the push-forward of the uniform
measure on $S_N$.) These same authors conjectured that
the limiting distributions of the number of boxes in the
second, third, etc., rows were the same as the limiting
distributions of the next-largest, next-next-largest,
etc., eigenvalues in GUE. Since these eigenvalue
distributions were also found in Tracy and Widom
(1996), they were able to compare the then unpublished
numerical work of Odlyzko and Rains (2000)
with the predicted results of random matrix theory.
Subsequently, Baik et al. (2000) proved the conjecture
for the second row. The full conjecture was proved by
Okounkov (2000) using topological methods and by,
among others, Johansson (2001) using analytical
methods. For an interpretation of the Baik-Deift-Johansson
result in terms of the card game patience
sorting, see the very readable review paper by Aldous
and Diaconis (1999).


Growth Processes 


Growth processes have an extensive history both in
the probability literature and the physics literature
(see, e.g., Meakin (1998) and references therein), but
it was only recently that Johansson (2002b) proved
that the fluctuations about the limiting shape in a
certain growth model ("corner growth model") are
$F_2$. Johansson further pointed out that certain
symmetry constraints (inspired from the Baik and
Rains (2001) work) lead to $F_1$ fluctuations (see
Growth Processes in Random Matrix Theory).

Subsequently, Baik and Rains (2000) and Gravner
et al. (2002) have shown the same distribution
functions appearing in closely related lattice growth
models. Prähofer and Spohn (2000) reinterpreted the
work of Baik et al. in terms of the physicists'
polynuclear growth (PNG) model, thereby clarifying the role
of the symmetry parameter $\beta$. For example, $\beta = 2$
describes growth from a single droplet, whereas $\beta = 1$
describes growth from a flat substrate. They also
related the distribution functions $F_\beta$ to fluctuations of
the height function in the KPZ equation (Kardar et al.
1986, Meakin 1998). (The connection with the KPZ
equation is heuristic.) Thus, one expects on physical
grounds that the fluctuations of any growth process
falling into the 1+1 KPZ universality class will be
described by the distribution functions $F_\beta$ or one of the
generalizations by Baik and Rains (2000). Such a
physical conjecture can be tested experimentally.
Earlier, Myllys et al. established experimentally that a slow,
flameless burning process in a random medium (paper!)
is in the 1+1 KPZ universality class. This sequence of
events is a rare instance in which new results in
mathematics inspire new experiments in physics.

In the context of the PNG model, Prähofer and
Spohn have given a process interpretation, the Airy
process, of $F_2$.

There is an extension of the growth model in
Gravner et al. (2002) to growth in a random
environment. In Gravner et al. (2002) the following
model of interface growth in two dimensions is
considered by introducing a height function on the
sites of a one-dimensional integer lattice with the
following update rule: the height above the site $x$
increases to the height above $x - 1$, if the latter
height is larger; otherwise, the height above $x$
increases by 1 with probability $p_x$. It is assumed
that the $p_x$ are chosen independently at random with
a common distribution function $F$, and that the initial
state is such that the origin is far above the other sites.
In the pure regime, Gravner-Tracy-Widom identify
an asymptotic shape and prove that the fluctuations
about that shape, normalized by the square root of
the time, are asymptotically normal. This contrasts
with the quenched version: conditioned on the
environment and normalized by the cube root of
time, the fluctuations almost surely approach the
distribution function $F_2$. We mention that these same
authors find, under some conditions on $F$ at the right
edge, a composite regime where now the interface
fluctuations are governed by the extremal statistics of
$p_x$ in the annealed case while the fluctuations are
asymptotically normal in the quenched case.
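
The update rule just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: the lattice is truncated to finitely many sites, all sites are updated simultaneously from the previous configuration, and site 0 is treated as having no left neighbour. The name `growth_step` is hypothetical.

```python
import random


def growth_step(heights, probs, rng):
    """One synchronous update of the height function on sites 0 .. len(heights) - 1.

    heights[x] is the current height above site x and probs[x] is p_x.
    Assumption: site 0 has no left neighbour on this truncated lattice.
    """
    new_heights = []
    for x, h in enumerate(heights):
        left = heights[x - 1] if x > 0 else None
        if left is not None and left > h:
            new_heights.append(left)    # copy the larger height above x - 1
        elif rng.random() < probs[x]:
            new_heights.append(h + 1)   # grow by 1 with probability p_x
        else:
            new_heights.append(h)       # otherwise stay put
    return new_heights


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary fixed seed
    p = [rng.random() for _ in range(20)]  # one sample of the environment p_x
    h = [50] + [0] * 19                    # origin far above the other sites
    for _ in range(30):
        h = growth_step(h, p, rng)
    print(h)
```

Holding the list `p` fixed across many runs corresponds to conditioning on the environment; redrawing it for every run averages over environments.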


Random Tilings 


The Aztec diamond of order $n$ is a tiling by dominoes of
the lattice squares $[m, m+1] \times [\ell, \ell+1]$, $m, \ell \in \mathbb{Z}$,
that lie inside the region $\{(x, y) : |x| + |y| \le n + 1\}$. A
domino is a closed $1 \times 2$ or $2 \times 1$ rectangle in $\mathbb{R}^2$ with
corners in $\mathbb{Z}^2$. A typical tiling is shown in Figure 2. One
observes that near the center the tiling appears random,
called the temperate zone, whereas near the edges the
tiling is frozen, called the polar zones. As $n \to \infty$ the
boundary between the temperate zone and the polar
zones (appropriately scaled) converges to a circle
("arctic circle theorem"). Johansson (2002a) proved
that the fluctuations about this limiting circle are $F_2$.
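
The region being tiled can be enumerated directly. The sketch below lists the unit lattice squares lying inside $|x| + |y| \le n + 1$ and checks that there are $2n(n+1)$ of them, so that any domino tiling uses $n(n+1)$ dominoes. The function name is illustrative, and the code does not generate random tilings.

```python
def aztec_squares(n):
    """Lower-left corners (m, l) of the unit squares [m, m+1] x [l, l+1]
    contained in the region |x| + |y| <= n + 1."""
    squares = []
    for m in range(-(n + 1), n + 1):
        for l in range(-(n + 1), n + 1):
            corners = [(m, l), (m + 1, l), (m, l + 1), (m + 1, l + 1)]
            # The region is convex, so a square lies inside it
            # exactly when all four of its corners do.
            if all(abs(x) + abs(y) <= n + 1 for x, y in corners):
                squares.append((m, l))
    return squares


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, len(aztec_squares(n)))
```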


Statistics 


Johnstone (2001) considers the largest principal
component of the covariance matrix $X^{T} X$ where $X$
is an $n \times p$ data matrix all of whose entries are
independent standard Gaussian variables and proves
that for appropriate centering and scaling, the
limiting distribution equals $F_1$ in the limit $n, p \to \infty$
with $n/p \to \gamma \in \mathbb{R}^+$. Soshnikov has removed the
Gaussian assumption but requires that
$n - p = O(p^{1/3})$. Thus, we can anticipate applications of
the distributions $F_\beta$ (and particularly $F_1$) to the
statistical analysis of large data sets.


Figure 2 Random tilings. 


Queuing Theory 


Glynn and Whitt (1991) consider a series of $n$ single-server
queues each with unlimited waiting space
with a first-in and first-out service. Service times are
i.i.d. with mean one and variance $\sigma^2$ with distribution
$V$. The quantity of interest is $D(k, n)$, the
departure time of customer $k$ (the last customer to
be served) from the last queue $n$. For a fixed number
of customers, $k$, they prove that

\[ \frac{D(k, n) - n}{\sigma\sqrt{n}} \]

converges in distribution to a certain functional $\hat{D}_k$
of $k$-dimensional Brownian motion. They show that
$\hat{D}_k$ is independent of the service time distribution $V$.
It was shown in, for example, Gravner et al. (2002)
that $\hat{D}_k$ is equal in distribution to the largest
eigenvalue of a $k \times k$ GUE random matrix. This
fascinating connection has been greatly clarified in
recent work of O'Connell and Yor (2002).
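
The departure time $D(k, n)$ can be computed by a simple recursion. The sketch below assumes the usual setup for queues in series, which is not spelled out above: all $k$ customers wait at the first queue at time 0, and customer $j$ can start service at queue $i$ only after leaving queue $i - 1$ and after customer $j - 1$ has left queue $i$. The function name and the exponential service times in the demo are illustrative.

```python
import random


def departure_time(service):
    """Departure time D(k, n) of the last customer from the last queue.

    service[j][i] is the service time of customer j at queue i (0-based),
    with k = len(service) customers and n = len(service[0]) queues.
    Assumed recursion: D(j, i) = max(D(j-1, i), D(j, i-1)) + service time,
    with D(0, i) = D(j, 0) = 0.
    """
    k, n = len(service), len(service[0])
    D = [[0.0] * (n + 1) for _ in range(k + 1)]
    for j in range(1, k + 1):
        for i in range(1, n + 1):
            D[j][i] = max(D[j - 1][i], D[j][i - 1]) + service[j - 1][i - 1]
    return D[k][n]


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary fixed seed
    k, n = 3, 400
    # Exponential service times with mean one (so sigma = 1): an illustrative choice.
    times = [[rng.expovariate(1.0) for _ in range(n)] for _ in range(k)]
    print((departure_time(times) - n) / n ** 0.5)  # one sample of the scaled statistic
```

With unit service times every admissible path has length $k + n - 1$, which gives a quick sanity check of the recursion.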

From Johansson (2002), it follows for $V$ Poisson that

\[ P\left(\frac{D(\lfloor xn \rfloor, n) - c_1 n}{c_2\, n^{1/3}} < s\right) \to F_2(s) \]

as $n \to \infty$ for some explicitly known constants $c_1$
and $c_2$ (depending upon $x$).


Superconductors 


Vavilov et al. (2001) have conjectured (based upon
certain physical assumptions supported by numerical
work) that the fluctuation of the excitation gap
in a metal grain or quantum dot induced by the
proximity to a superconductor is described by $F_1$ for
zero magnetic field and by $F_2$ for nonzero magnetic
field. They conclude their paper with the remark:


The universality of our prediction should offer ample
opportunities for experimental observation.


Introduction


This section introduces some elementary notions 
and sets the (mathematically low brow) tone of this 
presentation. 

A dynamical system is characterized by an evolution
equation the general structure of which reads

\[ Q_t = F \qquad [1] \]

Here $Q \equiv Q(x, t)$ is the dependent variable, and it
might be a scalar, a vector, a matrix, you name it.
The focus of interest is on its evolution as function
of the (real, scalar) "time" variable $t$. The a priori
unknown quantity $Q$ might moreover depend on
another independent "space" variable (scalar or
vector) $x$, $Q \equiv Q(x, t)$. The appended variable $t$ in
the left-hand side of the above equation denotes
partial differentiation, and this notation will be used
throughout, although when $t$ is the only independent
variable differentiation with respect to it might be
instead denoted by a superimposed dot:

\[ Q_t \equiv \dot{Q} \]

The quantity in the right-hand side of the evolution
equation (1), which has of course the same (scalar,
vector, matrix) character as $Q$, is an assigned
function of $t$, $x$ and $Q$, $F \equiv F(x, t, Q)$ (more generally,
its dependence on $Q$ might be functional, see
below). A typical example of the dynamical systems
we shall consider is the N-body problem characterized
by the Newtonian equations of motion


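\[ \ddot{q}_n = -\omega^2 q_n + 2 g^2 \sum_{m=1,\, m \ne n}^{N} (q_n - q_m)^{-3}, \qquad n = 1, 2, \ldots, N \qquad [2] \]

where the dependent variable is the N-vector $q \equiv
(q_1, \ldots, q_N)$, the components of which are the "particle
coordinates" $q_n \equiv q_n(t)$. Note however that these
equations of motion are of second order in time
(contrary to (1)); but they can of course be reformulated
as first-order ODEs. Indeed, their Hamiltonian version,
derived in the standard manner from the Hamiltonian

\[ H(p, q) = \frac{1}{2} \sum_{n=1}^{N} \left(p_n^2 + \omega^2 q_n^2\right) + \frac{1}{2} \sum_{m,n=1,\, m \ne n}^{N} g^2 (q_n - q_m)^{-2} \qquad [3a] \]

reads

\[ \dot{q}_n = p_n \qquad [3b] \]

\[ \dot{p}_n = -\omega^2 q_n + 2 g^2 \sum_{m=1,\, m \ne n}^{N} (q_n - q_m)^{-3}, \qquad n = 1, 2, \ldots, N \qquad [3c] \]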
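
The first-order form is convenient for numerical integration. The following is a minimal sketch that integrates the N-body equations above (harmonic term plus inverse-cube two-body forces) with a velocity-Verlet scheme and monitors the conservation of the Hamiltonian. The integrator, step size, function names and initial data are illustrative choices, not part of the original presentation.

```python
def accelerations(q, omega, g):
    """Right-hand side of the Newtonian equations: -omega^2 q_n + 2 g^2 sum (q_n - q_m)^-3."""
    acc = []
    for n, qn in enumerate(q):
        pair = sum((qn - qm) ** -3 for m, qm in enumerate(q) if m != n)
        acc.append(-omega ** 2 * qn + 2.0 * g ** 2 * pair)
    return acc


def energy(q, p, omega, g):
    """Hamiltonian: harmonic oscillators plus inverse-square two-body potential."""
    one_body = 0.5 * sum(pn ** 2 + omega ** 2 * qn ** 2 for qn, pn in zip(q, p))
    two_body = 0.5 * g ** 2 * sum(
        (qn - qm) ** -2
        for n, qn in enumerate(q)
        for m, qm in enumerate(q)
        if m != n
    )
    return one_body + two_body


def velocity_verlet(q, p, omega, g, dt, steps):
    """Integrate dq/dt = p, dp/dt = acceleration with the velocity-Verlet scheme."""
    q, p = list(q), list(p)
    a = accelerations(q, omega, g)
    for _ in range(steps):
        p = [pn + 0.5 * dt * an for pn, an in zip(p, a)]  # half kick
        q = [qn + dt * pn for qn, pn in zip(q, p)]        # drift
        a = accelerations(q, omega, g)
        p = [pn + 0.5 * dt * an for pn, an in zip(p, a)]  # half kick
    return q, p


if __name__ == "__main__":
    q0, p0 = [-1.0, 0.0, 1.2], [0.3, -0.1, 0.2]  # illustrative initial data
    q1, p1 = velocity_verlet(q0, p0, omega=1.0, g=1.0, dt=5e-4, steps=4000)
    print(energy(q0, p0, 1.0, 1.0), energy(q1, p1, 1.0, 1.0))
```

Because the two-body force is repulsive and singular at coincidence, the ordering of the particles on the line is preserved along the motion.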


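Other typical examples are the ("Korteweg-de
Vries", "Burgers", "Nonlinear Schrödinger", "sine-Gordon")
PDEs satisfied by the scalar dependent
variable $q \equiv q(x, t)$,

\[ q_t = -q_{xxx} + 2 q_x q = \left(-q_{xx} + q^2\right)_x \qquad [4] \]

\[ q_t = -q_{xx} + 2 q_x q = \left(-q_x + q^2\right)_x \qquad [5] \]

\[ q_t = i\left[q_{xx} + s |q|^2 q\right], \qquad s = \pm \qquad [6] \]

\[ q_t - q_x = s, \qquad s_t + s_x = \sin q \qquad [7] \]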


as well as the integrodifferential ("Benjamin-Ono")
equation
To, dy) 
qı j yey 44 [8] 
and the ("Kadomtsev-Petviashvili") PDE satisfied by
the scalar dependent variable $q \equiv q(x, y, t)$,

\[ q_{tx} = \left(-q_{xxx} + q_x q\right)_x - s\, q_{yy}, \qquad s = \pm \qquad [9] \]


This last equation should of course be reformulated
as an integrodifferential equation to fit with (1).
These are all examples of integrable systems (see
below). In this presentation we restrict attention to
dynamical systems of these general types, without
considering evolutions in which the space variable,
and/or the time variable, and/or the dependent
variable, only take discrete values. We thereby forsake
the discussion of discrete evolution equations,
cellular automata and functional equations (see
other entries of this Encyclopedia). We shall consider
mainly the "initial-value problem" in which the
solution is assigned at the initial time, say at $t = 0$,

\[ Q(x, 0) = Q_0(x) \]

and the subsequent evolution of the dependent
variable, namely the values taken by $Q(x, t)$ for $t > 0$,
is the focus of attention. Note however that,
except when there is no dependence at all on the
space variable $x$ (see for instance (2)), the functional
class to which $Q(x, t)$ belongs as regards its
$x$-dependence should be specified (and the assigned
initial value $Q_0(x)$ should of course belong to this
functional class). A typical class of functions are
those vanishing (adequately fast) at (spatial) infinity;
another typical class are those characterized by
periodicity properties as functions of $x$; and still
another class are those restricted to a bounded or
semi-infinite spatial domain (for instance, the positive
$x$-axis, $x > 0$, or a finite interval, $a < x < b$), in which
cases the initial-value problem must be supplemented by
assigning boundary conditions. This latter class of problems,
called initial/boundary-value problems, is generally
more difficult; even the identification of which
boundary conditions are adequate to identify
uniquely the solution may be a nontrivial task. In
the following we will always focus on the simpler
class of problems characterized by solutions defined
in the entire space region and vanishing (sufficiently
fast) asymptotically (far away).

Thus, in the spirit of the initial-value problem, a
dynamical system is generally characterized by
assigning its evolution equation, the functional
class to which its solutions are required to belong,
and possibly in addition some restriction
on the set of initial data.

Let us finally mention that, aside from considering
the initial-value problem, the study of dynamical
systems may focus on the identification of special
(classes of) solutions, for instance those obtained by
using symmetry properties of the evolution equation
under consideration (yielding, say, "similarity solutions"),
and, in the integrable case, "solitonic" and
"multisolitonic" solutions (see below).


Integrable dynamical systems 


The solution of a dynamical system, however simple
the equation that defines its time evolution, see (1), may
be extremely complicated. Indeed, its time-dependence
might feature one or more of the characteristics of
deterministic chaos, such as a sensitive dependence on
the initial data. But there are "exceptional" dynamical
systems, the behavior of which is instead, in some
sense, simple. Such systems are termed, in the least
technical sense of the word, "integrable".

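This characterization can be made precise for
Hamiltonian systems with a finite number $N$ of degrees
of freedom, the equations of motion of which read

\[ \dot{q}_n = \frac{\partial H(p, q)}{\partial p_n}, \qquad \dot{p}_n = -\frac{\partial H(p, q)}{\partial q_n}, \qquad n = 1, \ldots, N \]

Such a system is integrable if there exist, in addition
to the Hamiltonian $H(p, q) \equiv H^{(1)}(p, q)$ itself,
$N - 1$ other (nontrivial and functionally independent)
constants of motion $H^{(n)}(p, q)$ in involution,
namely such that their Poisson brackets vanish:

\[ \left\{H^{(n)}, H^{(m)}\right\} \equiv \sum_{\ell=1}^{N} \left[ \frac{\partial H^{(n)}(p, q)}{\partial q_\ell} \frac{\partial H^{(m)}(p, q)}{\partial p_\ell} - \frac{\partial H^{(m)}(p, q)}{\partial q_\ell} \frac{\partial H^{(n)}(p, q)}{\partial p_\ell} \right] = 0, \qquad n, m = 1, \ldots, N \]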


Let us however emphasize the crucial role of the words
"there exist", as used just above. For definiteness let us
require that the constants of motion $H^{(n)}(p, q)$ be
analytic functions of their $2N$ arguments, and not
excessively multivalued: they might feature some
branch points, but not so many as to nullify their
effectiveness in constraining the time evolution of the
dynamical variables $q_n(t)$, $p_n(t)$ sufficiently to prevent
their behavior from being too complicated. On the
other hand it is of course not necessary that these
functions $H^{(n)}(p, q)$ be explicitly known.
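
Involution can be checked numerically on a simple example. The sketch below evaluates the Poisson bracket by central finite differences and applies it to the two-dimensional isotropic harmonic oscillator, whose Hamiltonian Poisson-commutes with the angular momentum. The example system, the step `h` and the function names are illustrative assumptions, not taken from the text.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """{F, G} = sum_l (dF/dq_l * dG/dp_l - dG/dq_l * dF/dp_l), by central differences.

    F and G are functions of (q, p), where q and p are lists of floats.
    """
    def d_dq(f, l):
        q_plus, q_minus = list(q), list(q)
        q_plus[l] += h
        q_minus[l] -= h
        return (f(q_plus, p) - f(q_minus, p)) / (2.0 * h)

    def d_dp(f, l):
        p_plus, p_minus = list(p), list(p)
        p_plus[l] += h
        p_minus[l] -= h
        return (f(q, p_plus) - f(q, p_minus)) / (2.0 * h)

    return sum(
        d_dq(F, l) * d_dp(G, l) - d_dq(G, l) * d_dp(F, l)
        for l in range(len(q))
    )


def oscillator(q, p):
    """Isotropic two-dimensional harmonic oscillator (illustrative Hamiltonian)."""
    return 0.5 * (p[0] ** 2 + p[1] ** 2 + q[0] ** 2 + q[1] ** 2)


def angular_momentum(q, p):
    """Second constant of motion for the isotropic oscillator."""
    return q[0] * p[1] - q[1] * p[0]


if __name__ == "__main__":
    q, p = [0.3, -1.1], [0.7, 0.4]  # arbitrary phase-space point
    print(poisson_bracket(oscillator, angular_momentum, q, p))
```

Since both functions are quadratic, the central differences are exact up to rounding, so the printed bracket vanishes to within floating-point error.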

When these conditions hold it is in principle
possible ("Liouville theorem") to identify a
canonical transformation from the canonical coordinates
and momenta $q_n$ and $p_n$ to action-angle
variables $\theta_n$ and $I_n$ such that

\[ I_n = H^{(n)}(p, q) \qquad [10] \]

Then these action-angle variables evolve trivially,

\[ I_n(t) = I_n(0), \qquad \theta_n(t) = \theta_n(0) + I_n(0)\, t, \qquad n = 1, \ldots, N \]


Note that, once these new canonical variables are
identified, the solution of the initial-value problem for
the original Hamiltonian problem is provided directly
by the expressions of the action-angle variables $\theta_n$ and
$I_n$ in terms of the original variables $q_n$ and $p_n$, as well
as the expressions of the latter in terms of the former.
The second step of this procedure requires inverting
the expressions (10), and the corresponding expressions
of the angle variables $\theta_n$ in terms of the original
variables $q_n$ and $p_n$. A necessary condition for
this step to identify uniquely, at least in
principle, the original canonical variables $q_n$ and $p_n$
in terms of the action-angle variables $I_n$ and $\theta_n$ (and hence
to imply a simple time evolution of these original variables)
is the requirement, as mentioned above, that
the expressions of the constants of the motion
$H^{(n)}(p, q)$ in terms of their arguments $q_n$ and $p_n$ not
be excessively multivalued.

The statements outlined above can be rigorously 
formulated for finite-dimensional Hamiltonian sys- 
tems, and they can be heuristically extended to all 
analogous dynamical systems with a finite number of 
degrees of freedom, even if they are not Hamiltonian. 

A system with $N$ degrees of freedom might possess
more than $N$ constants of motion. Such a system
that possesses $2N - 1$ (nontrivial and functionally
independent) constants of motion (the maximal
number, to avoid the evolution being frozen) is
called superintegrable, and its evolution is in some
sense analogous to that of a system with a single
degree of freedom; in particular all its confined and
nonsingular motions are then completely periodic,

\[ q_n(t + T) = q_n(t), \qquad p_n(t + T) = p_n(t), \qquad n = 1, \ldots, N \]

The period $T$ depends generally on the initial data. If it
does not, at least for an open set of such data having
full dimensionality in phase space, the system is called
isochronous: all its motions in that phase space region
are then completely periodic with the same period.

A dynamical system might be integrable in a region
of its "natural" phase space, and nonintegrable in
another region. Sometimes such systems are referred to
as partially integrable. There even are systems which
are isochronous (hence superintegrable) in a region of
their phase space, and behave instead chaotically in
another region. These regions are generally separated
by boundaries where the evolution of the system runs
into singularities, and the constants of motion associated
with the integrable behavior become excessively
multivalued in the regions where the behavior is
chaotic (see Isochronous Systems).

Dynamical systems featuring an additional space
variable $x$ (see the Introduction) can be interpreted as
infinite-dimensional dynamical systems (by considering the
variable $x$ as a continuous label for the dependent
variable $Q$). Accordingly, a necessary condition in
order that such systems be considered integrable is the
requirement that they possess an infinite number of
constants of the motion. But, even for such systems
that allow a Hamiltonian formulation, this condition
cannot be considered sufficient (due to the inherent
ambiguities in the counting of infinities), and in fact a
completely cogent, universally accepted definition of
integrability for infinite-dimensional dynamical systems
is still lacking (various definitions can of course
be given in special contexts). It is nevertheless rather
well understood by practitioners what is meant by
such a term at least for integrable equations such as
those indicated at the end of the previous section,
which generally give rise to the solitonic phenomenology,
as explained below.

The study of integrable systems has an illustrious 
history, to which many eminent mathematicians and 
mathematical physicists contributed after the 
Newtonian revolution: Euler, Jacobi, Poincaré, Pain- 
levé, Kowalewskaya, Kolmogorov, Moser ... Below 
we report — most tersely — on the bloom that this topic 
has witnessed over the last 3-4 decades, without being 
generally able, due to space constraints, to attribute 
the appropriate credit to the many colleagues, most of 
them still living, who contributed to this endeavor. For 
more detailed treatments of the topics outlined below, 
of related developments not mentioned here, and of 
such credits, the interested reader is referred to the 
bibliography given below, including the additional 
references traceable from there. 


Integrable many-body problems 


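An important class of integrable dynamical systems
is provided by N-body problems characterized by
Hamiltonians such as

\[ H(p, q) = \frac{1}{2} \sum_{n=1}^{N} p_n^2 + V(q) \qquad [11] \]

with a potential energy $V(q)$ that includes "external"
and "two-body" forces,

\[ V(q) = \sum_{n=1}^{N} V^{(1)}(q_n) + \frac{1}{2} \sum_{m,n=1,\, m \ne n}^{N} V^{(2)}(q_n - q_m), \qquad V^{(2)}(-q) = V^{(2)}(q) \qquad [12] \]

The corresponding Hamiltonian and Newtonian
equations of motion read

\[ \dot{q}_n = p_n, \qquad \dot{p}_n = -\frac{\partial V^{(1)}(q_n)}{\partial q_n} - \sum_{m=1,\, m \ne n}^{N} \frac{\partial V^{(2)}(q_n - q_m)}{\partial q_n}, \]

\[ \ddot{q}_n = -\frac{\partial V^{(1)}(q_n)}{\partial q_n} - \sum_{m=1,\, m \ne n}^{N} \frac{\partial V^{(2)}(q_n - q_m)}{\partial q_n} \qquad [13] \]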

The Lax pair and the constants of motion  Suppose
that two $N \times N$ matrices $L \equiv L(p, q)$ and $M \equiv
M(p, q)$ could be found such that the matrix "Lax
equation"

\[ \dot{L} = [L, M] \qquad [14] \]

be equivalent to the Hamiltonian equations of
motion (13). Here and throughout the notation
$[A, B]$ denotes the commutator:

\[ [A, B] \equiv AB - BA \]

Because this matrix equation clearly entails that
the $N$ traces

\[ T_n = \mathrm{trace}\left[L^n\right], \qquad n = 1, \ldots, N \]

are constants of the motion,

\[ \dot{T}_n = 0, \qquad n = 1, \ldots, N \]

the possibility to write the Hamiltonian equations
(13) in the Lax form (14) yields as a bonus $N$
constants of the motion, namely it entails that the
Hamiltonian system under consideration is integrable.
(One must moreover show that these constants
of motion are in involution; this is usually the case.)
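
The conservation of the traces of the powers of $L$ follows in one line from the cyclic property of the trace. The short derivation below spells out this step for the Lax equation $\dot{L} = [L, M]$:

```latex
\begin{align*}
\frac{d}{dt}\,\mathrm{trace}\left[L^n\right]
  &= \sum_{k=0}^{n-1} \mathrm{trace}\left[L^{k}\,\dot{L}\,L^{n-1-k}\right]
   = n\,\mathrm{trace}\left[L^{n-1}\dot{L}\right] \\
  &= n\,\mathrm{trace}\left[L^{n-1}(LM - ML)\right]
   = n\left(\mathrm{trace}\left[L^{n}M\right] - \mathrm{trace}\left[L^{n-1}ML\right]\right) = 0 .
\end{align*}
```

The last equality holds because $\mathrm{trace}[L^{n-1} M L] = \mathrm{trace}[L^{n} M]$ by cyclicity.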

Hence a route to identify integrable N-body problems
is via the search of Lax pairs $L$, $M$ of matrices
such that (14) correspond to (13), with an appropriate
assignment of the potential energy (12). For $N > 2$ this
is a nontrivial task, because (13) is a system of $2N$
ODEs in $2N$ unknowns, while the matrix Lax
equation (14) amounts to a system of $N^2$ ODEs.


Functional equations and the identification of 
integrable many-body problems A convenient 
ansatz to identify a Lax pair suitable for the purpose 
outlined above reads as follows: 


L_{nm} = p_n  for n = m,    L_{nm} = \alpha(q_n - q_m)  for m \neq n,

M_{nm} = \sum_{\ell=1,\, \ell \neq n}^{N} \beta(q_n - q_\ell)  for n = m,

M_{nm} = \gamma(q_n - q_m)  for m \neq n,


where α(q), β(q) and γ(q) are three functions to be
determined. It is then easily seen that these functions
may be assigned so that the corresponding Lax
equation (14) be equivalent to the Hamiltonian
equations (13) with


V^{(1)}(q) = 0    [15a]


V^{(2)}(q) = \alpha(q)\, \alpha(-q)    [15b]


provided the function α(x) satisfies the functional
equation


\alpha(x + y) = \frac{\alpha(x)\, \alpha'(y) - \alpha(y)\, \alpha'(x)}{\beta(x) - \beta(y)}, \quad \beta(x) = \beta(-x)


The general solution of this functional equation 
yields via [15b] the two-body potential 


V^{(2)}(q) = g^2 a^2\, \wp(a q \,|\, \omega, \omega')    [16]


where g and a are two arbitrary constants and
℘(x|ω,ω′) is the Weierstrass elliptic function (with
semiperiods ω and ω′, also arbitrary). One
concludes therefore that the N-body problem char- 
acterized by the Hamiltonian (11) with (12), (15a) 
and (16) is integrable. 

This Hamiltonian system has played, since the mid- 
seventies, a seminal role in the developments of finite- 
dimensional integrable systems that occurred over the 
last few decades. However, since the Weierstrass 
function is doubly-periodic, from a “physical” point 
of view this N-body problem is rather unrealistic, or 
perhaps rather suited for the study of crystalline 
configurations, including their statistical mechanics. 
But there are two special cases, obtained by assigning
an infinite value to one or both of the semiperiods of
the Weierstrass function in (16), that qualify V^{(2)}(q) as
a physical two-body potential:


V^{(2)}(q) = \frac{g^2 a^2}{\sinh^2(a q)}    [17a]


V^{(2)}(q) = \frac{g^2}{q^2}    [17b]


(Of course the second of these two-body potentials,
(17b), is merely the special case of the first, (17a),
corresponding to a = 0). These Hamiltonian models
are then naturally interpretable as one-dimensional 
many-body problems with repulsive two-body forces 
singular at zero separation and vanishing at large 
distances. Actually the fact that these systems are 
integrable is far from remarkable, since it is 
generally true that any many-body problem char- 
acterized by repulsive forces vanishing at large 
distances (hence causing unconfined motions) is 
integrable: indeed in such models the particles 
eventually separate and move freely, so that their 
trajectories cannot display the extreme complication 
characterizing a chaotic (i.e., nonintegrable) behavior.
But these models are in fact superintegrable
and they (as well as various integrable extensions of 
them) feature many (physically and mathematically) 
interesting properties. For instance the asymptotic 
behavior of their trajectories, 


q_n(t) = p_n^{(\pm)} t + q_n^{(\pm)} + o(1), \quad p_n(t) = p_n^{(\pm)} + o(1)
as t \to \pm\infty, \quad n = 1, \ldots, N    [18]


is characterized by the simple rules 


p_n^{(+)} = p_{N+1-n}^{(-)},

q_{N+1-n}^{(+)} = q_n^{(-)} + \sum_{m=1,\, m \neq n}^{N} \Delta(p_n^{(-)} - p_m^{(-)}; g, a)    [19]

\Delta(p; g, a) = \mathrm{sign}(p)\, \frac{\log[1 + (g a / p)^2]}{2a}


The formula (19) indicates that the shift
q_{N+1−n}^{(+)} − q_n^{(−)} among the asymptotic
positions of the particles (see (18)) is merely a sum
of two-body shifts Δ (which incidentally vanish
altogether if a = 0, namely in the (17b) case), and it
only depends on the velocities p_n^{(−)} of the particles
in the remote past (not on the corresponding
asymptotic positions q_n^{(−)}, in spite of their
relevance in determining the order in which the
different particles approach each other through the
motion).

A generalization of the above model in the (17b)
case, nontrivial inasmuch as it yields confined
motions, is characterized by the additional presence
in the potential (12) of the one-body potential


V^{(1)}(q) = \frac{1}{2} \omega^2 q^2    [20]


yielding the Hamiltonian (3a). This model is integr- 
able, indeed superintegrable, indeed isochronous, all 
its (real) solutions being completely periodic with 
period 


T = \frac{2\pi}{\omega}    [21]

A neat way to understand this result is by noting
that, if q_n(t) is a (possibly complex) solution of the
model discussed above (in this subsection, with the
two-body potential (17b) and no one-body potential,
see (15a)), then


\tilde{q}_n(t) = \exp(-i\omega t)\, q_n(\tau), \quad \tau = \frac{\exp(2 i \omega t) - 1}{2 i \omega}


provides a (possibly real) solution of the Newtonian
equations of motion (2), namely of the same model
but with the additional one-body potential (20).
Remarkably this model was first solved in the
quantal case (at the beginning of the seventies), and
only a few years later in the classical case considered
here (by J. Moser, who, for the ω = 0 case,
introduced the special version of the Lax matrix 
appropriate for this case). 
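
The conservation of the traces of powers of L can be checked numerically for the ω = 0 model with the inverse-square two-body potential (17b). The Python sketch below is not taken from the source. It assumes the commonly quoted form of the Lax matrix for this case, L_nn = p_n and L_nm = i g/(q_n − q_m) for n ≠ m (that is, α(q) = i g/q in the ansatz above, so that α(q)α(−q) = g²/q²). The function names, the Runge-Kutta integrator, the initial data and the tolerance are illustrative choices.

```python
# Sketch: tr(L^k), k = 1..N, should stay constant along the flow
# q_n'' = 2 g^2 sum_{m != n} (q_n - q_m)^-3 (potential g^2/q^2, omega = 0).

def accelerations(q, g):
    """Newtonian accelerations produced by the two-body potential g^2/q^2."""
    n = len(q)
    return [sum(2.0 * g * g / (q[i] - q[j]) ** 3 for j in range(n) if j != i)
            for i in range(n)]

def rk4_step(q, p, g, dt):
    """One classical fourth-order Runge-Kutta step for q' = p, p' = a(q)."""
    def shifted(x, k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]
    k1q, k1p = p, accelerations(q, g)
    k2q, k2p = shifted(p, k1p, dt / 2), accelerations(shifted(q, k1q, dt / 2), g)
    k3q, k3p = shifted(p, k2p, dt / 2), accelerations(shifted(q, k2q, dt / 2), g)
    k4q, k4p = shifted(p, k3p, dt), accelerations(shifted(q, k3q, dt), g)
    q_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new

def lax_matrix(q, p, g):
    """Assumed Lax matrix: p_n on the diagonal, i g/(q_n - q_m) elsewhere."""
    n = len(q)
    return [[complex(p[i]) if i == j else 1j * g / (q[i] - q[j])
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lax_traces(q, p, g):
    """Return [tr L, tr L^2, ..., tr L^N]."""
    n = len(q)
    lax = lax_matrix(q, p, g)
    power, traces = lax, []
    for _ in range(n):
        traces.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, lax)
    return traces

def max_trace_drift(q0, p0, g=1.0, dt=1e-3, steps=2000):
    """Largest change of any trace between the initial and the final time."""
    q, p = list(q0), list(p0)
    start = lax_traces(q, p, g)
    for _ in range(steps):
        q, p = rk4_step(q, p, g, dt)
    end = lax_traces(q, p, g)
    return max(abs(a - b) for a, b in zip(start, end))

drift = max_trace_drift([-1.5, 0.2, 1.8], [0.7, -0.1, -0.9])
print(drift)
```

With three particles this tests the conservation of tr L, tr L² (twice the energy) and tr L³; the drift printed should be at the level of the integration error only.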

Another class of many-body problems, introduced 
in the mid-sixties by M. Toda, played a seminal role 
in the study of integrable dynamical systems, indeed 
the first application (independently by H. Flaschka 
and S. Manakov) of the Lax approach to integrable 
many-body problems occurred in that context. This 
model is often referred to as the Toda lattice, 
because its (two-body) interaction (of exponential 
type) is only assumed to act among "nearest 
neighbors". 

A particularly interesting, and just as integrable, 
generalization of this class of Hamiltonian many- 
body problems features an extra parameter, say c, 
which might be considered to play the role of “speed 
of light". These models reduce to those considered 
above for c = ∞, and for finite c they are invariant
under the Poincaré group of coordinate transforma- 
tions (while of course the many-body problems 
described above are invariant under the Galilei 
group). They are sometimes termed RS models, to 
recognize those who first introduced them 
(S. Ruijsenaars and H. Schneider) as well as the 
possibility to interpret them in some sense as 
“relativistic” generalizations of the “nonrelativistic” 
models described above. 


Reduction of the solution to algebraic opera- 
tions The solution of the models described above 
can actually be reduced to purely algebraic opera- 
tions. For instance for the model characterized by 
the Newtonian equations of motion (2) such a 
solution of the initial-value problem is provided by 
the following prescription: the particle coordinates 
q_n(t) coincide with the N eigenvalues of the N × N
matrix:


Q_{nn}(t) = q_n(0) \cos(\omega t) + \dot{q}_n(0)\, \frac{\sin(\omega t)}{\omega},


Q_{nm}(t) = \frac{i g \sin(\omega t)}{\omega\, [q_n(0) - q_m(0)]}  for n \neq m
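
The prescription can be tried out for N = 2, where the eigenvalues of a 2 × 2 matrix are available in closed form. The Python sketch below is an illustration under stated assumptions, not the author's procedure. It takes the matrix to have diagonal elements q_n(0) cos(ωt) + q̇_n(0) sin(ωt)/ω and off-diagonal elements i g sin(ωt)/(ω[q_n(0) − q_m(0)]), and it takes the equations of motion to be q̈_n = −ω² q_n + 2g² Σ_{m≠n} (q_n − q_m)^{-3}, as follows from (13) with the potentials (17b) and (20). Function names, initial data and tolerances are hypothetical.

```python
import math

def eigen_positions(q0, v0, g, omega, t):
    """Eigenvalues (ascending) of the 2 x 2 matrix built from the initial data."""
    c = math.cos(omega * t)
    s = math.sin(omega * t) / omega
    a = q0[0] * c + v0[0] * s  # diagonal element for particle 1
    d = q0[1] * c + v0[1] * s  # diagonal element for particle 2
    # Product of the two off-diagonal elements:
    # (i g s)^2 / ((q1 - q2) (q2 - q1)) = (g s / (q1 - q2))^2 >= 0
    bc = (g * s / (q0[0] - q0[1])) ** 2
    mean = 0.5 * (a + d)
    half_gap = math.sqrt(0.25 * (a - d) ** 2 + bc)
    return [mean - half_gap, mean + half_gap]

def acceleration(q, g, omega):
    """q_n'' = -omega^2 q_n + 2 g^2 (q_n - q_m)^-3 for the two particles."""
    f = 2.0 * g * g / (q[0] - q[1]) ** 3
    return [-omega ** 2 * q[0] + f, -omega ** 2 * q[1] - f]

def integrate(q0, v0, g, omega, t, steps=4000):
    """Classical fourth-order Runge-Kutta integration up to time t."""
    dt = t / steps
    q, v = list(q0), list(v0)

    def shifted(x, k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    for _ in range(steps):
        k1q, k1v = v, acceleration(q, g, omega)
        k2q = shifted(v, k1v, dt / 2)
        k2v = acceleration(shifted(q, k1q, dt / 2), g, omega)
        k3q = shifted(v, k2v, dt / 2)
        k3v = acceleration(shifted(q, k2q, dt / 2), g, omega)
        k4q = shifted(v, k3v, dt)
        k4v = acceleration(shifted(q, k3q, dt), g, omega)
        q = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
        v = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(v, k1v, k2v, k3v, k4v)]
    return q

# Particle 1 starts to the left of particle 2 and stays there (repulsion).
q0, v0, g, omega = [-1.0, 0.5], [0.3, -0.2], 1.0, 1.5
numeric = integrate(q0, v0, g, omega, t=1.0)
algebraic = eigen_positions(q0, v0, g, omega, t=1.0)
mismatch = max(abs(a - b) for a, b in zip(numeric, algebraic))

# Isochrony: after one period T = 2 pi / omega the initial positions recur.
period = 2.0 * math.pi / omega
recurrence_error = max(
    abs(a - b) for a, b in zip(eigen_positions(q0, v0, g, omega, period), q0))
print(mismatch, recurrence_error)
```

The first number compares the algebraic solution with direct numerical integration; the second checks that the algebraic formula returns the initial configuration after the period T of (21).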


Many-body problems related to the motion of the 
zeros of linear PDEs Another convenient approach 
to manufacture and investigate integrable many- 
body problems is by identifying the motion of the 
particles with that of the zeros of (polynomial) 


solutions of linear (hence solvable) evolution PDEs. 
Assume for instance that the monic polynomial


\psi(z,t) = z^N + \sum_{m=1}^{N} c_m(t)\, z^{N-m} = \prod_{n=1}^{N} [z - z_n(t)]    [22]


satisfies the (compatible) linear PDE


[A_0 + A_1 z + A_2 z^2 + A_3 z^3]\, \psi_{zz}
+ [B_0 + B_1 z - 2(N-1) A_3 z^2]\, \psi_z
+ C \psi_{tt} + [E - (N-1) D_2 z]\, \psi_t
+ [D_0 + D_1 z + D_2 z^2]\, \psi_{zt}
- [N(N-1)(A_2 - A_3 z) + N B_1]\, \psi = 0    [23]


where the letters A_0, A_1, A_2, A_3, B_0, B_1, C, D_0, D_1,
D_2, E denote 11 arbitrary constants. Then the zeros
z_n(t) evolve according to the system of ODEs


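C \ddot{z}_n + E \dot{z}_n = B_0 + B_1 z_n - 2(N-1) A_3 z_n^2
+ \sum_{m=1,\, m \neq n}^{N} (z_n - z_m)^{-1}
\times [\, 2 C \dot{z}_n \dot{z}_m - (\dot{z}_n + \dot{z}_m)(D_0 + D_1 z_n)
- D_2 z_n (\dot{z}_n z_m + \dot{z}_m z_n)
+ 2 (A_0 + A_1 z_n + A_2 z_n^2 + A_3 z_n^3) \,]    [24]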


interpretable as the Newtonian equations of motion 
of an N-body problem with one- and two-body 
(velocity-dependent) forces. This problem is integr- 
able, indeed its solution can be reduced to the 
algebraic problem of finding the zeros of the 
polynomial ψ(z,t), see (22), whose time evolution
can be ascertained by solving the linear PDE (23), 
itself a purely algebraic problem as it amounts to 
solving the system of (constant coefficients, linear) 
ODEs implied via (22) by this PDE (23) for the N 
coefficients c_m(t).

This class of many-body problems is rather rich, 
thanks to the arbitrariness of the 11 constants it 
features. Several subcases, characterized by special 
choices of these constants, are suitable to display a 
gamut of different phenomenological behaviors: 
confined and nonconfined motions, periodic and 
nonperiodic evolutions, limit cycles, Hamiltonian
cases, ...


Solvable many-body problems in the plane The 
many-body problems considered above were all 
essentially one-dimensional. But via a simple trick
it is possible to obtain from some of them
many-body problems in the plane (which should of course
be rotation-invariant to be certified as such). 
Consider for instance the special case of the above 




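model, (24), with C = 1 and with A_0 = A_1 = A_3 =
B_0 = D_0 = D_2 = 0 so that its equations of motion,


\ddot{z}_n + E \dot{z}_n = B_1 z_n + \sum_{m=1,\, m \neq n}^{N} (z_n - z_m)^{-1} [\, 2 \dot{z}_n \dot{z}_m - D_1 z_n (\dot{z}_n + \dot{z}_m) + 2 A_2 z_n^2 \,]    [25]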


are invariant under rescaling of the dependent
variables (z_n → c z_n). Let us then assume to work
in the complex rather than the real, and let us set


E = \gamma + i\omega, \quad A_2 = \alpha + i\tilde{\alpha}, \quad B_1 = \beta + i\tilde{\beta}, \quad D_1 = \delta + i\tilde{\delta}


where the Greek letters now indicate real constants,
and let us moreover relate the N complex coordinates
z_n to N two-vectors \vec{r}_n in the horizontal plane
via the self-evident positions


z_n = x_n + i y_n, \quad \vec{r}_n = (x_n, y_n, 0), \quad \hat{k} = (0, 0, 1)    [26]


It is then easily seen that the integrable equations of 
motion (25) become the following rotation-invariant 
Newtonian equations of motion identifying a (no 
less integrable) N-body problem in the plane: 


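\ddot{\vec{r}}_n + (\gamma + \omega \hat{k} \wedge)\, \dot{\vec{r}}_n
= (\beta + \tilde{\beta} \hat{k} \wedge)\, \vec{r}_n
+ \sum_{m=1,\, m \neq n}^{N} r_{nm}^{-2} \{\, 2 [\dot{\vec{r}}_n (\dot{\vec{r}}_m \cdot \vec{r}_{nm}) + \dot{\vec{r}}_m (\dot{\vec{r}}_n \cdot \vec{r}_{nm}) - \vec{r}_{nm} (\dot{\vec{r}}_n \cdot \dot{\vec{r}}_m)]
- (\delta + \tilde{\delta} \hat{k} \wedge) [(\dot{\vec{r}}_n + \dot{\vec{r}}_m)(r_n^2 - \vec{r}_n \cdot \vec{r}_m) - \vec{r}_n (\vec{r}_m \cdot (\dot{\vec{r}}_n + \dot{\vec{r}}_m)) + \vec{r}_m (\vec{r}_n \cdot (\dot{\vec{r}}_n + \dot{\vec{r}}_m))]
+ 2 (\alpha + \tilde{\alpha} \hat{k} \wedge) [\vec{r}_n (r_n^2 - 2 \vec{r}_n \cdot \vec{r}_m) + \vec{r}_m r_n^2] \,\}    [27]

Here and below we use the short-hand notation
\vec{r}_{nm} = \vec{r}_n − \vec{r}_m entailing
r_{nm}^2 = r_n^2 + r_m^2 − 2 \vec{r}_n · \vec{r}_m, the
symbol ∧ denotes the three-dimensional vector product
so that \hat{k} ∧ \vec{r}_n = (−y_n, x_n, 0) (see (26)), and the rest
of the notation is self-evident. Note that these
rotation-invariant Newtonian equations of motion are also
translation-invariant if β = β̃ = δ = δ̃ = α = α̃ = 0.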


The “goldfish” model The name “goldfish”
has been given to the special case of the above
model with all “coupling constants” vanishing,
thanks to the neatness of its equations of motion,
which in their complex version read


\ddot{z}_n = 2 \sum_{m=1,\, m \neq n}^{N} \frac{\dot{z}_n \dot{z}_m}{z_n - z_m}


and in their real (“physical”) version as Newtonian
equations of motion of an N-body problem in the
horizontal plane read


\ddot{\vec{r}}_n = 2 \sum_{m=1,\, m \neq n}^{N} r_{nm}^{-2} [\, \dot{\vec{r}}_n (\dot{\vec{r}}_m \cdot \vec{r}_{nm}) + \dot{\vec{r}}_m (\dot{\vec{r}}_n \cdot \vec{r}_{nm}) - \vec{r}_{nm} (\dot{\vec{r}}_n \cdot \dot{\vec{r}}_m) \,]


(This name has also been attributed to some
extensions of this model, see the entry Isochronous
Systems in this Encyclopedia). This model is
invariant under time rescaling (t → ct), in its
physical version it is translation- and rotation- 
invariant, it only features two-body forces and in 
spite of their velocity-dependence it is Hamiltonian 
(it is in fact a simple instance of the RS models 
mentioned above). The solution of its initial-value 
problem (in its complex version) is given by a 
remarkably neat rule: the N coordinates z_n(t) are the
N roots of the following algebraic equation in z:


\sum_{n=1}^{N} \frac{\dot{z}_n(0)}{z - z_n(0)} = \frac{1}{t}    [28]
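
For N = 2 the rule reduces to a quadratic equation in z, which makes it easy to compare with a direct numerical integration. The Python sketch below is only an illustration: it assumes the complex equations of motion z̈_n = 2 Σ_{m≠n} ż_n ż_m/(z_n − z_m) and the rule Σ_n ż_n(0)/[z − z_n(0)] = 1/t; the function names, initial data, step count and tolerance are hypothetical choices.

```python
import cmath

def goldfish_roots(z0, v0, t):
    """Roots in z of v1/(z - a1) + v2/(z - a2) = 1/t, written as a quadratic."""
    (a1, a2), (v1, v2) = z0, v0
    # (z - a1)(z - a2) - t [v1 (z - a2) + v2 (z - a1)] = z^2 + b z + c
    b = -(a1 + a2) - t * (v1 + v2)
    c = a1 * a2 + t * (v1 * a2 + v2 * a1)
    root = cmath.sqrt(b * b - 4 * c)
    return [(-b - root) / 2, (-b + root) / 2]

def accelerations(z, v):
    """z_n'' = 2 z_n' z_m' / (z_n - z_m) for the two particles."""
    f = 2 * v[0] * v[1] / (z[0] - z[1])
    return [f, -f]

def integrate(z0, v0, t, steps=2000):
    """Classical fourth-order Runge-Kutta integration in the complex plane."""
    dt = t / steps
    z, v = list(z0), list(v0)

    def shifted(x, k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    for _ in range(steps):
        k1z, k1v = v, accelerations(z, v)
        v2 = shifted(v, k1v, dt / 2)
        k2z, k2v = v2, accelerations(shifted(z, k1z, dt / 2), v2)
        v3 = shifted(v, k2v, dt / 2)
        k3z, k3v = v3, accelerations(shifted(z, k2z, dt / 2), v3)
        v4 = shifted(v, k3v, dt)
        k4z, k4v = v4, accelerations(shifted(z, k3z, dt), v4)
        z = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(z, k1z, k2z, k3z, k4z)]
        v = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(v, k1v, k2v, k3v, k4v)]
    return z

z0 = [0.0 + 0.0j, 1.0 + 0.5j]
v0 = [1.0 + 0.2j, -0.5 + 0.3j]
numeric = integrate(z0, v0, t=1.0)
roots = goldfish_roots(z0, v0, t=1.0)
# Each integrated coordinate should coincide with one of the two roots.
mismatch = max(min(abs(z - r) for r in roots) for z in numeric)
print(mismatch)
```

The initial data were chosen so that the two particles do not collide for real t, which keeps the integration away from the singularities of the equations of motion.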


The phenomenology of its generic solution is also
remarkable, corresponding to the “game of musical
chairs”: in the remote past all particles but one are
almost at rest in N − 1 positions (“sitting in N − 1
chairs”) and one particle comes in from infinity,
moving initially as a free particle; as it approaches,
all the particles begin to move around (“dancing”);
in the remote future one particle goes away (moving
eventually with the same speed as the incoming
particle), and all the others settle down in the same
N − 1 positions (“of the N − 1 chairs”), but with
the possibility that the outgoing particle be different
from the incoming one, and that the other particles
have reshuffled their “seating”.

Another remarkable version (also translation- and
rotation-invariant, as well as Hamiltonian) of the
N-body model in the plane (27) obtains if all the
“coupling constants” vanish except ω. Then all its
nonsingular solutions (which are given by the same
prescription indicated just above, except for the
replacement of 1/t with iω/[1 − exp(−iωt)] in the
right-hand side of (28)) are completely periodic, with
periods which are an integer multiple of T (see (21));
this integer is no larger than a number depending on
N, generally (much) smaller than N!. The domains of
phase space that give rise to solutions with different
periodicity are separated from each other by
boundaries characterized by lower-dimensional sets of
initial data yielding trajectories that run into
singularities corresponding to particle collisions
(note that when two or more particles collide their
individuality gets lost, and their velocities diverge).


Integrable many-body problems in spaces with 
arbitrary dimensions Integrable, or even solvable, 
many-body problems in spaces with more than two 
dimensions — with rotation-invariant equations of 
motion of Newtonian type — can be manufactured 
by starting from an appropriate integrable, or 
solvable, second-order matrix evolution equation, 
and by then parametrizing the evolving matrix in 
terms of multidimensional vectors so as to transform 
the matrix evolution equation into a covariant — 
hence rotation-invariant — system of evolution 
equations for these vectors, interpretable as New- 
tonian equations of motion of a many-body problem 
in multidimensional space. 
For instance the matrix equation 


\ddot{M} = A M + M A + M^3


is integrable. Here M = M(t) is a square matrix of 
arbitrary order and A is an arbitrary constant 
matrix. By parametrizing appropriately these two 
matrices one concludes that either one of the 
following two Newtonian systems of ODEs is 
integrable: 


N 


^ M N 
fnm = ) Qnvl um +t ) ) SIT í Fn) 


v=] pl v=] 


- M N 
‘nm > ) O gp yn + ) ) Top [fr i an 


v=] p=1 v-l 


Here N and M are arbitrary positive integers, the 
NM constants oy, are also arbitrary, the NM 
“particle coordinates” 7,, — r,4,(t) are S-vectors, 
with S an arbitrary positive integer, and the dots 
sandwiched among these S-vectors denote the 
standard scalar product in S-dimensional space. 

Let us emphasize the physical relevance of this 
class of many-body problems, characterized by 
linear and cubic forces. This is reinforced by the 
fact that these models are Hamiltonian. 


Nonlinear harmonic oscillators Two classes of 
integrable systems obtain from the classes written 
above by first setting to zero all the constants ay», 
and by then performing the change of variables 


\vec{w}_{nm}(t) = \exp(i\omega t)\, \vec{r}_{nm}(\tau), \quad \tau = \frac{\exp(i\omega t) - 1}{i\omega}    [29]


with ω > 0. The corresponding Newtonian equations
of motion read


" M N 
lnm — StWW yy — LW ym = ) X Uy (Wry TA 
pol vl 


a) EE MEE SS 


» M N 
Wm — Stut0ym = 20am = ) ) Uy (ww vi ` flnn) 
p=) 


v=1| 


These equations of motion cause the N M evolving
S-vectors \vec{w}_{nm} = \vec{w}_{nm}(t) to be complex (see the
second term in their left-hand sides), but a real
system (with double the number of dependent
variables) can be easily obtained by setting


\vec{w}_{nm} = \vec{u}_{nm} + i\, \vec{v}_{nm}


Remarkably (but clearly suggested by (29)), all the 
nonsingular solutions of each of these two many- 
body problems are completely periodic, with a 
period which is an integer multiple of the period T, 
see (21). This justifies the title given to this 
subsection. It also shows that these are isochronous 
systems (see Isochronous Systems). 


Integrable nonlinear PDEs 


As indicated in Section 1 another class of integrable 
systems are nonlinear evolution PDEs. In this 
section we outline (some of) their properties, 
focussing mainly on the Korteweg-de Vries PDE 
(4), the solution of which by C. S. Gardner, 
J. M. Greene, M. D. Kruskal and R. M. Miura in
the mid-sixties was the opening shot of a major 
scientific development which is still blooming. 
Other important early steps of this development 
were, in the late sixties, the introduction by P. D. 
Lax of what is now called the Lax pair technique, 
and at the beginning of the seventies the solution by 
V. E. Zakharov and A. B. Shabat of the Nonlinear 
Schrödinger equation (6), an evolution PDE of
great applicative importance. Subsequently many 
researchers developed various techniques to iden- 
tify, classify and investigate integrable nonlinear 
PDEs, a continuing activity for an overall appraisal 
of which the interested reader is referred to the 
bibliography reported below. 

Here we outline one of the approaches to 
obtaining these results; other approaches are tersely 
mentioned below. 
Identification and investigation of integrable
PDEs via the inverse spectral transform 
technique 


The class of linear dispersive evolution PDEs reads 
u_t(x,t) = -i\, \omega\!\left( -i \frac{\partial}{\partial x} \right) u(x,t), \quad -\infty < x < \infty    [30]


where the “dispersion function” ω(z) is, say, a (real)
polynomial (which must be odd to guarantee that
this PDE be real). The solution of this PDE is
achieved via the introduction of the Fourier transform
û(k, t),


u(x,t) = (2\pi)^{-1} \int_{-\infty}^{+\infty} dk\, \exp(i k x)\, \hat{u}(k,t)    [31a]


\hat{u}(k,t) = \int_{-\infty}^{+\infty} dx\, \exp(-i k x)\, u(x,t)    [31b]


whose evolution corresponding to (30) is then given 
by the simple linear ODE 


\hat{u}_t(k,t) = -i\, \omega(k)\, \hat{u}(k,t), \quad -\infty < k < \infty    [32a]

which can be immediately integrated:

\hat{u}(k,t) = \hat{u}(k,0)\, \exp[-i\, \omega(k)\, t]    [32b]


Thus the solution of the initial-value problem of (30) 
is achieved via three steps: (i) at the initial time one 
obtains the initial value of the Fourier transform, 
û(k, 0), from the initial datum u(x, 0) (via (31b)); (ii)
one then obtains û(k,t) (via (32b)); (iii) one finally
obtains u(x,t) (via (31a)). From these formulas the 
main features of the resulting phenomenology are 
easily evinced (even when the above integrals cannot 
be explicitly performed). 
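
The three steps can be mimicked on a computer by replacing the Fourier integrals with a discrete Fourier transform on a periodic grid. The Python sketch below is a discrete, 2π-periodic analogue and not the whole-line transform of the text. It assumes the dispersion function ω(k) = −k³, for which the linear PDE is u_t = −u_xxx; the naive O(n²) transform, the grid size and the initial datum are illustrative choices.

```python
import cmath
import math

def dft(values):
    """Naive discrete Fourier transform, the grid analogue of (31b)."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def inverse_dft(coeffs):
    """Naive inverse transform, the grid analogue of (31a)."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def evolve_linear(u0, omega, t):
    """Evolve a real 2*pi-periodic datum under u_t = -i omega(-i d/dx) u."""
    n = len(u0)
    u_hat0 = dft(u0)                                  # step (i)
    u_hat_t = []
    for k in range(n):
        wavenumber = k if k <= n // 2 else k - n      # signed integer wavenumber
        phase = cmath.exp(-1j * omega(wavenumber) * t)
        u_hat_t.append(u_hat0[k] * phase)             # step (ii)
    return [value.real for value in inverse_dft(u_hat_t)]  # step (iii)

n = 32
grid = [2.0 * math.pi * j / n for j in range(n)]
u0 = [math.cos(2 * x) + 0.5 * math.sin(3 * x) for x in grid]
t = 0.37

numeric = evolve_linear(u0, lambda k: -k ** 3, t)
# Each mode exp(i k x) evolves into exp(i (k x - omega(k) t)) = exp(i (k x + k^3 t)).
exact = [math.cos(2 * x + 8 * t) + 0.5 * math.sin(3 * x + 27 * t) for x in grid]
error = max(abs(a - b) for a, b in zip(numeric, exact))
print(error)
```

Because every Fourier mode merely acquires a phase, the two modes of the initial datum travel at different speeds, which is the dispersive spreading referred to in the text.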

A class of integrable nonlinear evolution PDEs 
reads 


u_t(x,t) = \alpha(R)\, u_x(x,t)    [33]


where the assigned function α(z) is again, say, a
(real) polynomial, while R is now the integrodifferential
“recursion operator” defined by the following
formula that specifies its action on a generic
function f(x,t) (vanishing asymptotically so as to
allow all integrations to converge):


R f(x,t) = f_{xx}(x,t) - 4 u(x,t)\, f(x,t) + 2 u_x(x,t) \int_x^{+\infty} dy\, f(y,t)    [34]


Note that the presence of the time variable t plays
no relevant role (it is merely parametric). A
remarkable property of this operator, which
depends on u(x,t), is that any power of it acting
on u_x(x,t) yields a nonlinear combination of u(x,t)
and its x-derivatives, without any left-over integration,
in fact yielding a result which is itself an exact
x-derivative, ready for exact integration in case of a
further application of R, see the last term in the
right-hand side of (34). For instance

R u_x = u_{xxx} - 6 u_x u = (u_{xx} - 3 u^2)_x,


R^2 u_x = (u_{xxxx} - 10 u_{xx} u - 5 u_x^2 + 10 u^3)_x,


and so on. Hence the simplest nonlinear evolution 
equation contained in the class (33) is the Korteweg- 
de Vries (KdV) equation 


u_t + u_{xxx} = 6 u_x u    [35]


(corresponding to α(z) = −z; and note the identity
with (4), via the trivial rescaling q(x,t) — 3 u(x, t)). 
Note that, if one neglects all nonlinear contribu- 
tions, the class (33) reduces to (30) with 


\omega(z) = -z\, \alpha(-z^2)
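
As a worked check of the first of the identities for powers of R, the short derivation below applies the recursion operator to u_x. It is a sketch that assumes the integral in (34) runs from x to +∞ and that u(x,t) vanishes as x → +∞; it then recovers the KdV equation (35) and the dispersion function of its linear part.

```latex
% Action of R on f = u_x, with \int_x^{+\infty} dy\, u_y(y,t) = -u(x,t):
R u_x = u_{xxx} - 4 u u_x + 2 u_x \int_x^{+\infty} dy\, u_y(y,t)
      = u_{xxx} - 4 u u_x - 2 u_x u
      = u_{xxx} - 6 u_x u
      = (u_{xx} - 3 u^2)_x .
% With \alpha(z) = -z, (33) reads u_t = -R u_x, that is
u_t + u_{xxx} = 6 u_x u ,
% and dropping the nonlinear term gives u_t = -u_{xxx}, i.e. (30) with
\omega(z) = -z\, \alpha(-z^2) = -z^3 .
```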


The solution of this class of nonlinear PDEs, (33), 
is given by a somewhat analogous procedure to that 
described above for the class of linear dispersive 
PDEs (30). 

Firstly, one introduces the spectral transform, a 
nonlinear generalization of the Fourier transform 
which indeed reduces to it if nonlinear effects are 
altogether neglected. That relevant for the class of 
PDEs (33) is based on the spectral problem 
associated with the linear Schrödinger operator 


L = -\frac{\partial^2}{\partial x^2} + u(x,t), \quad -\infty < x < \infty    [36]


Via it, the spectral transform 


S[u(x,t)] = \{ R(k,t),\ -\infty < k < \infty;\ p_n,\ \rho_n(t),\ n = 1, \ldots, N \}    [37]


is introduced. Here the function R(k,t) is the
“reflection coefficient” associated to the eigenvalue
k² of the continuous spectrum of L, while the
nonnegative number N gives the number of discrete
eigenvalues of L, and the positive quantities p_n and
ρ_n(t) are associated to these discrete eigenvalues,
specifically −p_n² are the “binding energies”, and
ρ_n(t) the “normalization coefficients”, associated to
the “bound states” possessed by the “potential”
u(x,t). (All this terminology comes from the
interpretation of the above spectral problem in
quantum-mechanical terms). And it can be shown not only
that there is a one-to-one correspondence between a
function u(x,t) and its spectral transform S[u(x,t)],
but moreover that both the direct spectral problem
to compute S[u(x,t)] from u(x,t) (arbitrarily
assigned within an appropriate class), and the
inverse spectral problem to compute u(x,t) from
S[u(x,t)] (arbitrarily assigned within an appropriate
class), only entail solving linear equations (an ODE
in the former case, a Fredholm integral equation in
the latter case).

Note that, in the above definition of the spectral 
transform, the time variable t plays merely a 
parametric role. But the usefulness of this spectral 
transform to solve the PDE (33) resides in the fact 
that, if u(x, t) evolves in time according to this PDE, 
the corresponding evolution of the spectral transform
is quite simple: the number N and the positive
numbers p_n are time-independent (as already
implied by our notation), while the time evolution
of the reflection coefficient R(k,t) and of the
normalization coefficients ρ_n(t) is given by the
simple linear ODEs


R_t(k,t) = 2 i k\, \alpha(-4k^2)\, R(k,t), \quad -\infty < k < \infty    [38a]

\dot{\rho}_n(t) = -2 p_n\, \alpha(4 p_n^2)\, \rho_n(t), \quad n = 1, \ldots, N    [38b]

which can be readily integrated:

R(k,t) = R(k,0)\, \exp[2 i k\, \alpha(-4k^2)\, t]    [39a]

\rho_n(t) = \rho_n(0)\, \exp[-2 p_n\, \alpha(4 p_n^2)\, t]    [39b]


Hence the solution of the initial-value problem for 
the class of nonlinear PDEs (33) can now be 
achieved via the following three steps: (i) at the 
initial time, via the solution of the direct spectral 
problem, the spectral transform S[u(x, 0)] (see (37)) 
is obtained (from u(x, 0), arbitrarily assigned within
an appropriate class); (ii) the spectral transform at
time t is then obtained via (39); (iii) by solving the
inverse spectral problem, u(x,t) is obtained from 
S[u(x, t)] (see (37)). 

The analogy of this procedure to that outlined 
above for the class of linear dispersive PDEs (30) is 
clear, and the fact that in this manner the solution 
of the initial-value problem for the nonlinear PDEs 
(33) can be achieved via a sequence of steps 
involving only the solution of linear problems is 
an indication of the integrable character of this 
class of nonlinear evolution PDEs. It also makes it
possible to gain considerable insight into the behavior
of these solutions, and to construct classes of
explicit solutions of these equations, as we now
indicate.


Solitons 


The integrable nonlinear PDE (33) possesses the 
single-soliton solution 


u(x,t) = -\frac{2 p^2}{\cosh^2\{ p\, [x - \xi(t)] \}}    [40a]

\xi(t) = (2p)^{-1} \log\left[ \frac{\rho(t)}{2p} \right] = \xi(0) + v t, \quad v = -\alpha(4 p^2)    [40b]


to which corresponds the simple spectral transform 
S[u(x,t)] = \{ R(k,t) = 0;\ p_1 = p,\ \rho_1(t) = \rho(t) = \rho(0) \exp[-2 p\, \alpha(4 p^2)\, t];\ N = 1 \}    [41]


This solution, (40), describes a localized wave of
constant shape moving with the constant speed v:
the “soliton”. It is characterized by two (real)
parameters, ξ(0) and p. The first identifies the
initial location of the soliton; its arbitrariness
corresponds to the translation-invariant character
of (33). The second, p, the spectral significance of
which is clear from (41), determines the shape of
the soliton (both its “height” 2p² and its “width” 1/p)
as well as its speed v (see (40b)); note that the
shape is identical for all the nonlinear evolution
PDEs of the class (33), while the speed depends on
the function α(z), see (40b), namely it depends on
which specific equation of the class (33) one is
considering. For instance for the KdV equation
(35), corresponding to α(z) = −z, the speed of the
soliton is


v = 4 p^2    [42]


thus all solitons of the KdV equation move from left
to right, and taller and thinner solitons move faster
than shorter and fatter ones.
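
That the single-soliton profile solves the KdV equation can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes the soliton in the form u(x,t) = −2p²/cosh²{p[x − ξ(0) − 4p²t]} and the KdV equation in the form u_t + u_xxx = 6 u_x u, and it estimates the residual of the equation with central finite differences. The step sizes, sample points and tolerance are arbitrary choices.

```python
import math

def soliton(x, t, p, xi0=0.0):
    """Single KdV soliton with height 2 p^2, width 1/p and speed 4 p^2."""
    xi = xi0 + 4.0 * p * p * t
    return -2.0 * p * p / math.cosh(p * (x - xi)) ** 2

def kdv_residual(x, t, p, h=5e-3, dt=1e-3):
    """Finite-difference estimate of u_t + u_xxx - 6 u u_x at the point (x, t)."""
    def u(xx, tt):
        return soliton(xx, tt, p)
    u_t = (u(x, t + dt) - u(x, t - dt)) / (2 * dt)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t)
             + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return u_t + u_xxx - 6.0 * u(x, t) * u_x

# Sample the residual on -6 <= x <= 6 at time t = 0.3 for p = 0.8.
worst = max(abs(kdv_residual(0.1 * j, 0.3, p=0.8)) for j in range(-60, 61))
print(worst)
```

The residual is limited only by the finite-difference truncation error; replacing the speed 4p² by any other value makes it of order one.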

More generally, every PDE of the class (33) 
possesses the N-soliton solution 


u(x,t) = -2\, \frac{\partial^2}{\partial x^2} \log \det[\, I + C(x,t) \,]    [43a]


Here I is the N × N unit matrix and C(x,t) is the
N × N matrix


C_{nm}(x,t) = [\rho_m(t)\, \rho_n(t)]^{1/2}\, \frac{\exp[-(p_m + p_n) x]}{p_m + p_n}    [43b]


where the time-evolution of the ρ_n(t)'s is given by
(39b). Indeed the spectral transform of this solution
is given by (37) with R(k,t) = 0 and ρ_n(t) given by
(39b). To discuss the multisolitonic phenomenology,
let us focus on the KdV equation, so that the speed
of each soliton is given by the simple formula (42),
and let us order the N positive numbers p_n in
increasing order,


p_1 < p_2 < \cdots < p_N


so that the corresponding soliton velocities,
v_n = 4p_n², are as well ordered in increasing order:


v_1 < v_2 < \cdots < v_N


The N-soliton solution (43) is not so transparent, 
especially if N is large, but it becomes quite simple 
in the remote past and future: 


u(x,t) \to -2 \sum_{n=1}^{N} \frac{p_n^2}{\cosh^2\{ p_n [x - \xi_n(t)] \}},


\xi_n(t) = \xi_n^{(\pm)} + v_n t, \quad t \to \pm\infty


with the 2N (real) constants ξ_n^{(±)} related to one
another (see below). It is thus seen that, both in the 
remote past and future, the N-soliton solution (43) 
splits into the sum of N separated solitons. In the 
remote past the solitons are ranged, from left to 
right, in order of decreasing amplitude, and they 
move to the right with speeds ordered in decreasing 
magnitude; then the taller and faster solitons 
gradually catch up and eventually “overtake” the 
fatter and slower ones (the quotation marks under- 
score the fact that whenever two, or possibly more, 
solitons get together, their individuality is in fact 
lost: for a while the solution might have just one 
peak, or instead the *overtaking" of two solitons 
may rather appear as an “exchange of identity", 
with the taller soliton becoming fatter and the fatter 
becoming taller as they get close together until they 
separate again because the one in front, having 
become taller, speeds up while the one behind, 
having become fatter, slows down). The final out- 
come is of course that the order of the solitons gets 
altogether reversed, with the taller and faster head- 
ing the escape to the right. The most remarkable 
aspect of this phenomenology is that precisely the 
same solitons that existed in the remote past are 
found in the remote future, the only effect of their 
“interaction” having been to shift the position of the 
n-th soliton, relative to what it would have been if it 
had been moving in isolation, by the amount 


\Delta_n = \xi_n^{(+)} - \xi_n^{(-)}


These N shifts are moreover determined (while
either the N quantities ξ_n^{(−)} or the N quantities ξ_n^{(+)}
can be arbitrarily assigned), being given by the
simple rule


\Delta_n = \sum_{m=1}^{n-1} \Delta(p_n, p_m) - \sum_{m=n+1}^{N} \Delta(p_n, p_m)    [44a]


Of course in (44a) a sum vanishes if its lower limit
exceeds its upper limit.

This formula, (44), has a simple phenomenological
significance. From the two-soliton case (N = 2) it is
seen that in a two-body encounter the taller and
faster soliton gets advanced by the amount
Δ(p_2, p_1), while the slower and fatter one gets
delayed by the amount Δ(p_1, p_2). Hence the overall
shift (44) experienced by the n-th soliton in the
N-soliton case is the sum of the n − 1 positive shifts
derived from its “overtaking” n − 1 slower solitons
and the N − n negative shifts derived from its being
“overtaken” by N − n faster solitons. This outcome
is obvious when each two-soliton encounter occurs
separately, but is quite nontrivial in the general case
when, at some intermediate time, several solitons
might all encounter one another simultaneously.

This soliton phenomenology strongly suggests
ascribing to each soliton an individuality, even 
though in configuration space it only shows up as 
a separate entity in the remote past and future. The 
separated identity of each soliton is instead quite 
clear in the spectral transform context, since each of 
them corresponds to a (time-independent) discrete 
eigenvalue of the spectral problem. Indeed in the 
spectral context this identity is clear also for the 
generic solution of the class of integrable nonlinear 
PDEs (33) which, in contrast to the purely solitonic 
solution (43), is not characterized by a vanishing
reflection coefficient R(k,t). And indeed, even in 
configuration space, the soliton phenomenology 
described above is still featured by a generic solution 
(each of which is characterized, via its spectral 
transform (37), by the number N of its solitons), up 
to the additional presence of a “background” 
component of this solution (corresponding to the 
nonvanishing reflection coefficient R(k,t)), which 
however behaves in a manner analogous to the 
solution of the linear, dispersive part of the PDE 
under consideration, becoming eventually locally 
small due to its dispersive character. 


[44b] 


Kinks, breathers, boomerons and trappons,
dromions The solitonic phenomenology described 
above for the class of integrable PDEs (33), and in 
particular for the KdV equation (35), is more or less 
common to all integrable nonlinear evolution PDEs — 
of which many other classes exist besides (33). But 
there also are some significant differences, some of 
which we now review tersely. 

For certain integrable PDEs the typical shape of 
the soliton is not localized, but it rather has the form 


of a “kink”. Some integrable PDEs also feature 
additional kinds of localized “solitons” which, in 
isolation, move overall with constant speed as 
ordinary solitons, but feature in addition a time- 
dependent amplitude modulation and are therefore 
called “breathers”. For integrable matrix nonlinear 
evolution PDEs — or, equivalently, for integrable 
systems of coupled PDEs — the new phenomenology 
may emerge of solitons that, even in isolation, move 
with a variable speed, the change of which over 
time is correlated with the variable interplay of 
the amplitudes of the different components of the 
solution: typically such solitons come in from one 
side in the remote past and boomerang back to that 
side in the remote future (“boomerons”), or they 
may be trapped to oscillate around some fixed 
position (“trappons”); and there are integrable 
evolution equations in which both these types of 
solitons are simultaneously present in a generic 
solution. All these phenomenologies refer to the 
simpler class of integrable evolution PDEs in 1+1 
(one space and one time) variables, with asympto- 
tically vanishing boundary conditions (at large space 
distances; or perhaps asymptotically constant, as in 
the case of kinks). There also exist integrable 
evolution PDEs in 2+ 1 dimensions (such as the 
KP equation (9)) the generic solution of which may 
feature localized soliton-like components, although 
in this case appropriate boundary conditions play a 
crucial role (for this reason such solitons have been 
called “dromions”, hinting at their being to some
extent driven by the boundary conditions, as objects 
moving in a stadium). 

While there are quite many (classes of) integrable 
PDEs in 1+ 1 dimensions, there are only a few in 
2+ 1 dimensions, and there is a widespread belief 
that no integrable PDEs exist in D + 1 dimensions 
with D > 2. But already in the early days of soliton 
theory it was pointed out that there do exist quite 
many (classes of) integrable PDEs in 1 + D dimensions
(namely, one space and D time variables) and
that it is quite possible via a different formulation of 
the initial-value problem to interpret such equations 
as (no less integrable) PDEs in D + 1 dimensions (D 
space and one time variables); and integrable PDEs 
in D +1 dimensions have also been identified and 
investigated in the context of (the simpler class of) 
C-integrable PDEs (see below). 


Other properties of integrable PDEs 


For the linear evolution equations (30) the main 
message implied by their solvability via the Fourier 
transform is that the time-evolution is much simpler 
in Fourier space (see (32)) than in configuration 
space. This has a profound impact on the 
understanding of all phenomena describable by such 
equations, to the extent of determining the kind of 
experimental tools better suited to understand the 
underlying physics (for instance, the use of 
monochromatic beams of light, the use of high-energy 
particle accelerators, and so on). The same kind of 
message is relevant as well for the class of integrable 
nonlinear PDEs solvable via the spectral transform 
technique, even more so inasmuch as the 
time-evolution is in this case so much simpler in the 
spectral space (being actually linear there, see (38) 
and (39)) than in configuration space (where the 
evolution is nonlinear, see (33)). It is indeed the 
basis for the possession by the class of integrable 
nonlinear PDEs (33) of several other remarkable 
properties, as outlined tersely in the following 
subsections. 


Bäcklund transformations  A Bäcklund transformation 
is a formula relating two functions, say $u^{(0)}(x,t)$ 
and $u^{(1)}(x,t)$, so that, if one of them satisfies a 
(generally nonlinear) PDE, the other one satisfies the 
same PDE. In the context of the class (33) of 
integrable PDEs, such a (class of) Bäcklund 
transformations is provided by the formula 

$$g(\Lambda)\,\bigl[u^{(0)}(x,t) - u^{(1)}(x,t)\bigr] + h(\Lambda)\,\Gamma \cdot 1 = 0$$ [45] 


where g(z) and h(z) are two (a priori arbitrary) 
entire functions (say, two polynomials), while $\Lambda$ and 
$\Gamma$ are two integrodifferential operators the effect of 
which on a function f(x,t) (such that all relevant 
integrations are convergent) reads 

$$\Gamma f(x,t) = \bigl[u_x^{(0)}(x,t) + u_x^{(1)}(x,t)\bigr] f(x,t) + \bigl[u^{(0)}(x,t) - u^{(1)}(x,t)\bigr] \int_x^{+\infty} dy\, \bigl[u^{(0)}(y,t) - u^{(1)}(y,t)\bigr] f(y,t)$$ [46a] 

$$\Lambda f(x,t) = f_{xx}(x,t) - 2\bigl[u^{(0)}(x,t) + u^{(1)}(x,t)\bigr] f(x,t) + \Gamma \int_x^{+\infty} dy\, f(y,t)$$ [46b] 

Note that here the variable t plays no relevant role 
(its presence is merely parametric), and that $\Gamma$ and $\Lambda$ 
depend (in a symmetrical way) on $u^{(0)}(x,t)$ and 
$u^{(1)}(x,t)$, whose presence causes the Bäcklund 
transformation (45) to be nonlinear in these 
functions. Also important is the observation that, 
for $u^{(0)}(x,t) = u^{(1)}(x,t) = u(x,t)$, the operator $\Lambda$ 
becomes the recursion operator R, see (34). 

The reason why the formulas (45) constitute a 
class of Bäcklund transformations is that, as a 
property of the spectral transform based on the 
linear Schrödinger operator L, see (36), if two 
“potentials” $u^{(0)}(x,t)$ and $u^{(1)}(x,t)$ are related by 
(45), the corresponding “reflection coefficients” 
$R^{(0)}(k,t)$ and $R^{(1)}(k,t)$ are related algebraically, as 
follows: 


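$$g(-4k^2)\,\bigl[R^{(0)}(k,t) - R^{(1)}(k,t)\bigr] + 2ik\,h(-4k^2)\,\bigl[R^{(0)}(k,t) + R^{(1)}(k,t)\bigr] = 0$$ [47a] 

entailing 

$$R^{(1)}(k,t) = R^{(0)}(k,t)\, \frac{g(-4k^2) + 2ik\,h(-4k^2)}{g(-4k^2) - 2ik\,h(-4k^2)}$$ [47b] 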


Clearly this formula entails that, if $R^{(0)}(k,t)$ satisfies 
(38a), so does $R^{(1)}(k,t)$ (the ratio in (47b) does not 
depend on time, and (38a) is linear in the reflection 
coefficient). Hence, as the fact that 
$R^{(0)}(k,t)$ satisfies (38a) is a consequence of the fact 
that $u^{(0)}(x,t)$ satisfies (33), likewise the fact that 
$R^{(1)}(k,t)$ satisfies (38a) provides the basis for 
concluding that $u^{(1)}(x,t)$ also satisfies (33). 

The simplest version of the Bäcklund transformation 
(45) obtains by setting $g(z) = -2p\,h(z)$ with p 
an arbitrary constant, hence it reads 

$$w_x^{(0)}(x,t) + w_x^{(1)}(x,t) = 2p\,\bigl[w^{(0)}(x,t) - w^{(1)}(x,t)\bigr] - \tfrac{1}{2}\bigl[w^{(0)}(x,t) - w^{(1)}(x,t)\bigr]^2$$ [48] 

Here and below we use for convenience the 
functions $w^{(j)}(x,t)$ related to $u^{(j)}(x,t)$ as follows: 

$$w^{(j)}(x,t) = \int_x^{+\infty} dy\, u^{(j)}(y,t), \qquad w_x^{(j)}(x,t) = -u^{(j)}(x,t)$$ [49] 


A convenient application of Bäcklund 
transformations is to yield new solutions of (33) from 
known solutions; for instance from the trivial 
solution $u^{(0)}(x,t) = w^{(0)}(x,t) = 0$ the single-soliton 
solution (40) can be readily obtained via (48) and 
(49) (of course an appropriate time-dependence 
must be attributed to the x-independent “integration 
constant” that obtains from the integration 
of (48), which is an ODE in the independent 
variable x). 
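As a numerical sanity check of this construction, the following minimal sketch assumes that (48) has the form w_x^(0) + w_x^(1) = 2p[w^(0) - w^(1)] - (1/2)[w^(0) - w^(1)]^2 and that the single-soliton profile (40) is the standard -2p^2/cosh^2 shape. With w^(0) = 0 it verifies, by central differences, that w^(1) = -2p{1 - tanh[p(x - ξ)]} solves the resulting ODE and that u^(1) = -w_x^(1) reproduces the soliton profile. The names `w1`, `bt_residual`, `u1`, `soliton` and the values of `p` and `xi` (the integration constant, frozen at one instant of time) are illustrative choices, not taken from the text.

```python
import math

def w1(x, p, xi):
    """Candidate solution of (48) with w0 = 0: w1 = -2p [1 - tanh(p (x - xi))]."""
    return -2.0 * p * (1.0 - math.tanh(p * (x - xi)))

def bt_residual(x, p, xi, h=1e-5):
    """Residual of w1_x = -2 p w1 - w1**2 / 2, which is (48) with w0 = 0."""
    w = w1(x, p, xi)
    w_x = (w1(x + h, p, xi) - w1(x - h, p, xi)) / (2.0 * h)  # central difference
    return w_x - (-2.0 * p * w - 0.5 * w * w)

def u1(x, p, xi, h=1e-5):
    """u = -w_x, see (49), evaluated by a central difference."""
    return -(w1(x + h, p, xi) - w1(x - h, p, xi)) / (2.0 * h)

def soliton(x, p, xi):
    """Assumed single-soliton profile -2 p^2 / cosh^2[p (x - xi)]."""
    return -2.0 * p * p / math.cosh(p * (x - xi)) ** 2

p, xi = 0.8, 0.3                                  # illustrative parameters
grid = [-6.0 + 0.25 * i for i in range(49)]       # x in [-6, 6]
max_residual = max(abs(bt_residual(x, p, xi)) for x in grid)
max_profile_gap = max(abs(u1(x, p, xi) - soliton(x, p, xi)) for x in grid)
print(max_residual, max_profile_gap)
```

Both printed numbers should be at the level of the finite-difference error. The time-dependence of ξ is not tested here, since it is fixed by the evolution equation rather than by (48).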

Another important property of Bäcklund 
transformations is their commutativity. Consider two sets 
of two polynomials, $g^{(m)}(z)$ and $h^{(m)}(z)$, m = 1, 2, and 
the two Bäcklund transformations (45) they generate, 
say BT1 and BT2. Take as starting point some 
function $u^{(0)}(x)$ and associate to it two functions, 
$u^{(1)}(x)$ respectively $u^{(2)}(x)$, obtained from $u^{(0)}(x)$ via 
these two Bäcklund transformations, BT1 respectively 
BT2. Then obtain a new function, say $u^{(12)}(x)$, 
from $u^{(1)}(x)$ via BT2; and likewise obtain $u^{(21)}(x)$ 
from $u^{(2)}(x)$ via BT1. The property of commutativity 
entails that, provided an appropriate choice is 
made of integration constants (see (45)), 

$$u^{(12)}(x) = u^{(21)}(x)$$ [50] 


This property is highly nontrivial when viewed, as 
we just did, in configuration space; it is instead 
rather obvious in the spectral space, indeed the 
corresponding property for the “reflection 
coefficients” reads (in self-evident notation, see (47b)) 

$$R^{(12)}(k) = R^{(21)}(k) = R^{(0)}(k)\, B^{(1)}(k)\, B^{(2)}(k)$$ [51a] 

$$B^{(m)}(k) = \frac{g^{(m)}(-4k^2) + 2ik\,h^{(m)}(-4k^2)}{g^{(m)}(-4k^2) - 2ik\,h^{(m)}(-4k^2)}, \qquad m = 1, 2$$ [51b] 

hence it corresponds simply to the commutativity of 
the ordinary product. 


Nonlinear superposition principle  Another 
remarkable property of the class of evolution 
equations (33) is a straightforward consequence of 
the commutativity property, (50), of Bäcklund 
transformations. It reads (hereafter with a slight 
abuse of language we refer to “solutions” $w^{(j)}$ even 
though the actual solutions are the functions $u^{(j)}$ 
related to the $w^{(j)}$ by (49)) 

$$w^{(12)} = w^{(21)} = w^{(0)} - \frac{2(p_1 + p_2)\,\bigl[w^{(1)} - w^{(2)}\bigr]}{2(p_1 - p_2) + w^{(1)} - w^{(2)}}$$ [52] 

where $w^{(0)} = w^{(0)}(x,t)$ is an arbitrary solution of 
(33), $w^{(1)} = w^{(1)}(x,t)$ respectively $w^{(2)} = w^{(2)}(x,t)$ 
are likewise the solutions of the same PDE related 
to $w^{(0)}$ by the Bäcklund transformation (48) with 
$p = p_1$ respectively $p = p_2$, and $w^{(12)}(x,t) = w^{(21)}(x,t)$ 
is another solution of the same PDE. Note that this 
formula, for which the title of this subsection seems 
appropriate, provides a completely explicit, rational 
expression of a new solution of (33) in terms of 
three other solutions of the same equation: an 
arbitrary solution $w^{(0)}$, and the two solutions $w^{(1)}$ 
and $w^{(2)}$ related to it by a simple Bäcklund 
transformation, see (48). 


Soliton ladder  A simple application of the preceding 
formula is to start from the trivial solution 

$$w^{(0)} = 0$$ [53] 

so that (see (48)) 

$$w^{(j)}(x,t) = -2p_j \Bigl\{ 1 - \tanh\bigl\{ p_j \bigl[ x - x_0^{(j)} + \alpha(4p_j^2)\,t \bigr] \bigr\} \Bigr\}, \qquad j = 1, 2$$ [54a] 

where, in order that this function be real, either 

$$\mathrm{Im}\bigl[x_0^{(j)}\bigr] = 0$$ [54b] 

or 

$$\mathrm{Im}\bigl[x_0^{(j)}\bigr] = \frac{\pi}{2p_j}$$ [54c] 

Via (49), the expression (54a) with (54b) yields, for 
each value of j, a version of the single-soliton 
solution (40). Insertion of (53) and (54a) in (52) 
yields, via (49), the two-soliton solution of (33), 
provided $0 < p_1 < p_2$ and $x_0^{(1)}$ satisfies (54b) while 
$x_0^{(2)}$ satisfies (54c) (otherwise the solution produced 
by (52) is complex or singular). 
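The construction just described can be sketched numerically. The example below is only a sketch under stated assumptions: it takes (48) as w_x^(0) + w_x^(1) = 2p[w^(0) - w^(1)] - (1/2)[w^(0) - w^(1)]^2, (52) as w^(12) = w^(0) - 2(p_1 + p_2)[w^(1) - w^(2)]/[2(p_1 - p_2) + w^(1) - w^(2)], and chooses α(z) = -z (the KdV case) for definiteness. With (54c) the hyperbolic tangent in (54a) becomes a hyperbolic cotangent, which is how `w_single` implements the singular branch. The script checks that the superposed function is related to the single soliton w^(1) by the Bäcklund transformation (48) with parameter p_2, and that it has the limits 0 and -4(p_1 + p_2) at the two ends of the line. All names and parameter values are illustrative.

```python
import math

P1, P2 = 0.5, 1.0      # soliton parameters, 0 < P1 < P2
X1, X2 = 0.0, 0.123    # real parts of the constants in (54a)

def alpha(z):
    # Assumed KdV choice alpha(z) = -z; another polynomial could be used instead.
    return -z

def w_single(x, t, p, x0, singular=False):
    """(54a): tanh branch for (54b); the branch (54c) replaces tanh by coth."""
    th = math.tanh(p * (x - x0 + alpha(4.0 * p * p) * t))
    if singular:
        th = 1.0 / th  # coth; undefined exactly where the argument vanishes
    return -2.0 * p * (1.0 - th)

def superpose(w0, w1, w2, p1, p2):
    """Nonlinear superposition formula (52), in the form assumed above."""
    return w0 - 2.0 * (p1 + p2) * (w1 - w2) / (2.0 * (p1 - p2) + w1 - w2)

def w_two(x, t):
    """Two-soliton function built from w0 = 0 and the two branches of (54a)."""
    w1 = w_single(x, t, P1, X1)
    w2 = w_single(x, t, P2, X2, singular=True)
    return superpose(0.0, w1, w2, P1, P2)

def bt_residual(x, t, h=1e-5):
    """Residual of (48) with parameter P2, relating the single soliton to w_two."""
    wa = lambda y: w_single(y, t, P1, X1)
    wb = lambda y: w_two(y, t)
    lhs = (wa(x + h) - wa(x - h) + wb(x + h) - wb(x - h)) / (2.0 * h)
    d = wa(x) - wb(x)
    return lhs - (2.0 * P2 * d - 0.5 * d * d)

grid = [-10.0 + 0.5 * i for i in range(41)]   # avoids the pole of the coth branch
max_residual = max(abs(bt_residual(x, t)) for x in grid for t in (-1.0, 0.0, 1.0))
left_limit, right_limit = w_two(-40.0, 0.0), w_two(40.0, 0.0)
print(max_residual, left_limit, right_limit)
```

Although the intermediate function w^(2) is singular, the superposed function stays finite on the whole grid, in line with the restriction 0 < p_1 < p_2 stated above.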

Having thus obtained the two-soliton solution, 
one can apply the nonlinear superposition formula 
(52) to get the three-soliton solutions, by inserting in 
place of $w^{(0)}$ the single-soliton expression (54a) 
(with parameter, say, $p_1$) and in place of $w^{(1)}$ and 
$w^{(2)}$ the two-soliton expression (with parameters $p_1$ 
and $p_2$, respectively $p_1$ and $p_3$); and the process can 
be continued, as suggested by the title of this 
subsection. In this manner the multisolitonic 
solution can be constructed by a sequence of purely 
algebraic operations; and simple rules can be given, 
detailing the restrictions on the soliton parameters $p_j$ 
and the reality properties of the constants $x_0^{(j)}$ ((54b) 
or (54c)) to ensure that the solution so arrived at be 
real and nonsingular, and thus coincide with (43). 


Conservation laws  As mentioned above, integrable 
evolution PDEs are interpretable as 
infinite-dimensional dynamical systems. It is therefore 
natural that they possess an infinite number of 
conserved quantities. For instance every PDE of the 
class (33) possesses the following infinite sequence 
of conserved quantities: 

$$C_n = \frac{(-1)^n}{2n+1} \int_{-\infty}^{+\infty} dx\, R^n \bigl[x\,u_x(x,t) + 2u(x,t)\bigr], \qquad n = 0, 1, 2, \ldots$$ [55a] 

where R is the recursion operator (34). An 
alternative definition for this sequence is 

$$C_n = \frac{(-1)^n}{2n+1} \int_{-\infty}^{+\infty} dx\, \tilde{R}^n u(x,t), \qquad n = 0, 1, 2, \ldots$$ [55b] 

where the integrodifferential operator $\tilde{R}$ is in some 
sense the adjoint of R, being defined by the 
formula 

$$\tilde{R} f(x,t) = f_{xx}(x,t) - 4u(x,t)\,f(x,t) - 2\int_x^{+\infty} dy\, u_y(y,t)\,f(y,t)$$ [55c] 

that specifies its action on a generic function f(x,t) 
(such that the integration converges). The first 3 of 
these conserved quantities read as follows: 

$$C_0 = \int_{-\infty}^{+\infty} dx\, u(x,t),$$ 
$$C_1 = \int_{-\infty}^{+\infty} dx\, u^2(x,t),$$ 
$$C_2 = \int_{-\infty}^{+\infty} dx\, \bigl[2u^3(x,t) + u_x^2(x,t)\bigr]$$ 


These constants of the motion (55) are functionally 
independent and, in the context of a Hamiltonian 
formulation characterized by the Poisson bracket 

$$\{A, B\} = \int_{-\infty}^{+\infty} dx\, \frac{\delta A}{\delta u(x)}\, \frac{\partial}{\partial x}\, \frac{\delta B}{\delta u(x)}$$ 

(where A and B are functionals of u(x) and $\delta/\delta u(x)$ 
denotes the functional derivative), they are in 
involution, 

$$\{C_n, C_m\} = 0$$ 

Note that, in this context, the KdV PDE (35) 
coincides with the Hamiltonian equation 

$$u_t(x,t) = \{u(x,t), H\} = \frac{\partial}{\partial x}\, \frac{\delta H}{\delta u(x,t)}$$ 

with 

$$H = \tfrac{1}{2} C_2 = \tfrac{1}{2} \int_{-\infty}^{+\infty} dx\, \bigl[2u^3(x,t) + u_x^2(x,t)\bigr]$$ 


Several alternative sequences of constants of 
motion also exist. For instance another infinite 
sequence is provided by the two equivalent formulas 

$$c_n = (-1)^n \int_{-\infty}^{+\infty} dx\, u(x,t)\, \hat{R}^{2n} \cdot 1$$ [56a] 

$$c_n = (-1)^n \int_{-\infty}^{+\infty} dx\, \Lambda_0^n\, u(x,t)$$ [56b] 

with the integrodifferential operators $\hat{R}$ and $\Lambda_0$ 
defined by the formulas 

$$\hat{R} f(x,t) = f_x(x,t) + \int_x^{+\infty} dy\, u(y,t)\,f(y,t)$$ 

$$\Lambda_0 f(x,t) = f_{xx}(x,t) - 2u(x,t)\,f(x,t) + u_x(x,t) \int_x^{+\infty} dy\, f(y,t) + u(x,t) \int_x^{+\infty} dy\, u(y,t) \int_y^{+\infty} dz\, f(z,t)$$ 

Note that the integrodifferential operator $\Lambda_0$ is just 
$\Lambda$, see (46), with $u^{(0)}(x,t) = 0$ and $u^{(1)}(x,t) = u(x,t)$. 
The constants $c_n$ are also all independent of each 
other, but there is a relationship between the 
constants of the two sequences, (55) and (56), 

$$\sum_{n=0}^{\infty} c_n z^{2n+1} = \sin\Bigl[\sum_{n=0}^{\infty} C_n z^{2n+1}\Bigr]$$ 

which is to be understood by expanding the 
right-hand side in powers of z and then equating the 
coefficients of equal powers of z: 

$$c_0 = C_0,$$ 
$$c_1 = C_1 - \tfrac{1}{6} C_0^3,$$ 
$$c_2 = C_2 - \tfrac{1}{2} C_0^2 C_1 + \tfrac{1}{120} C_0^5$$ 

and so on. 

Of course all these conservation laws are applicable 
to the class of solutions of (33) defined for all 
(real) values of x and vanishing asymptotically (as 
$x \to \pm\infty$). But they can also be reformulated as local 
“continuity equations”. And, rather remarkably, 
all these results hold as well for the explicitly 
time-dependent class of PDEs that obtains if one allows 
the polynomial $\alpha(z)$ in the right-hand side of (33) to 
feature an arbitrary time-dependence, say 

$$\alpha(z,t) = \sum_{m=0}^{M} \alpha_m(t)\, z^m$$ [57] 


Finally let us note that there is an additional 
conserved quantity for this (generalized) class of 
PDEs, 

$$C = \int_{-\infty}^{+\infty} dx\, \Bigl[ x\,u(x,t) + \int_0^t dt'\, \alpha(\tilde{R}, t')\, u(x,t) \Bigr]$$ 

with $\tilde{R}$ defined by (55c). This implies that, for the 
generic solution of this (generalized) class of PDEs, 
the center of mass 

$$X(t) = \frac{\int_{-\infty}^{+\infty} dx\, x\,u(x,t)}{\int_{-\infty}^{+\infty} dx\, u(x,t)}$$ 

moves according to the formula 

$$X(t) = X_0 + \sum_{m=0}^{M} (-1)^{m+1} (2m+1)\, \frac{C_m}{C_0} \int_0^t dt'\, \alpha_m(t'), \qquad X_0 = \frac{C}{C_0}$$ 

Hence for all the autonomous evolution PDEs of the 
class (33) (with $\alpha(z,t) = \alpha(z)$, $\alpha_m(t) = \alpha_m$, see (57)) 
the center of mass of the generic solution moves 
uniformly, 

$$X(t) = X_0 + V t$$ 

with the (constant) speed 

$$V = \sum_{m=0}^{M} (-1)^{m+1} (2m+1)\, \frac{C_m}{C_0}\, \alpha_m$$ 
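The first conserved quantities and the center-of-mass speed can be checked on a single soliton. The sketch below assumes the profile u(x) = -2p^2/cosh^2(px), the expressions C_0 = ∫u dx, C_1 = ∫u^2 dx, C_2 = ∫[2u^3 + u_x^2] dx, and the KdV choice α(z) = -z (so that α_1 = -1 and V = -3 C_1/C_0); it compares a trapezoidal quadrature with the closed forms -4p, 16p^3/3 and -64p^5/5, which follow from elementary integrals of powers of sech. The helper `integrate`, the cutoff of the integration interval and the value of `p` are illustrative.

```python
import math

def soliton(x, p):
    """Assumed single-soliton profile at a fixed time, centered at the origin."""
    return -2.0 * p * p / math.cosh(p * x) ** 2

def soliton_x(x, p):
    """Exact x-derivative of the profile above."""
    return 4.0 * p ** 3 * math.tanh(p * x) / math.cosh(p * x) ** 2

def integrate(f, a=-40.0, b=40.0, n=80000):
    """Composite trapezoidal rule; the integrands decay exponentially."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

p = 0.7  # illustrative soliton parameter
C0 = integrate(lambda x: soliton(x, p))
C1 = integrate(lambda x: soliton(x, p) ** 2)
C2 = integrate(lambda x: 2.0 * soliton(x, p) ** 3 + soliton_x(x, p) ** 2)

# Speed of the center of mass for alpha(z) = -z, i.e. alpha_1 = -1 and M = 1
V = -3.0 * C1 / C0
print(C0, C1, C2, V)
```

For this choice of α the speed V equals 4p^2, the velocity of the KdV soliton with parameter p, as expected for a solution consisting of a single soliton.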


Other techniques to identify, classify 
and investigate integrable PDEs 


The spectral transform approach on which we 
focussed above is just one of the various techniques 
used to identify and investigate integrable nonlinear 
evolution PDEs. (Incidentally, because the less 
standard aspect of this approach is the inverse 
transformation to reconstruct, in the framework of 
the spectral problem, the “potential” u(x) from its 
spectral transform, this approach is often called the 
Inverse Spectral, or Scattering, Transform method, 
abbreviated as IST.) In this subsection we tersely 
mention some other approaches, referring to the 
literature indicated below for more adequate 
treatments. 

One approach starts from a trivially integrable 
PDE (say, linear and autonomous, see for instance 
(30)) and performs a nonlinear change of 
dependent, and possibly as well of independent, 
variables. The PDE thus obtained is generally 
integrable, indeed the term C-integrable is used to 
denote such equations (to distinguish them from 
the S-integrable equations solvable via IST: the 
letter C refers to the Change of variables, the letter 
S to the Spectral, or Scattering, transform). A 
simple instance of C-integrable equations is the 
Burgers equation (5), which is linearized via the 
change of dependent variable 

$$\varphi(x,t) = u(x,t) \exp\Bigl[\int_{-\infty}^{x} dy\, u(y,t)\Bigr], \qquad u(x,t) = \frac{\varphi(x,t)}{1 + \int_{-\infty}^{x} dy\, \varphi(y,t)}$$ 

entailing the linear PDE 

$$\varphi_t - \varphi_{xx} = 0$$ 
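The linearization can be illustrated with a short numerical check. The sketch assumes that the Burgers equation (5) has the form u_t = u_xx + 2 u u_x, and uses the change of variable in its Cole-Hopf form u = ψ_x/ψ, where ψ solves the heat equation ψ_t = ψ_xx (ψ plays the role of one plus the integral of the linearizing variable). The particular ψ = 1 + exp(Kx + K^2 t), the constant `K` and the function names are illustrative choices; the resulting u is a kink-like profile.

```python
import math

K = 1.3  # illustrative wavenumber

def psi(x, t):
    """psi = 1 + exp(K x + K^2 t) solves the heat equation psi_t = psi_xx."""
    return 1.0 + math.exp(K * x + K * K * t)

def u(x, t):
    """u = psi_x / psi, the kink obtained from psi."""
    e = math.exp(K * x + K * K * t)
    return K * e / (1.0 + e)

def burgers_residual(x, t, h=1e-3):
    """u_t - u_xx - 2 u u_x by central differences (assumed form of Burgers)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx - 2.0 * u(x, t) * u_x

points = [(-5.0 + 0.5 * i, t) for i in range(21) for t in (0.0, 0.5)]
max_residual = max(abs(burgers_residual(x, t)) for x, t in points)
print(max_residual)
```

The kink interpolates between 0 at large negative x and K at large positive x, and it moves to the left with speed K.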
A second example is the “Liouville equation” 

$$u_{xt} = \exp(u)$$ [58a] 

or equivalently, in “light-cone coordinates” ($\xi = x + t$, 
$\tau = x - t$) 

$$u_{\tau\tau} - u_{\xi\xi} = \exp(u)$$ [58b] 

the general solution of which reads 

$$u(x,t) = f(x) + g(t) - 2 \log\Bigl\{ a \int_{x_0}^{x} dx'\, \exp[f(x')] + (2a)^{-1} \int_{t_0}^{t} dt'\, \exp[g(t')] \Bigr\}$$ 

with f(x) and g(t) arbitrary functions and $x_0$, $t_0$, a 
arbitrary constants. And a third example is the 
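Eckhaus equation 

$$i\psi_t + \psi_{xx} + 2\bigl(|\psi|^2\bigr)_x \psi + |\psi|^4 \psi = 0$$ [59] 

which is linearized by the transformation 

$$\varphi(x,t) = \psi(x,t) \exp\Bigl[\int_{-\infty}^{x} dy\, |\psi(y,t)|^2\Bigr], \qquad \psi(x,t) = \frac{\varphi(x,t)}{\bigl[1 + 2\int_{-\infty}^{x} dy\, |\varphi(y,t)|^2\bigr]^{1/2}}$$ 

entailing the linear PDE 

$$\varphi_t = i\,\varphi_{xx}$$ 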
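Of the three examples above, the Liouville equation (58a) lends itself to a particularly quick check. The sketch below assumes the general solution in the form u = f(x) + g(t) - 2 log{a ∫ exp(f) dx' + (2a)^(-1) ∫ exp(g) dt'} and picks the illustrative functions f(x) = x and g(t) = -t with x_0 = t_0 = 0, for which both integrals are elementary; the constant `A` (standing for a) and the sample points are arbitrary. The mixed derivative is approximated by a four-point central stencil.

```python
import math

A = 0.7  # the arbitrary constant a (illustrative value)

def u(x, t):
    """Assumed general solution with f(x) = x, g(t) = -t, x0 = t0 = 0."""
    F = math.exp(x) - 1.0      # integral of exp(f) from 0 to x
    G = 1.0 - math.exp(-t)     # integral of exp(g) from 0 to t
    return x - t - 2.0 * math.log(A * F + G / (2.0 * A))

def residual(x, t, h=1e-3):
    """u_xt - exp(u), with the mixed derivative from a four-point stencil."""
    u_xt = (u(x + h, t + h) - u(x + h, t - h)
            - u(x - h, t + h) + u(x - h, t - h)) / (4.0 * h * h)
    return u_xt - math.exp(u(x, t))

# Sample points with x, t > 0, where the argument of the logarithm is positive
points = [(0.5 * i, 0.5 * j) for i in range(1, 5) for j in range(1, 5)]
max_residual = max(abs(residual(x, t)) for x, t in points)
print(max_residual)
```

The product of the two coefficients, a and (2a)^(-1), is what makes the mixed derivative of the logarithm reproduce exp(u) exactly; any other pair of coefficients with product 1/2 would serve as well.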


Thanks to the simplicity of the technique to 
solve them, C-integrable PDEs provide a conveni- 
ent tool to investigate the phenomenology asso- 
ciated with nonlinear PDEs. For instance the 
Burgers equation (5), which possesses kink-like 
solitons, is a simple nonlinear generalization of the 
heat equation; and the “relativistic invariance” of 
the Liouville equation, see (58b), makes it a 
convenient “toy model” in the context of relati- 
vistic field theory. The Eckhaus equation, (59), 
provides an interesting theoretical tool because of 
its similarity with the phenomenologically impor- 
tant NLS equation (6), as well as the fact that, 
thanks to its C-integrability, the structure of its 
solutions — which feature a remarkable solitonic 
zoology, including the possibility of “anelastic” 
solitonic reactions — can be studied in considerable 
detail, entailing an understanding of why such 
anelastic reactions are unlikely to be featured by 
solutions obtained in the context of the 
initial-value problem. 

C-integrable PDEs are generally as well S-integrable, 
being generally associable with a spectral problem that 
can be explicitly solved; the converse, instead, is not 
generally true. Hence C-integrability represents a 
higher level of integrability than S-integrability; a 
ranking that is quite useful in spite of its lack of strict 
cogency caused by the possibility to consider also the 
transformation from a function to its spectral trans- 
form as a change of (dependent) variable. 

The Lax approach, described in some detail above 
in the context of finite-dimensional integrable 
dynamical systems, was in fact originally invented 
in the context of integrable PDEs. For instance the 
KdV equation (35) corresponds to the (operator) 
Lax equation (to be compared with the matrix Lax 
equation (14)) 

$$L_t = [L, M]$$ 

where now the Schrödinger operator L is defined by 
(36) (so that $L_t = u_t(x,t)$) and the operator M is 
defined as follows: 

$$M = -4\,\frac{\partial^3}{\partial x^3} + 6u(x,t)\,\frac{\partial}{\partial x} + 3u_x(x,t)$$ 
Closely connected with this approach is the AKNS 
method (due to M. J. Ablowitz, D. J. Kaup, A. C. 
Newell and H. Segur), based on the observation that 
the KdV equation (35) coincides with the 
integrability condition 

$$\psi_{xxt} = \psi_{txx}$$ [60] 

for the following pair of linear PDEs (the first of 
which is just the eigenvalue equation for the 
Schrödinger operator L, see (36)) satisfied by the 
function $\psi(x,k,t)$: 

$$\psi_{xx} = \bigl[u(x,t) - k^2\bigr]\psi$$ [61a] 

$$\psi_t = \bigl[-u_x(x,t) + 4ik^3\bigr]\psi + 2\bigl[u(x,t) + 2k^2\bigr]\psi_x$$ [61b] 

and, more generally, that every equation of the 
class (33) coincides with the integrability condition 
(60) for the eigenvalue equation (61a) and the 
equation 

$$\psi_t = a(x,k,t)\,\psi + b(x,k,t)\,\psi_x$$ [61c] 

with an appropriate choice of the two functions 
a(x,k,t) and b(x,k,t). Indeed this ansatz, (61c), 
with a(x,k,t) and b(x,k,t) low-order polynomials 
in k, provides a quite straightforward technique to 
identify the simpler equations of the class (33); ditto 
for the extension of this approach based on more 
general eigenvalue problems than (61a). 
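The computation behind this statement is short enough to be sketched explicitly. The worked derivation below assumes that (61a) reads ψ_xx = [u - k^2]ψ and that (61b) corresponds to the choice a = -u_x + 4ik^3, b = 2(u + 2k^2) in (61c); under these assumptions the compatibility condition (60) produces the evolution equation u_t = -u_xxx + 6 u u_x.

```latex
% Differentiate (61a) with respect to t, and (61c) twice with respect to x,
% each time using (61a) to eliminate \psi_{xx}:
\begin{align*}
\psi_{xxt} &= u_t\,\psi + (u-k^2)\,\bigl(a\,\psi + b\,\psi_x\bigr),\\
\psi_{txx} &= \bigl[a_{xx} + 2b_x\,(u-k^2) + b\,u_x + a\,(u-k^2)\bigr]\psi
            + \bigl[2a_x + b_{xx} + b\,(u-k^2)\bigr]\psi_x .
\end{align*}
% Equating the coefficients of \psi_x and of \psi in (60) gives
\begin{align*}
2a_x + b_{xx} &= 0, &
u_t &= a_{xx} + 2b_x\,(u-k^2) + b\,u_x .
\end{align*}
% With a = -u_x + 4ik^3 and b = 2(u+2k^2):
\begin{align*}
2a_x + b_{xx} &= -2u_{xx} + 2u_{xx} = 0,\\
u_t &= -u_{xxx} + 4u_x\,(u-k^2) + 2\,(u+2k^2)\,u_x = -u_{xxx} + 6\,u\,u_x .
\end{align*}
```

The dependence on k cancels in the last line, which is what allows a k-independent evolution equation for u to emerge; the constant term 4ik^3 in a plays no role in the compatibility condition.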

Another powerful approach suitable to identify 
and investigate integrable PDEs is the so-called 
“dressing method” (introduced by V. E. Zakharov 
and A. B. Shabat and pursued by many others), in 
which one starts again (as in the approach leading to 
C-integrable equations) from an easily solvable 
evolution equation and then performs transforma- 
tions (less elementary than just a change of 
variables) that modify (“dress”) the original equa- 
tion, obtaining thereby new (nontrivial and interest- 
ing) evolution equations, the integrability of which 
hinges on the control one has on the (dressing) 
transformation relating (both ways) the solutions of 
the new equations with those of the original 
equation. Of course many specific techniques are 
accommodated within this (admittedly vague) 
description; we must confine our remarks here to 
noting the crucial role that the Riemann-Hilbert 
problem generally plays in this context (indeed the 
Riemann-Hilbert problem also lies at the core of the 
solvability of the inverse spectral problem, although 
techniques not explicitly relying on it are also 
available). 

Algorithmic approaches, particularly suitable to 
manufacture multisolitonic solutions and to identify 
nonlinear PDEs that are integrable inasmuch as they 
feature such solutions, were already developed at the 
beginning of the 1970s. The pioneer of this approach 
was R. Hirota; less than a decade later a 
more sophisticated and general development - the 
so-called “tau-function” method - was invented 
by M. Sato and his pupils/collaborators. 

Finally let us mention that many remarkable 
connections exist among integrable PDEs and 
integrable finite-dimensional dynamical systems 
such as those discussed above; for instance the 
time-evolution (generally taking place in the complex 
plane) of the poles of rational solutions of 
certain integrable PDEs obeys the equations of 
motion of integrable dynamical systems interpretable 
as many-body problems. 


Why are certain nonlinear PDEs both integrable 
and widely applicable? 


Several integrable PDEs play a key role in various 
applicative contexts, justifying the question figuring 
as title of this subsection. A metamathematical but 
enlightening, and heuristically quite useful, reply to 
this question reads as follows. 

Consider as starting point a large class of 
nonlinear PDEs, and associate to it via some kind of 
asymptotic limit procedure a single nonlinear 
PDE, to which it is then justified to attribute a 
certain universal character. If this procedure 
corresponds to a physically (or, more generally, 
applicatively) significant limit, it stands to reason that this 
universal PDE play a role in several applicative 
contexts (because the original class of PDEs, being 
large, certainly contains several equations of appli- 
cative relevance). And if the limit procedure is in 
some sense asymptotically exact, and it therefore 
preserves the property of integrability, it is also 
likely that this universal PDE be integrable, because 
for this it is sufficient that the original, large class of 
PDEs contain just one integrable PDE. 

For instance most phenomena characterized by a 
dominant dispersive plane wave in a weakly non- 
linear context can be shown, via an asymptotically 
exact multiscale expansion, to be modeled by the 
Nonlinear Schrödinger equation (6), the solution of 
which provides then the evolution, in appropriately 
rescaled “slow” and “coarse-grained” time and 
space variables, of the amplitude modulation of the 
dominant dispersive wave. This explains why this 
nonlinear PDE plays a key role in so many, disparate 
applicative contexts, and it also implies, in the light 
of the above argument, its integrability. 

The reasoning outlined above is quite robust, 
and it allows one to infer that, if instead the universal 
limit equation is not integrable, then the large class 
of PDEs from which it originates cannot contain 
any integrable equation, providing thereby the 
point of departure to obtain (quite useful) 
necessary conditions for integrability. Indeed these 
conditions are adequate to distinguish among 
different levels of integrability, for instance between 
C-integrability and S-integrability, with the 
Eckhaus equation (59) playing in this context a 
role for C-integrable PDEs somewhat analogous to 
that played by the Nonlinear Schrödinger equation 
(6) for S-integrable PDEs. 


Outlook 


Many more important developments than could be 
covered in this overview have occurred in the last 
few decades; for these we refer to the books listed 
below (and there are many more), and to the 
literature cited there. 

Let us end this entry by emphasizing that both the 
study of integrable systems and its application to 
phenomenologically interesting situations (including 
technological innovations, for instance in nonlinear 
optics and telecommunications) are still at the 
forefront of current research, although perhaps the 
“heroic era” of this field of study is over. 


See also: Abelian Higgs Vortices; Backlund 
Transformations; Bethe Ansatz; Bifurcations of Periodic 
Orbits; Bi-Hamiltonian Methods in Soliton Theory; 
Billiards in Bounded Convex Domains; Boundary-Value 
Problems for Integrable Equations; Breaking Water 
Waves; Calogero—Moser—Sutherland Systems of 
Nonrelativistic and Relativistic Type; Cauchy Problem for 
Burgers-type Equations; Cellular Automata; Classical 
r-Matrices, Lie Bialgebras, and Poisson Lie Groups; 
∂̄-Approach to Integrable Systems; Einstein Equations: 
Exact Solutions; Functional Equations and Integrable 
Systems; Ginzburg-Landau Equation; Hamiltonian 
Systems: Obstructions to Integrability; Holonomic 
Quantum Fields; Instantons: Topological Aspects; 
Integrability and Quantum Field Theory; Integrable 
Discrete Systems; Integrable Systems and Algebraic 
Geometry; Integrable Systems and Discrete Geometry; 
Integrable Systems and the Inverse Scattering Method; 
Integrable Systems in Random Matrix Theory; Inverse 
Problem in Classical Mechanics; Isochronous Systems; 
Isomonodromic Deformations; Integrable Systems and 
Recursion Operators on Symplectic and Jacobi 
Manifolds; Korteweg-de Vries Equation and Other 
Modulation Equations; Multi-Hamiltonian Systems; 
Nonlinear Schrödinger Equations; Ordinary Special 
Functions; Painlevé Equations; Peakons; q-Special 
Functions; Quantum Calogero-Moser Systems; 
Quantum n-Body Problem; Random Matrix Theory in 
Physics; Recursion Operators in Classical Mechanics; 
Riemann-Hilbert Methods in Integrable Systems; 
Riemann-Hilbert Problem; Separation of Variables for 
Differential Equations; Sine-Gordon Equation; Solitons 
and Kac-Moody Lie Algebras; Solitons and Other 
Extended Field Configurations; Twistors; Toda Lattices; 
Vortex Dynamics; WDVV Equations and Frobenius 
Manifolds; Yang-Baxter Equations. 


Further Reading 


Ablowitz MJ and Clarkson PA (1991) Solitons, Nonlinear 
Evolution Equations and Inverse Scattering. Cambridge: 
Cambridge University Press. 

Ablowitz MJ and Segur H (1981) Solitons and the Inverse 
Scattering Transform. Philadelphia: SIAM. 

Babelon O, Bernard D, and Talon M (2003) Introduction to 
Classical Integrable Systems. Cambridge: Cambridge 
University Press. 

Bullough RK and Caudrey PJ (eds.) (1980) Solitons. Heidelberg: 
Springer. 

Calogero F (ed.) (1978) Nonlinear Evolution Equations Solvable 
by the Spectral Transform. London: Pitman. 

Calogero F (2001) Classical Many-Body Problems Amenable to 
Exact Treatments. Heidelberg: Springer. 

Calogero F and Degasperis A (1982) Spectral Transform and 
Solitons. I. Amsterdam: North Holland. 

Dodd RK, Eilbeck JC, Gibbon JD, and Morris HC (1982) 
Solitons and Non-linear Wave Equations. New York: Aca- 
demic Press. 

Faddeev LD and Takhtajan LA (1987) Hamiltonian Methods in 
the Theory of Solitons. Heidelberg: Springer. 




Hoppe J (1992) Lectures on Integrable Systems. Heidelberg: 
Springer. 

Konopelchenko GB (1987) Nonlinear Integrable Equations. 
Heidelberg: Springer. 

Novikov SP, Manakov SV, Pitaevskii LP, and Zakharov VE 
(1984) Theory of Solitons: the Inverse Scattering Method. 
New York: Plenum Press. 

Moser J (ed.) (1975) Dynamical Systems, Theory and Applica- 
tions. Heidelberg: Springer. 


Moser J (1981) Integrable Hamiltonian Systems and Spectral 
Theory. Pisa: Scuola Normale Superiore. 

Perelomov AM (1990) Integrable Systems of Classical Mechanics 
and Lie Algebras. Basel: Birkhäuser. 

Toda M (1981) Theory of Nonlinear Lattices. Heidelberg: Springer. 

van Diejen JF and Vinet L (eds.) (2000) Calogero-Moser-Suther- 
land Models. Heidelberg: Springer. 

Zakharov VE (ed.) (1991) What is Integrability? Heidelberg: 
Springer. 


Interacting Particle Systems and Hydrodynamic Equations 


C Landim, IMPA, Rio de Janeiro, Brazil 
and UMR 60885, Université de Rouen, France 

© 2006 Elsevier Ltd. All rights reserved. 


Introduction 


We present the theory of hydrodynamic behavior of 
interacting particle systems in the context of exclu- 
sion processes, in which no more than one particle 
per site is allowed. 

Denote by $T_N = \mathbb{Z}/N\mathbb{Z}$ the discrete torus with N 
points and let $T_N^d = (T_N)^d$. The state space $E_N = 
\{0,1\}^{T_N^d}$ consists of all configurations obtained by 
distributing particles on the discrete torus $T_N^d$, 
respecting the exclusion rule which prevents more than one 
particle per site. The configurations are denoted by the 
Greek letter $\eta$, so that $\eta(x)$ is equal to 0 or 1 if site 
$x \in T_N^d$ is vacant or occupied for the configuration $\eta$. 

Denote by $\{\tau_x : x \in \mathbb{Z}^d\}$ the group of translations 
in $E_N$: $(\tau_x \eta)(z) = \eta(x + z)$ for each x, z in $T_N^d$. Here 
and below summations are performed modulo N. A 
function $f : \{0,1\}^{\mathbb{Z}^d} \to \mathbb{R}$ with finite support is called 
a cylinder function. 

Fix a family of non-negative cylinder functions 
$c_j$, $1 \le j \le d$. Let $c_{x,x+e_j}(\eta) = c_j(\tau_x \eta)$ and consider the 
Markov process $\{\eta_t : t \ge 0\}$ on $E_N$ with generator $L_N$ 
given by 

$$(L_N f)(\eta) = \sum_{j=1}^{d} \sum_{x \in T_N^d} c_{x,x+e_j}(\eta)\, \bigl[f(\sigma^{x,x+e_j}\eta) - f(\eta)\bigr]$$ [1] 

Here, $\{e_1, \ldots, e_d\}$ stands for the canonical basis of $\mathbb{R}^d$ 
and $\sigma^{x,y}\eta$ for the configuration obtained from $\eta$ by 
exchanging the occupation variables $\eta(x)$ and $\eta(y)$: 

$$(\sigma^{x,y}\eta)(z) = \begin{cases} \eta(z) & \text{if } z \ne x, y \\ \eta(y) & \text{if } z = x \\ \eta(x) & \text{if } z = y \end{cases}$$ [2] 

In this dynamics at each bond $\{x, x+e_j\}$ the 
occupation variables $\eta(x)$, $\eta(x+e_j)$ are exchanged 
at rate $c_{x,x+e_j}(\eta)$. This happens simultaneously and 
independently at each bond. 


Notice that the total number of particles is 
conserved by the dynamics since only exchanges are 
allowed. Denote by $\Sigma_{N,K}$ ($0 \le K \le |T_N^d|$) the 
hyperplane of all configurations $\eta$ of $E_N$ with K particles. 
Assume that the rates $c_j$ are nondegenerate for $\eta_t$ to 
be an irreducible Markov process on each $\Sigma_{N,K}$. 

For $0 \le \alpha \le 1$, denote by $\nu_\alpha^N$ the Bernoulli 
product measure of parameter $\alpha$ on $E_N$. Under $\nu_\alpha^N$ 
the variables $\{\eta(x), x \in T_N^d\}$ are independent, with 
marginals given by 

$$\nu_\alpha^N\{\eta(x) = 1\} = \alpha = 1 - \nu_\alpha^N\{\eta(x) = 0\}$$ 

Assume that the measures $\nu_\alpha^N$, $0 \le \alpha \le 1$, are 
stationary for the Markov process $\eta_t$. An elementary 
computation shows that this is the case if each function 
$c_j$ does not depend on $\eta(0)$, $\eta(e_j)$, in which case the 
process is in fact reversible with respect to $\nu_\alpha^N$. 

Let $\mathcal{M}_+(\mathbb{T}^d)$ be the space of finite positive 
measures on the torus $\mathbb{T}^d$ endowed with the 
weak topology. For each configuration $\eta$, let 
$\pi^N = \pi^N(\eta, du)$ be the positive measure on $\mathbb{T}^d$ 
obtained by assigning mass $N^{-d}$ to each particle: 

$$\pi^N = N^{-d} \sum_{x \in T_N^d} \eta(x)\, \delta_{x/N}(du)$$ [3] 

where $\delta_u$ stands for the Dirac measure on u. The 
measure $\pi^N$ is called the empirical measure 
associated to the configuration $\eta$. The integral of a 
continuous function $G : \mathbb{T}^d \to \mathbb{R}$ with respect to $\pi^N$ 
is denoted by 

$$\langle \pi^N, G \rangle = N^{-d} \sum_{x \in T_N^d} G(x/N)\, \eta(x)$$ 

Fix a density profile $\rho_0 : \mathbb{T}^d \to [0,1]$. A sequence 
of probability measures $\mu^N$ on $E_N$ is said to be 
associated to $\rho_0$ if $\pi^N$ converges in probability to 
$\rho_0(u)\,du$ under $\mu^N$: 

$$\lim_{N \to \infty} \mu^N \Bigl\{ \Bigl| \langle \pi^N, G \rangle - \int_{\mathbb{T}^d} G(u)\, \rho_0(u)\, du \Bigr| > \delta \Bigr\} = 0$$ 

for all continuous functions $G : \mathbb{T}^d \to \mathbb{R}$ and all $\delta > 0$. 
For a continuous profile $\rho_0$ consider, for instance, the 
product measure $\nu^N_{\rho_0(\cdot)}$ on $E_N$ whose marginals are 
given by 

$$\nu^N_{\rho_0(\cdot)}\{\eta(x) = 1\} = \rho_0(x/N)$$ 

It is easy to check that the sequence of probability 
measures $\nu^N_{\rho_0(\cdot)}$ is associated to $\rho_0$. 
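The check mentioned in the last sentence can be sketched in dimension d = 1. Under the product measure the occupation variables are independent Bernoulli variables, so the mean of ⟨π^N, G⟩ is a Riemann sum and its variance is of order 1/N; Chebyshev's inequality then bounds the probability of a deviation larger than δ. The profile `rho0`, the test function `G` and the helper names below are illustrative assumptions; for these particular choices the limiting integral equals 1/3.

```python
import math

def rho0(u):
    """Illustrative density profile on the unit torus with values in [0, 1]."""
    return 0.5 + 0.4 * math.sin(2.0 * math.pi * u)

def G(u):
    """Illustrative continuous (periodic) test function."""
    return math.cos(2.0 * math.pi * u) ** 2 + u * (1.0 - u)

def mean_and_variance(N):
    """Exact mean and variance of <pi^N, G> under the product measure, d = 1."""
    mean = sum(G(x / N) * rho0(x / N) for x in range(N)) / N
    var = sum(G(x / N) ** 2 * rho0(x / N) * (1.0 - rho0(x / N))
              for x in range(N)) / N ** 2
    return mean, var

def chebyshev_bound(N, delta, target):
    """Upper bound on the probability that |<pi^N, G> - target| > delta."""
    mean, var = mean_and_variance(N)
    slack = delta - abs(mean - target)
    return 1.0 if slack <= 0.0 else min(1.0, var / slack ** 2)

target = 1.0 / 3.0  # integral of G * rho0 over [0, 1] for the choices above
for N in (100, 1000, 10000):
    print(N, mean_and_variance(N), chebyshev_bound(N, 0.1, target))
```

The bound decreases like 1/N for fixed δ, which is the convergence in probability required by the definition; no sampling is needed because both moments are available in closed form for a product measure.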

Denote by $W_{x,x+e_j}$ the instantaneous current of 
particles from x to $x + e_j$. This is the rate at which a 
particle jumps from x to $x + e_j$ minus the rate at 
which a particle jumps from $x + e_j$ to x: 

$$W_{x,x+e_j} = \{\eta(x) - \eta(x + e_j)\}\, c_{x,x+e_j}(\eta)$$ 

Suppose that the mean value of the current vanishes 
under all stationary states $\nu_\alpha^N$. This indicates that the 
average displacement of each particle vanishes in the 
mean. In particular, in view of the central limit 
theorem, to observe an evolution of the density in 
the macroscopic scale, a diffusive rescaling of time is 
needed. On the other hand, if there is a net flux of 
particles, the evolution has to be examined in the 
Euler scale tN. 

Denote by $\theta(N)$ the time rescaling: $N^2$ if the mean 
displacement of particles vanishes and N otherwise. 
For each probability measure $\mu^N$ on $E_N$, let $P_{\mu^N}$ be 
the probability measure on the path space 
$D(\mathbb{R}_+, E_N)$ induced by $\mu^N$ and the Markov process 
$\eta_t$ speeded up by $\theta(N)$. Expectation with respect to 
$P_{\mu^N}$ is denoted by $E_{\mu^N}$. 

Denote by $\pi_t^N(du) = \pi^N(\eta_t, du)$ the empirical 
measure at time t. Fix a density profile $\rho_0 : \mathbb{T}^d \to [0,1]$ 
and a sequence of probability measures $\mu^N$ on 
$E_N$ associated to $\rho_0$. The goal of the theory of 
hydrodynamic limit of interacting particle systems is to 
show that for each $t > 0$, $\pi_t^N$ converges, as $N \uparrow \infty$, to 
a deterministic path $\pi(t, du) = \rho(t,u)\,du$ whose density 
$\rho$ is the solution of some partial differential equation, 
called the hydrodynamic equation. 

The main tools available are entropy production 
and Dirichlet forms. Denote by $H_N(\mu^N | \nu^N)$ the 
entropy of a probability measure $\mu^N$ on $E_N$ with 
respect to a reference probability measure $\nu^N$: 

$$H_N(\mu^N | \nu^N) = \sup_f \Bigl\{ \int_{E_N} f\, d\mu^N - \log \int_{E_N} e^{f}\, d\nu^N \Bigr\}$$ 

where the supremum is carried over all functions 
$f : E_N \to \mathbb{R}$. 
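On a finite state space the supremum in this definition is attained at f = log(dμ/dν), where it equals the familiar sum Σ μ log(μ/ν). The sketch below illustrates this on a three-point space standing in for E_N; the two probability vectors and the trial functions are illustrative, and the helper names are not taken from the text.

```python
import math

def entropy_explicit(mu, nu):
    """Relative entropy sum_i mu_i log(mu_i / nu_i), with 0 log 0 = 0."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0.0)

def functional(f, mu, nu):
    """Quantity inside the supremum: integral of f dmu - log integral of e^f dnu."""
    linear = sum(fi * m for fi, m in zip(f, mu))
    log_moment = math.log(sum(math.exp(fi) * n for fi, n in zip(f, nu)))
    return linear - log_moment

mu = [0.5, 0.3, 0.2]   # illustrative probability vectors on a three-point space
nu = [0.2, 0.2, 0.6]

f_star = [math.log(m / n) for m, n in zip(mu, nu)]   # maximizer log(dmu/dnu)
H = entropy_explicit(mu, nu)
trials = [[0.0, 0.0, 0.0], [1.0, -1.0, 0.5], [2.0, 1.0, -3.0],
          [f + 0.3 for f in f_star]]
print(H, functional(f_star, mu, nu), [functional(f, mu, nu) for f in trials])
```

Adding a constant to f leaves the functional unchanged, which is why the last trial function also reaches the supremum.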

It follows from the general theory of Markov 
processes that the entropy of the state of the process 
with respect to an invariant state decreases in time. 
The rate at which the entropy decreases 
can be estimated by the Dirichlet form: let $S_t^N$ be the 
semigroup associated to the generator $L_N$ defined in 
[1] speeded up by $\theta(N)$. An elementary computation 
gives that 

$$H_N(\mu^N S_t^N | \nu_\alpha^N) + 2\theta(N) \int_0^t ds\, I^N(\mu^N S_s^N) \le H_N(\mu^N | \nu_\alpha^N)$$ 

Here, $I^N(\mu^N)$ is the convex and lower 
semicontinuous functional given by 

$$I^N(\mu^N) = -\bigl\langle \sqrt{f},\, L_N \sqrt{f} \bigr\rangle_{\nu_\alpha^N}$$ 

where f stands for the Radon-Nikodym derivative 
$d\mu^N/d\nu_\alpha^N$ and $\langle \cdot, \cdot \rangle_{\nu_\alpha^N}$ for the scalar product in 
$L^2(\nu_\alpha^N)$. 

Therefore, if the initial state $\mu^N$ has entropy with 
respect to a reference measure $\nu_\alpha^N$ bounded by $C_0 N^d$, 
by convexity of $I^N$, 

$$N^{-d} H_N(\mu^N S_t^N | \nu_\alpha^N) + 2t\,\theta(N)\, N^{-d}\, I^N\Bigl( t^{-1} \int_0^t \mu^N S_s^N\, ds \Bigr) \le C_0$$ [4] 

for all $t > 0$. This elementary estimate plays a 
fundamental role in the following sections. 


The Entropy Method 


Consider an exclusion process with generator given
by [1]. Fix $T > 0$, a density profile $\rho_0 : \mathbb{T}^d \to [0,1]$
and a sequence of probability measures $\mu^N$ associated
to $\rho_0$. Let $Q_{\mu^N}$ be the measure on the path
space $D([0, T], \mathcal{M}_+(\mathbb{T}^d))$ induced by the process $\pi^N_t$
and the initial state $\mu^N$.

To prove that $\pi^N_t$ converges to $\rho(t,u)\,du$ in
probability, we first show that the sequence $Q_{\mu^N}$
converges to the probability measure $Q^*$ concentrated
on the deterministic trajectory $\rho(t,u)\,du$,
whose density is the solution of some partial
differential equation with initial condition $\rho_0$. It
follows from this result and general arguments that
$\pi^N_t$ converges to $\rho(t,u)\,du$ for each $0 \le t \le T$.

To prove that $Q_{\mu^N}$ converges to $Q^*$, assume that
we are able to prove tightness of the sequence $Q_{\mu^N}$.
Since there is at most one particle per site, all limit
points $Q^*$ of the sequence $Q_{\mu^N}$ are concentrated on
trajectories $\pi(t, du) = \rho(t, u)\,du$, which are absolutely
continuous with respect to the Lebesgue measure.

To characterize the limit points $Q^*$, fix a smooth
function $G : \mathbb{T}^d \to \mathbb{R}$ and consider the martingale

$$M^{G,N}_t = \langle \pi^N_t, G \rangle - \langle \pi^N_0, G \rangle - \int_0^t \theta(N)\, L_N \langle \pi^N_s, G \rangle\, ds \qquad [5]$$

An elementary computation of its quadratic variation
shows that $M^{G,N}_t$ vanishes in $L^2(\mathbb{P}_{\mu^N})$ as $N \uparrow \infty$.

Denote by $\mathcal{C}_0$ the space of cylinder functions
which have zero mean with respect to all invariant
states $\nu^N_\alpha$. Assume that the currents $W_{0,e_j}$, $1 \le j \le d$,
belong to $\mathcal{C}_0$ so that a diffusive scaling $\theta(N) = N^2$ is
in force. Notice that

$$L_N \eta(x) = \sum_{j=1}^d \{ W_{x-e_j,x} - W_{x,x+e_j} \}$$

In particular, after a summation by parts, the
integral term on the right-hand side of [5] can be
written as

$$\int_0^t N^{1-d} \sum_{j=1}^d \sum_{x \in \mathbb{T}^d_N} (\nabla^N_j G)(x/N)\, W_{x,x+e_j}(s)\, ds \qquad [6]$$

where $(\nabla^N_j G)(x/N) = N\{G((x + e_j)/N) - G(x/N)\}$.
Notice that this sum is in principle of order $N$.

To illustrate the entropy method, consider the
symmetric simple exclusion process obtained by
taking $c_j = 1/2$ in [1] and observe that the current
is $W_{0,e_j} = (1/2)\{\eta(0) - \eta(e_j)\}$. A second summation by
parts allows us to rewrite the martingale [5] as

$$M^{G,N}_t = \langle \pi^N_t, G \rangle - \langle \pi^N_0, G \rangle - \frac{1}{2} \int_0^t \langle \pi^N_s, \Delta_N G \rangle\, ds$$

where $\Delta_N$ is the discrete Laplacian.

Since the martingale $M^{G,N}_t$ vanishes in $L^2(\mathbb{P}_{\mu^N})$
as $N \uparrow \infty$, all limit points $Q^*$ are concentrated on
weak solutions of the linear heat equation. It remains
to recall that there is a unique weak solution of the
Cauchy problem for the heat equation to conclude
that the sequence $Q_{\mu^N}$ converges to $Q^*$, the
measure concentrated on the deterministic path
$\pi_t(du) = \rho(t,u)\,du$ whose density $\rho$ is the solution of
the heat equation with initial condition $\rho_0$.
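
For the symmetric simple exclusion process the mean occupation $E[\eta_t(x)]$ solves a closed linear equation, namely the discrete heat equation speeded up by $N^2$, which makes the convergence to the heat equation easy to observe numerically. The sketch below is only an illustration in dimension one: the profile, the value of $N$ and the explicit Euler time step are our own choices, and no particle system is simulated.

```python
import math

def evolve_mean_density(rho, t, n_sites):
    """Explicit Euler for d/dt rho = (N^2 / 2) * (discrete Laplacian) on the ring."""
    dt = 1.0 / (4 * n_sites ** 2)          # well inside the stability region
    steps = int(round(t / dt))
    for _ in range(steps):
        rho = [
            rho[x] + dt * 0.5 * n_sites ** 2
            * (rho[(x + 1) % n_sites] + rho[x - 1] - 2.0 * rho[x])
            for x in range(n_sites)
        ]
    return rho

N = 64
t = 0.05
# Hypothetical initial profile rho_0(u) = 1/2 + (1/4) cos(2 pi u), sampled at u = x/N.
rho0 = [0.5 + 0.25 * math.cos(2 * math.pi * x / N) for x in range(N)]
rho_t = evolve_mean_density(rho0, t, N)

# Solution of d_t rho = (1/2) d_u^2 rho with the same initial profile.
decay = math.exp(-2 * math.pi ** 2 * t)
exact = [0.5 + 0.25 * decay * math.cos(2 * math.pi * x / N) for x in range(N)]
max_error = max(abs(a - b) for a, b in zip(rho_t, exact))
```

Already for $N = 64$ the discrete profile and the solution of the heat equation differ by much less than $10^{-3}$ in the supremum norm.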

The symmetric simple exclusion process has the
very special property that the martingale $M^{G,N}_t$ can
be written as a function of the empirical measure.
This is not the case for all the other models, for 
which a further argument is needed to close eqn [5] 
in terms of the empirical measure. 

To present the additional arguments needed,
assume that $c_j(\eta) = 1 + [\eta(-e_j) + \eta(2e_j)]$. In this
case, the current $W_{0,e_j}$ is equal to

$$\{\eta(0) - \eta(e_j)\} + \{\eta(0)\eta(-e_j) - \eta(e_j)\eta(2e_j)\} + \{\eta(0)\eta(2e_j) - \eta(-e_j)\eta(e_j)\}$$

A second summation by parts in [6] allows us to
rewrite the martingale as

$$\langle \pi^N_t, G \rangle - \langle \pi^N_0, G \rangle - \int_0^t N^{-d} \sum_{j=1}^d \sum_{x \in \mathbb{T}^d_N} (\partial^2_{u_j} G)(x/N)\, \tau_x h(\eta_s)\, ds + o_N(1) \qquad [7]$$

where $h(\eta) = \eta(0) + 2\eta(0)\eta(-e_j) - \eta(0)\eta(2e_j)$. The
remainder $o_N(1)$ appears because we replaced discrete
space derivatives by continuous ones.

In contrast with the symmetric simple exclusion
process, the martingale $M^{G,N}_t$ defined in [5] is not a
function of the empirical measure and an argument
is needed to close the equation.

For each positive integer $\ell$ and $d$-dimensional
integer $x$, denote by $\eta^\ell(x)$ the empirical density of
particles in a box of length $2\ell + 1$ centered at $x$:

$$\eta^\ell(x) = \frac{1}{(2\ell + 1)^d} \sum_{|y - x| \le \ell} \eta(y)$$

For a cylinder function $h : \mathcal{E}_N \to \mathbb{R}$, let $\tilde h(\alpha)$ be the
expected value of $h$ with respect to the invariant
state $\nu^N_\alpha$: $\tilde h(\alpha) = E_{\nu^N_\alpha}[h(\eta)]$. For $\ell \ge 1$ and a cylinder
function $h$, let

$$V_\ell(\eta) = \Big| \frac{1}{(2\ell + 1)^d} \sum_{|y| \le \ell} \tau_y h(\eta) - \tilde h(\eta^\ell(0)) \Big|$$

Theorem 1 Consider a sequence of probability
measures $\mu^N$ on $\mathcal{E}_N$ such that $I_N(\mu^N) \le C_0 N^{d-2}$ for
some $0 < \alpha < 1$ and some finite constant $C_0$. Then,

$$\limsup_{\varepsilon \to 0}\, \limsup_{N \to \infty}\, E_{\mu^N}\Big[ N^{-d} \sum_{x \in \mathbb{T}^d_N} \tau_x V_{\varepsilon N}(\eta) \Big] = 0$$

This statement, due to Guo et al. (1988), permits
the replacement of a local function $h$ by a function
of the density of particles over a macroscopic cube.
It is the main step in the proof of the hydrodynamic
behavior of gradient systems, defined below, and its
proof can be found in Kipnis and Landim (1999,
chapter 5).
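
To get a feeling for the quantity $V_\ell$, one can evaluate it on a single configuration drawn from a Bernoulli product measure. The sketch below is a toy illustration and not the theorem itself, which concerns general measures with a bounded Dirichlet form: the dimension is one, the cylinder function is $h(\eta) = \eta(0)\eta(1)$, for which $\tilde h(\alpha) = \alpha^2$, and the density, system size and random seed are arbitrary choices of ours.

```python
import random

def block_average(values, center, ell):
    """Average of values[y] over |y - center| <= ell, with periodic indexing."""
    n = len(values)
    window = range(center - ell, center + ell + 1)
    return sum(values[y % n] for y in window) / (2 * ell + 1)

random.seed(0)
n_sites, alpha, ell = 20001, 0.3, 5000
eta = [1 if random.random() < alpha else 0 for _ in range(n_sites)]

# Translations of the cylinder function h(eta) = eta(0) * eta(1).
tau_h = [eta[y] * eta[(y + 1) % n_sites] for y in range(n_sites)]

# V_ell(eta) = | block average of tau_y h  -  h~(eta^ell(0)) |, with h~(a) = a^2.
v_ell = abs(block_average(tau_h, 0, ell) - block_average(eta, 0, ell) ** 2)
```

For a box of this size `v_ell` is small, which is the content of the replacement in the simplest equilibrium situation.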

Assume that the sequence $\mu^N$ has entropy with
respect to a reference invariant state $\nu^N_\alpha$ bounded by
$C_0 N^d$ for some finite constant $C_0$. It follows from
[4] that the sequence of measures $t^{-1} \int_0^t ds\, \mu^N S^N_s$
satisfies the assumptions of Theorem 1. Therefore,
due to the presence of the time integral, we may
replace the cylinder function $h$ in [7] by $\tilde h(\eta^{\varepsilon N}(x))$.
Since $\eta^{\varepsilon N}(0)$ can be written as $\langle \pi^N, \iota_\varepsilon \rangle$, where
$\iota_\varepsilon = (2\varepsilon)^{-d}\, \mathbf{1}\{[-\varepsilon, \varepsilon]^d\}$, we now have expressed the
martingale [5] in terms of the empirical measure.

Repeating the arguments presented for the symmetric
simple exclusion process, we may conclude
that all limit points $Q^*$ of the sequence $Q_{\mu^N}$ are
concentrated on paths $\pi_t(du) = \rho(t,u)\,du$, whose
density $\rho$ is a weak solution of the parabolic
equation

$$\partial_t \rho = \Delta(\rho + \rho^2), \qquad \rho(0, \cdot) = \rho_0(\cdot)$$

because $\tilde h(\alpha) = \alpha + \alpha^2$ for $h(\eta) = \eta(0) + 2\eta(0)\eta(-e_j) - \eta(0)\eta(2e_j)$.
It remains to show the uniqueness of
weak solutions of this differential equation to
conclude.
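
The algebra behind this gradient example can be verified by brute force in dimension one. The sketch below rests on an assumption of ours, namely that the current over the bond $(0, e_1)$ is $c(\eta)\{\eta(0) - \eta(e_1)\}$ with $c(\eta) = 1 + \eta(-e_1) + \eta(2e_1)$. It checks that this current splits into three differences and that the cylinder function $h$ has mean $\alpha + \alpha^2$ under the Bernoulli product measure of density $\alpha$.

```python
from itertools import product

SITES = (-1, 0, 1, 2)
CONFIGS = [dict(zip(SITES, occ)) for occ in product((0, 1), repeat=len(SITES))]

def rate(eta):
    """c(eta) = 1 + eta(-1) + eta(2)."""
    return 1 + eta[-1] + eta[2]

def current(eta):
    """Assumed current over the bond (0, 1): c(eta) * (eta(0) - eta(1))."""
    return rate(eta) * (eta[0] - eta[1])

def three_pieces(eta):
    """The current written as a sum of three differences."""
    return ((eta[0] - eta[1])
            + (eta[0] * eta[-1] - eta[1] * eta[2])
            + (eta[0] * eta[2] - eta[-1] * eta[1]))

def h(eta):
    """h(eta) = eta(0) + 2 eta(0) eta(-1) - eta(0) eta(2)."""
    return eta[0] + 2 * eta[0] * eta[-1] - eta[0] * eta[2]

def bernoulli_mean(func, alpha):
    """Expectation of func under the product Bernoulli(alpha) measure."""
    total = 0.0
    for eta in CONFIGS:
        weight = 1.0
        for x in SITES:
            weight *= alpha if eta[x] else 1.0 - alpha
        total += weight * func(eta)
    return total

decomposition_holds = all(current(eta) == three_pieces(eta) for eta in CONFIGS)
mean_h = bernoulli_mean(h, 0.3)
mean_current = bernoulli_mean(current, 0.3)
```

Here `mean_h` equals $0.3 + 0.3^2$, and `mean_current` vanishes, as it must for a function in the zero-mean class of cylinder functions.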

The second integration by parts in [6] was possible 
because the currents could be written as the difference 
of local functions and their translations, a very special 
property not shared by most interacting particle 
systems. Processes with this attribute are called 
gradient systems. 


Nongradient Models 


Consider an exclusion process with rates $c_j(\eta) = 1 + \eta(-e_j)$,
in which case the current is given by

$$W_{0,e_j} = \{\eta(0) - \eta(e_j)\} + \{\eta(0) - \eta(e_j)\}\,\eta(-e_j)$$

a cylinder function in $\mathcal{C}_0$.

Fix $T > 0$, a density profile $\rho_0 : \mathbb{T}^d \to [0,1]$ and a
sequence of probability measures $\mu^N$ associated to
$\rho_0$ and having entropy with respect to a reference
invariant state $\nu^N_\alpha$ bounded by $C_0 N^d$ for some finite
constant $C_0$. Recall the definition of the sequence of
measures $Q_{\mu^N}$, assumed to be tight.

To characterize the limit points of $Q_{\mu^N}$, fix a
smooth function $G : \mathbb{T}^d \to \mathbb{R}$ and examine the
martingale $M^{G,N}_t$ introduced in [5]. After an
integration by parts, the integral term of the
martingale becomes [6]. While a second integration
by parts is possible for the first part of the current,
$\eta(0) - \eta(e_j)$, the second piece remains

$$\int_0^t N^{1-d} \sum_{j=1}^d \sum_{x \in \mathbb{T}^d_N} (\nabla^N_j G)(x/N)\, \tau_x w_j(\eta_s)\, ds \qquad [8]$$

where $w_j = \{\eta(0) - \eta(e_j)\}\,\eta(-e_j)$. Notice the extra
factor $N$ multiplying the sum and that $w_j$ belongs
to $\mathcal{C}_0$. The next result and Theorem 4 are due to
Varadhan (1994).


Theorem 2 Consider a sequence of probability
measures $\mu^N$ on $\mathcal{E}_N$ such that $H_N(\mu^N|\nu^N_\alpha) \le C_0 N^d$
for some $0 < \alpha < 1$ and some finite constant $C_0$. Fix a
smooth function $G : \mathbb{T}^d \to \mathbb{R}$ and a cylinder function
$V$ in $\mathcal{C}_0$. There exists a seminorm $\|\cdot\|_\alpha$ such that

$$\limsup_{N \to \infty}\, \mathbb{E}_{\mu^N}\Big[ \Big| \int_0^T ds\, N^{1-d} \sum_{x \in \mathbb{T}^d_N} G(x/N)\, \tau_x V(\eta_s) \Big|^2 \Big] \le C_0 T\, \|G\|_2^2\, \sup_{0 \le \alpha \le 1} \|V\|_\alpha^2 \qquad [9]$$

The explicit form of the seminorm $\|\cdot\|_\alpha$ can be
found in Kipnis and Landim (1999, chapter 7). The
proof of Theorem 2 requires a sharp estimate on the
spectral gap of the generator $L_N$. Denote by $\Lambda_\ell$
the cube $\{-\ell, \ldots, \ell\}^d$ and by $L_{\Lambda_\ell}$ the restriction of
the generator $L_N$ to the cube $\Lambda_\ell$, obtained by
suppressing all jumps from $\Lambda_\ell$ (resp. $\Lambda_\ell^c$) to $\Lambda_\ell^c$
(resp. $\Lambda_\ell$). For $0 \le K \le |\Lambda_\ell|$, let $\nu_{\Lambda_\ell,K}$ be the uniform
measure on the configurations of $\{0,1\}^{\Lambda_\ell}$ with $K$
particles. The following estimate is needed in the
proof of Theorem 2:


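Theorem 3 There exists a finite constant $C_0$ such
that

$$\langle f, f \rangle_{\nu_{\Lambda_\ell,K}} \le C_0\, \ell^2\, \langle f, (-L_{\Lambda_\ell}) f \rangle_{\nu_{\Lambda_\ell,K}}$$

for all $\ell \ge 1$, $0 \le K \le |\Lambda_\ell|$ and zero-mean function $f$
in $L^2(\nu_{\Lambda_\ell,K})$.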


This result is due to Quastel (1992) for symmetric 
simple exclusion processes. Yau developed a general 
method to prove sharp estimates for the spectral gap 
of the generator for conservative dynamics (see Lu 
and Yau (1993) and Yau (1997)). 
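
The $\ell^2$ scaling of the spectral gap can be seen explicitly in the simplest case of a single particle in dimension one. The sketch below is a toy computation under assumptions of ours: one particle on the segment of $2\ell + 1$ sites, jump rate $1/2$ across each bond and no jumps out of the segment. For this walk the cosine mode used below is the slowest-decaying eigenfunction, so the ratio $\langle f, f \rangle / (\ell^2 \langle f, (-L) f \rangle)$ it produces is the best constant in the inequality for $K = 1$.

```python
import math

def variance_and_dirichlet(f):
    """<f, f> and <f, (-L) f> under the uniform measure on a segment.

    One particle, jump rate 1/2 across each bond, reflecting ends.
    The function f is assumed to have zero mean up to rounding.
    """
    n = len(f)
    mean = sum(f) / n
    variance = sum((v - mean) ** 2 for v in f) / n
    dirichlet = sum(0.5 * (f[x + 1] - f[x]) ** 2 for x in range(n - 1)) / n
    return variance, dirichlet

def slowest_mode(n):
    """Eigenfunction cos(pi (x + 1/2) / n) of the reflected walk on n sites."""
    return [math.cos(math.pi * (x + 0.5) / n) for x in range(n)]

ratios = []
for ell in (2, 4, 8, 16, 32):
    var, dirichlet = variance_and_dirichlet(slowest_mode(2 * ell + 1))
    ratios.append(var / (ell ** 2 * dirichlet))
```

The entries of `ratios` stay bounded as $\ell$ grows and approach $8/\pi^2$, so a constant $C_0$ independent of $\ell$ suffices in this special case.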

Since the parallelogram identity is easy to check,
by polarization we can define a semi-inner product
$\langle\langle \cdot, \cdot \rangle\rangle_\alpha$ from the seminorm $\|\cdot\|_\alpha$. Denote by $\mathcal{H}_\alpha$ the
Hilbert space induced by $\mathcal{C}_0$ and the semi-inner
product $\langle\langle \cdot, \cdot \rangle\rangle_\alpha$.

Denote by $L$ the generator [1] extended to $\mathbb{Z}^d$.
Notice that $Lf$ belongs to $\mathcal{C}_0$ for any cylinder
function $f$, and that the gradients $\eta(e_j) - \eta(0)$ and
the currents $w_j$, $1 \le j \le d$, also belong to $\mathcal{C}_0$. The
next result states that all functions in $\mathcal{H}_\alpha$ can be
written as a linear combination of gradients and
cylinder functions in the image of the generator.


Theorem 4 Denote by $L\mathcal{C}_0$ the space $\{Lg : g \in \mathcal{C}_0\}$.
For each $0 \le \alpha \le 1$,

$$\mathcal{H}_\alpha = \overline{L\mathcal{C}_0} \oplus \{\eta(e_j) - \eta(0) : 1 \le j \le d\}$$

In particular, there exists a matrix $\{D_{i,j}(\alpha) : 1 \le i,j \le d\}$
and a sequence of functions $\{f_{i,k}(\alpha, \cdot) \in \mathcal{C}_0 : k \ge 1\}$,
$1 \le i \le d$, for which

$$w_i + \sum_{j=1}^d D_{i,j}(\alpha)\{\eta(e_j) - \eta(0)\} - L f_{i,k}(\alpha, \cdot)$$

vanishes in $\mathcal{H}_\alpha$ as $k \uparrow \infty$. For reversible systems (and
more generally for generators satisfying a sector
condition), it can be shown that the sequence of
local functions $f_{i,k}(\alpha, \eta)$ can be taken independent of
$\alpha$: $f_{i,k}(\alpha, \eta) = f_{i,k}(\eta)$. Moreover, with a little extra
effort, one obtains a bound uniform in $\alpha$:

$$\inf_{f \in \mathcal{C}_0}\, \sup_{0 \le \alpha \le 1}\, \Big\| w_i + \sum_{j=1}^d D_{i,j}(\alpha)\{\eta(e_j) - \eta(0)\} - Lf \Big\|_\alpha = 0 \qquad [10]$$

This estimate together with some algebraic relations
in $\mathcal{H}_\alpha$ give a variational formula for the matrix $D_{i,j}$:
for every vector $v$ in $\mathbb{R}^d$,

$$v \cdot D(\alpha)\, v = \cdots \qquad [11]$$

It can also be shown that the matrix $D$ is continuous
and strictly elliptic.

We may now complete the proof of the hydrodynamic
behavior. Recall that the main difficulty
was to express formula [8] in terms of the empirical
measure. Fix $1 \le j \le d$ and consider a sequence of
cylinder functions $\{f_{j,k} : k \ge 1\}$ satisfying [10] asymptotically
as $k \uparrow \infty$. Adding and subtracting the
expression $\sum_{i=1}^d D_{j,i}(\eta^{\varepsilon N}(0))\{\eta^{\varepsilon N}(e_i) - \eta^{\varepsilon N}(0)\} - L f_{j,k}$,
[8] becomes the sum of three terms.

The first one is just the expression which appears
inside the expectation in [9] with $\nabla^N_j G$ in place of $G$ and $V$
given by

$$w_j + \sum_{i=1}^d D_{j,i}(\eta^{\varepsilon N}(0))\{\eta^{\varepsilon N}(e_i) - \eta^{\varepsilon N}(0)\} - L f_{j,k}$$

Since the sequence of measures $\mu^N$ satisfies the
assumptions of Theorem 2, a modification of the
proof of this theorem, to take into account
the dependence of $V$ on $N$ and $\varepsilon$, shows that the limit
of the expectation of the absolute value of the first
term in the decomposition, as $N \uparrow \infty$ and then $\varepsilon \downarrow 0$,
is bounded by

$$C_0 T\, \|\partial_{u_j} G\|_2^2\, \sup_{0 \le \alpha \le 1} \|V_{j,k}\|_\alpha^2$$

where

$$V_{j,k} = w_j + \sum_{i=1}^d D_{j,i}(\alpha)\{\eta(e_i) - \eta(0)\} - L f_{j,k}$$

By [10], the penultimate expression vanishes as $k \uparrow \infty$.
The second term in the decomposition is

$$\int_0^t ds\, N^{1-d} \sum_{j=1}^d \sum_{x \in \mathbb{T}^d_N} (\nabla^N_j G)(x/N)\, \tau_x L f_{j,k}(\eta_s)$$

The presence of the generator $L$ and the diffusive
rescaling of time make it possible to show that the expectation
of the absolute value of this expression is of
order $N^{-1}$ for each fixed $k$.

Finally, the third term is equal to

$$-\int_0^t ds\, N^{1-d} \sum_{j,k=1}^d \sum_{x \in \mathbb{T}^d_N} (\nabla^N_j G)(x/N)\, D_{j,k}(\eta^{\varepsilon N}_s(x))\, \{\eta^{\varepsilon N}_s(x + e_k) - \eta^{\varepsilon N}_s(x)\}$$

A second integration by parts is now possible and
one obtains that the previous expression is equal to

$$\int_0^t ds\, N^{-d} \sum_{j,k=1}^d \sum_{x \in \mathbb{T}^d_N} (\partial^2_{u_j,u_k} G)(x/N)\, d_{j,k}(\eta^{\varepsilon N}_s(x)) + o_N(1)$$

where $d'_{j,k} = D_{j,k}$. We have already seen in the
derivation of the hydrodynamic equation for gradient
systems that this sum can be expressed as a
function of the empirical measure. Since all limit
points are concentrated on paths $\pi_t(du)$ which are
absolutely continuous, this integral converges to

$$\sum_{j,k=1}^d \int_0^t ds \int_{\mathbb{T}^d} du\, (\partial^2_{u_j,u_k} G)(u)\, d_{j,k}(\rho(s, u))$$

Since the martingale [5] vanishes, all limit points
are concentrated on trajectories $\pi_t(du) = \rho(t, u)\,du$
which are weak solutions of

$$\partial_t \rho = \sum_{j,k=1}^d \partial_{u_j} \big\{ [\delta_{j,k} + D_{j,k}(\rho)]\, \partial_{u_k} \rho \big\}$$

where $D$ is the strictly elliptic and continuous matrix
given by the variational formula [11]. Here, the
identity matrix $\delta_{j,k}$ comes from the first piece of the
current, which permitted a second integration by
parts. A uniqueness result of weak solutions of the
Cauchy problem with initial condition $\rho_0$ concludes
the proof of the hydrodynamic behavior of this
nongradient system.


Hyperbolic Equations 


Consider the asymmetric simple exclusion process
obtained by setting $c_j(\eta) = \eta(0)[1 - \eta(e_j)]$ in formula
[1]. Notice that the current $W_{0,e_j} = \eta(0)[1 - \eta(e_j)]$
has mean $\alpha(1 - \alpha)$ with respect to the invariant state
$\nu^N_\alpha$, suggesting the Euler rescaling of time $\theta(N) = N$.

Let $\le$ be the partial order on $\mathcal{E}_N$ defined by $\eta \le \xi$
if $\eta(x) \le \xi(x)$ for every $x$ in $\mathbb{T}^d_N$. The asymmetric
exclusion process is attractive: there exists a
stochastic evolution on $\mathcal{E}_N \times \mathcal{E}_N$ with the following
two properties: (1) it preserves the order, in the
sense that $\eta_t \le \xi_t$ for all $t \ge 0$ if $\eta_0 \le \xi_0$ and (2) each
coordinate evolves according to the original asymmetric
exclusion dynamics. This coupling, which
may be constructed by letting particles jump
together as much as possible, is the main tool in
the derivation of the hydrodynamic equation of
asymmetric processes.
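
The order-preserving coupling is simple to simulate. The sketch below is a discrete-time caricature under assumptions of ours: a one-dimensional ring, jumps to the right only, and random sequential updates in which both configurations use the same randomly chosen bond, so that particles jump together whenever the exclusion rule allows it.

```python
import random

def coupled_step(eta, xi, rng):
    """Update both configurations with the same randomly chosen bond (x, x+1)."""
    n = len(eta)
    x = rng.randrange(n)
    y = (x + 1) % n
    for config in (eta, xi):
        if config[x] == 1 and config[y] == 0:    # exclusion rule
            config[x], config[y] = 0, 1

rng = random.Random(1)
n = 50
eta = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
# xi dominates eta: keep every particle of eta and add some more.
xi = [1 if (e == 1 or rng.random() < 0.4) else 0 for e in eta]
particles_eta, particles_xi = sum(eta), sum(xi)

order_preserved = True
for _ in range(20000):
    coupled_step(eta, xi, rng)
    if any(e > z for e, z in zip(eta, xi)):
        order_preserved = False
        break
```

The flag `order_preserved` remains true throughout the run, and each coordinate on its own conserves its number of particles, as a single exclusion process does.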

Fix a smooth function $G : \mathbb{T}^d \to \mathbb{R}$ and recall
definition [5] of the martingale $M^{G,N}_t$. An elementary
computation shows that the quadratic variation
of this martingale vanishes as $N \uparrow \infty$. On the other
hand, after an integration by parts, the integral term
of the martingale becomes

$$\int_0^t N^{-d} \sum_{j=1}^d \sum_{x \in \mathbb{T}^d_N} (\nabla^N_j G)(x/N)\, \eta_s(x)\, [1 - \eta_s(x + e_j)]\, ds$$

Assume that the state of the process at any
macroscopic time $s$ is close to a product measure
associated to some profile $\rho(s, \cdot)$. Since the martingale
vanishes asymptotically, taking expectations in
[5], we obtain that the density profile should be a
weak solution of the quasilinear hyperbolic
equation

$$\partial_t \rho + \sum_{j=1}^d \partial_{u_j} F(\rho) = 0 \qquad [12]$$

where $F(\alpha) = \alpha(1 - \alpha)$.

It is well known that solutions of this equation
may develop shocks even if the initial profile $\rho_0(\cdot)$ is
smooth and that there is no uniqueness of weak
solutions. Several criteria have been introduced to
select the relevant solution among the weak solutions.
Kruzkov (1970), for instance, in the case
where the density profile $\rho_0 : \mathbb{T}^d \to \mathbb{R}$ is bounded,
proved that there exists a unique measurable
function $\rho$ which satisfies the entropy condition

$$\partial_t |\rho - c| + \sum_{j=1}^d \partial_{u_j} |F(\rho) - F(c)| \le 0 \qquad [13]$$

in the sense of distributions on $(0, \infty) \times \mathbb{T}^d$, for
every $c \in \mathbb{R}$, and which converges to the initial
condition in $L^1(\mathbb{T}^d)$ as $t \downarrow 0$: $\lim_{t \to 0} \|\rho_t - \rho_0\|_1 = 0$.
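
The formation of shocks from smooth data can be explored with a standard finite-difference scheme. The sketch below is an illustration only and is not part of the probabilistic argument: it applies the Lax-Friedrichs scheme, a monotone scheme under the CFL condition, to the one-dimensional equation $\partial_t \rho + \partial_u F(\rho) = 0$ with $F(\rho) = \rho(1 - \rho)$ on the unit torus. The initial profile, grid size and final time are our own choices.

```python
import math

def flux(rho):
    """F(rho) = rho (1 - rho)."""
    return rho * (1.0 - rho)

def lax_friedrichs(rho, dx, dt, steps):
    """Lax-Friedrichs scheme for d_t rho + d_u F(rho) = 0 on the unit torus."""
    n = len(rho)
    lam = dt / dx
    for _ in range(steps):
        rho = [
            0.5 * (rho[(i + 1) % n] + rho[i - 1])
            - 0.5 * lam * (flux(rho[(i + 1) % n]) - flux(rho[i - 1]))
            for i in range(n)
        ]
    return rho

n = 200
dx = 1.0 / n
dt = 0.5 * dx                      # CFL condition: |F'(rho)| <= 1 on [0, 1]
rho0 = [0.5 + 0.4 * math.sin(2 * math.pi * i * dx) for i in range(n)]
rho = lax_friedrichs(rho0, dx, dt, steps=400)    # evolve up to time t = 1

mass0 = sum(rho0) * dx
mass = sum(rho) * dx
```

The total mass is conserved and the density stays between the extreme values of the initial profile, while the smooth sine wave steepens into a discontinuity well before time $t = 1$.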

Fix $T > 0$ and a density profile $\rho_0 : \mathbb{T}^d \to [0,1]$.
To couple the original process with another one
starting from a different initial state, we need to
impose the initial distribution to be of product form.
Consider, therefore, a sequence of "product" probability
measures $\mu^N$ associated to $\rho_0$ and recall the
definition of the sequence of measures $Q_{\mu^N}$ given in
the section "The entropy method," assumed to be
tight.

We have to prove that all limit points are
concentrated on entropy solutions of [12]. Coupling
the original process $\eta_t$ with another one, denoted by
$\xi_t$, starting from the Bernoulli product measure with
density $\alpha$, and examining the time evolution of
$\sum_{x \in \mathbb{T}^d_N} |\eta_t(x) - \xi_t(x)|$, we derive an entropy
inequality at the microscopic level: let $\bar\mu^N$ be a
sequence of probability measures on the product
space $\mathcal{E}_N \times \mathcal{E}_N$ whose first coordinate is $\mu^N$.
Denote by $\bar{\mathbb{P}}_{\bar\mu^N}$ the measure on the path space
$D([0, T], \mathcal{E}_N \times \mathcal{E}_N)$ induced by $\bar\mu^N$ and the coupling
informally described at the beginning of this section.
Rezakhanlou (1991) proved the following theorem:


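Theorem 5 For every smooth positive function $H$
with compact support in $(0, \infty) \times \mathbb{T}^d$ and every
$\varepsilon > 0$,

$$\lim_{\ell \to \infty}\, \lim_{N \to \infty}\, \bar{\mathbb{P}}_{\bar\mu^N}\Big[ \int_0^\infty dt\, N^{-d} \sum_{x \in \mathbb{T}^d_N} \Big\{ (\partial_t H)(t, x/N)\, |\eta^\ell_t(x) - \xi^\ell_t(x)| + \sum_{j=1}^d (\partial_{u_j} H)(t, x/N)\, |F(\eta^\ell_t(x)) - F(\xi^\ell_t(x))| \Big\} \ge -\varepsilon \Big] = 1$$

If we now assume that the second coordinate $\xi_t$ is
initially distributed according to the stationary state
$\nu^N_\alpha$, it is not difficult to replace $\xi^\ell_t(x)$ in the above
formula by $\alpha$, obtaining a microscopic version of the
entropy inequality.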

In the one-dimensional nearest-neighbor case, by
coupling arguments, we may replace the average
$\eta^\ell(0)$ over a large microscopic box by an average
$\eta^{\varepsilon N}(0)$ over a small macroscopic box, deriving the
entropy inequality [13]. To conclude the proof it
remains to show, by means of coupling arguments
again, that the density profile at time $t$ converges in
$L^1(\mathbb{T}^d)$ to the initial condition as $t \downarrow 0$.

In higher dimensions or in the one-dimensional
non-nearest-neighbor case, it has not been proved
that the replacement of $\eta^\ell(0)$ by $\eta^{\varepsilon N}(0)$ is allowed. One
is thus forced to consider measure-valued solutions
of eqn [12]. Details can be found in Kipnis and
Landim (1999, chapter 8).


Relative Entropy Method 


The relative entropy method, due to Yau (1991), is 
based on the analysis of the time evolution of the 
entropy of the state of the process with respect to 
the product measure associated to the solution of the 
hydrodynamic equation. 

While the entropy method requires uniqueness of 
weak solutions and proves the existence of weak 
solutions, the relative entropy method requires the 
existence of a smooth solution and proves the 
uniqueness of such smooth solutions. 

Consider the exclusion process with rates $c_j(\eta) = 1 + [\eta(-e_j) + \eta(2e_j)]$.
We have seen that the hydrodynamic
equation of this model is given by the
nonlinear parabolic equation

$$\partial_t \rho = \Delta[\rho + \rho^2] \qquad [14]$$

Fix a profile $\rho_0 : \mathbb{T}^d \to [0,1]$ bounded away from
0 and 1: $0 < \delta \le \rho_0(u) \le 1 - \delta$. Let $\rho(t,u)$ be the
solution of the hydrodynamic equation [14] with
initial condition $\rho_0$ and denote by $\nu^N_{\rho(t,\cdot)}$ the product
measure with slowly varying parameter associated
to the profile $\rho(t, \cdot)$:

$$\nu^N_{\rho(t,\cdot)}\{\eta : \eta(x) = 1\} = \rho(t, x/N), \quad \text{for } x \in \mathbb{T}^d_N$$


Theorem 6 Let $\{\mu^N : N \ge 1\}$ be a sequence of
probability measures on $\mathcal{E}_N$ whose entropy with
respect to $\nu^N_{\rho_0(\cdot)}$ is of order $o(N^d)$:

$$H_N(\mu^N | \nu^N_{\rho_0(\cdot)}) = o(N^d)$$

Then, the relative entropy of the state of the process
at the macroscopic time $t$ with respect to $\nu^N_{\rho(t,\cdot)}$ is
also of order $o(N^d)$:

$$H_N(\mu^N S^N_t | \nu^N_{\rho(t,\cdot)}) = o(N^d) \quad \text{for every } t \ge 0$$
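
To see why $o(N^d)$ is the natural scale, note that the entropy of one product measure with slowly varying parameter with respect to another is a sum of single-site Bernoulli entropies, so it is of order $N^d$ as soon as the two profiles differ. The sketch below illustrates this in dimension one; the two profiles `rho_a` and `rho_b` are hypothetical choices of ours.

```python
import math

def bernoulli_entropy(p, q):
    """Relative entropy of Bernoulli(p) with respect to Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def product_relative_entropy(profile_a, profile_b, n_sites):
    """H(nu_a | nu_b) for product measures with marginals profile(x / N)."""
    return sum(
        bernoulli_entropy(profile_a(x / n_sites), profile_b(x / n_sites))
        for x in range(n_sites)
    )

def rho_a(u):
    return 0.5 + 0.2 * math.cos(2 * math.pi * u)

def rho_b(u):
    return 0.5 + 0.1 * math.cos(2 * math.pi * u)

# Entropy per site for two system sizes, and for identical profiles.
per_site = {n: product_relative_entropy(rho_a, rho_b, n) / n for n in (100, 1000)}
same_profile = product_relative_entropy(rho_a, rho_a, 100)
```

The entropy per site is essentially independent of $N$ and strictly positive for distinct profiles, and it vanishes when the profiles coincide. An entropy of order $o(N^d)$ therefore forces the state to follow the profile $\rho(t, \cdot)$.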


It is not difficult to deduce from this result a 
strong version of the hydrodynamic limit behavior 
of the interacting particle system: 


Corollary 1 Under the assumptions of the theorem,
for every cylinder function $\Psi$ and every continuous
function $H : \mathbb{T}^d \to \mathbb{R}$,

$$\lim_{N \to \infty} E_{\mu^N S^N_t}\Big[ \Big| N^{-d} \sum_{x \in \mathbb{T}^d_N} H(x/N)\, \tau_x \Psi(\eta) - \int_{\mathbb{T}^d} H(u)\, \tilde\Psi(\rho(t, u))\, du \Big| \Big] = 0$$

with $\tilde\Psi(\alpha)$ the expected value of $\Psi$ under $\nu^N_\alpha$, as above.

The relative entropy method can be extended to
nongradient systems and to asymmetric processes,
whose macroscopic evolution is described by quasilinear
hyperbolic equations, up to the first shock.

The hydrodynamic behavior of an interacting
particle system corresponds to a law of large
numbers for the empirical measure. The central
limit theorem is well understood in equilibrium, but
remains to this date an important open question in
nonequilibrium. The large deviations for diffusive
systems have also been investigated, as well as the
hydrodynamic behavior of systems in contact with
reservoirs. The Navier-Stokes equations have been
derived as a correction of the hydrodynamic
equation of asymmetric particle systems. We refer
to Kipnis and Landim (1999) for further details.


See also: Boltzmann Equation (Classical and Quantum); 
Bose-Einstein Condensates; Breaking Water Waves; 
Fourier Law; Interacting Stochastic Particle Systems; 
Macroscopic Fluctuations and Thermodynamic 
Functionals; Multi-Scale Approaches.


Interacting Stochastic Particle Systems


Introduction


According to the basic principles of mechanics, the 
motion of atoms and molecules is governed, in the 
semiclassical approximation, by the deterministic 
Hamiltonian equations of motion. While all evi- 
dence points in this direction, for many problems 
this Hamiltonian approach is so complicated that it 
hardly yields any useful results. A simple example
is a suspension of many (10?) polystyrene balls (size 1 μm)
immersed in water. The Hamiltonian description 
would have to deal with the degrees of freedom of 
all the fluid molecules and all the polystyrene balls. 
Clearly, a more useful approach is to collect the 
incessant bombardment of a polystyrene ball by 
water molecules into a stochastic force acting on the 
ball with postulated statistical properties. For 
example, following Einstein, one could regard 
successive collisions as independent and occurring 
after an exponentially distributed waiting time. In 
addition to such stochastic forces, the polystyrene 
balls are charged and interact with each other 
through the screened Coulomb force. 

On the one-particle level, stochastic models have a 
long tradition within statistical physics. A considerable
part of the classical theory of Markov processes is the
mathematical response to this type of description.
The aspect of interaction is more recent. Its origin can
be traced back to the Metropolis algorithm in early
computer simulations (around 1953). It was recognized
that the Hamiltonian dynamics is a rather slow tool
to statistically sample the Gibbs equilibrium distribution
$Z^{-1} \exp[-H/k_B T]$. A more efficient route is to
devise a stochastic algorithm which has as its unique
stationary measure the Gibbs distribution. Such
schemes are now known as Markov chain Monte
Carlo and are in extremely wide use, not only in
statistical physics but also in quantum chromodynamics
(QCD) and other quantum field theories. The
time appearing in the stochastic algorithm has no 
physical significance; it merely counts how often a 
certain operation is performed. 

The second clearly identifiable push toward the 
use of interacting stochastic particle systems came 
from the study of critical dynamics. Close to a point 
of second-order phase transition, the equilibrium 
properties are very effectively handled by means of 
statistical field theories. Thus, it was natural to
search for an extension into the time domain, which
then led to time-dependent Ginzburg-Landau theories,
where now time refers to physical time. These
are interacting stochastic models, where one keeps 
only a few basic fields, together with their behavior 
under time reversal, their vector character, and 
whether they are dynamically conserved or not. 

In probability theory, interacting stochastic particle 
systems date back to the seminal papers by M Kac in 
1956 and independently by R L Dobrushin and by 
F Spitzer in 1970. Spitzer was motivated by spin-flip 
and spin-exchange dynamics, while Dobrushin had 
the vision of many locally interacting components. In 
the early days, one of the prime goals was the 
construction of the stochastic process in infinite 
volume, an enterprise which had important mathe- 
matical spin-off, for example, the theory of Dirichlet 
forms on function spaces. Physical models offer a rich 
menu to the probabilist, but there is also considerable 
input from other areas. To give just one example: in 
queueing theory one considers queues in series, that 
is, a customer served at one counter immediately 
moves on to the next one. If one regards as field the 
number of customers at each counter, one has an 
interacting stochastic particle system, the interaction 
being mediated through the servers. 

This article is split into two sections. In the first 
one, we list and explain a few prototypical interact- 
ing stochastic particle systems. Of course, the list is 
hardly exhaustive and we restrict ourselves from the 
outset to models from statistical physics. In the 
second part, we summarize prominent lines of recent 
research. Again the wealth of material is over- 
whelming and we draw the line according to the 
rules of mathematical physics. 


Model Systems 


Our list is determined by the intrinsic mathematical 
properties of the stochastic particle system. Alter- 
natively, a classification is possible according to the 
physical system, which would, however, be less 
transparent for our purposes. We restrict ourselves 
to models with only position-like degrees of free- 
dom, but if needed velocity-like fields may be 
included. The most basic distinction is the behavior 
under time reversal. A model is called (statistically) 
“time reversible” if a particular history and its time- 
reversed image have the same probability. Techni- 
cally, one imposes this through the condition of 
detailed balance. Nonreversible systems are much 
less explored, but currently a very active area of 
research. 


Reversible Models 


1. Spin-flip, Glauber dynamics. One considers
spins attached to the sites of a regular lattice,
which for simplicity we take as the hypercubic
lattice $\mathbb{Z}^d$. The spin at site $x \in \mathbb{Z}^d$ is denoted by
$\sigma_x = \pm 1$ and the whole spin configuration is
denoted by $\sigma$. Thus, the state space of the Markov
process is $\Omega = \{-1, 1\}^{\mathbb{Z}^d}$. Spin configurations
evolve in time through random spin flips, that is,
through a change from $\sigma_x$ to $-\sigma_x$ according to
configuration-dependent rates $c_x(\sigma)$. The rate $c_x(\sigma)$ is local,
in the sense that it depends only on the spins close
to $x$, and is translation invariant, that is, if $\tau_y$ is
the shift by $y$, then $c_{x+y}(\tau_y \sigma) = c_x(\sigma)$. If the current
spin configuration is $\sigma(t)$, then after a short
time $dt$

$$\sigma_x(t + dt) = \begin{cases} \sigma_x(t) & \text{with probability } 1 - c_x(\sigma(t))\, dt \\ -\sigma_x(t) & \text{with probability } c_x(\sigma(t))\, dt \end{cases}$$

The update is performed independently at each
lattice site. Technically, it is more concise to specify
the generator, $L$, of the Markov process. It acts on
local functions $f : \Omega \to \mathbb{R}$ and is given by

$$Lf(\sigma) = \sum_{x \in \mathbb{Z}^d} c_x(\sigma)\big( f(\sigma^x) - f(\sigma) \big) \qquad [1]$$

where $\sigma^x$ denotes the configuration $\sigma$ with the spin
at site $x$ reversed. The transition probability from
the configuration $\sigma$ to the configuration $\sigma'$ in time
$t \ge 0$ is given by the matrix element $(e^{Lt})_{\sigma\sigma'}$ of the
Markov semigroup $e^{Lt}$.

To impose time reversibility, one needs an energy
function $H(\sigma)$ constructed according to the rules of
equilibrium statistical mechanics. The condition of
detailed balance then reads

$$c_x(\sigma) = c_x(\sigma^x)\, e^{-\beta [H(\sigma^x) - H(\sigma)]} \qquad [2]$$

with $\beta = 1/k_B T$ the inverse temperature. Note that
on the right only energy differences appear, which
are always well defined. In finite volume the
unique invariant measure is the Gibbs measure
$Z^{-1} e^{-\beta H}$.
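
Detailed balance is a condition that can be checked mechanically for any proposed set of rates. The sketch below is an illustration under assumptions of ours: a nearest-neighbour Ising energy on a small ring and Metropolis rates $\min(1, e^{-\beta \Delta H})$, one admissible choice among many. It enumerates all configurations and measures the largest violation of the detailed balance condition.

```python
import math
from itertools import product

def energy(sigma, coupling=1.0):
    """Nearest-neighbour Ising energy on a ring: H = -J sum_x s_x s_{x+1}."""
    n = len(sigma)
    return -coupling * sum(sigma[x] * sigma[(x + 1) % n] for x in range(n))

def flip(sigma, x):
    """Configuration sigma with the spin at site x reversed."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def metropolis_rate(sigma, x, beta):
    """One admissible flip rate: min(1, exp(-beta * [H(sigma^x) - H(sigma)]))."""
    delta = energy(flip(sigma, x)) - energy(sigma)
    return min(1.0, math.exp(-beta * delta))

def detailed_balance_defect(n_sites, beta):
    """Largest violation of c_x(s) = c_x(s^x) exp(-beta [H(s^x) - H(s)])."""
    worst = 0.0
    for sigma in product((-1, 1), repeat=n_sites):
        for x in range(n_sites):
            flipped = flip(sigma, x)
            lhs = metropolis_rate(sigma, x, beta)
            rhs = metropolis_rate(flipped, x, beta) * math.exp(
                -beta * (energy(flipped) - energy(sigma)))
            worst = max(worst, abs(lhs - rhs))
    return worst

defect = detailed_balance_defect(n_sites=4, beta=0.7)
```

The value of `defect` is zero up to rounding, so these rates define a reversible spin-flip dynamics with the Gibbs measure as invariant measure.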

2. Spin-exchange, Kawasaki dynamics, stochastic
lattice gases. We model particles hopping on the
lattice $\mathbb{Z}^d$ and switch to the occupation variables $\eta_x$,
where $\eta_x = 0$ stands for site $x$ empty and $\eta_x = 1$
stands for site $x$ occupied. The state space is
$\Omega = \{0, 1\}^{\mathbb{Z}^d}$. Since the number of particles is conserved,
the basic dynamical process is a random
jump of a particle from $x$ to a nearby site $y$,
provided $\eta_y = 0$. Therefore, we specify the exchange
rates $c_{xy}(\eta)$ between $x$ and $y$. They are local,
translation invariant and symmetric, that is,
$c_{xy}(\eta) = c_{yx}(\eta)$. The generator now reads

$$Lf(\eta) = \sum_{x,y \in \mathbb{Z}^d} c_{xy}(\eta)\big( f(\eta^{xy}) - f(\eta) \big) \qquad [3]$$

where $\eta^{xy}$ is the configuration $\eta$ with the occupancies
at sites $x$ and $y$ exchanged.

The condition of detailed balance refers to the
exchange and reads

$$c_{xy}(\eta) = c_{xy}(\eta^{xy})\, e^{-\beta [H(\eta^{xy}) - H(\eta)]} \qquad [4]$$

In [4] we can freely add to $H$ the chemical potential
term $-\mu \sum_x \eta_x$. Thus for stochastic lattice gases there is a
one-parameter family of invariant measures, labeled
by the chemical potential $\mu$.

3. Interacting Brownian motions. These motions
model, for example, suspensions as mentioned in the
"Introduction". One considers a box $\Lambda \subset \mathbb{R}^d$ containing
$N$ Brownian particles. The $j$th Brownian
particle has position $x_j \in \Lambda$. Thus, the state space of
the Markov process is $\Lambda^N$. We assume that the
Brownian particles interact through a (sufficiently
local) even pair potential $U$. Then the total potential
energy is

$$H(x) = \frac{1}{2} \sum_{i \ne j = 1}^N U(x_i - x_j) \qquad [5]$$

The dynamics of the Brownian particles is given
through the stochastic differential equations

$$dx_j(t) = -\nabla_{x_j} H(x(t))\, dt + \sqrt{2 D_0}\, dW_j(t) \qquad [6]$$

Here $W_j(t)$, $j = 1, \ldots, N$, are a collection of independent
Brownian motions and $D_0$ is the diffusion coefficient
of a single Brownian particle. Equation [6] has
to be supplemented with suitable boundary conditions
at the surface $\partial\Lambda$. Since the forces in [6] are the
gradient of a potential, time reversibility is automatically
satisfied with the invariant measure being
$Z^{-1} \exp(-H(x)/D_0)\, dx_1 \cdots dx_N$.

4. Ginzburg-Landau models. Ginzburg-Landau
models should be viewed as discretized versions of
stochastic partial differential equations. At every
lattice site $x \in \mathbb{Z}^d$, there is a real-valued field
$\phi_x \in \mathbb{R}$, a field configuration being denoted by $\phi$.
Formally, the state space is $\mathbb{R}^{\mathbb{Z}^d}$. Since the single-site
space is noncompact, some growth condition at
infinity must be imposed. Next we give ourselves an
energy, $H(\phi)$, one standard example being

$$H(\phi) = \frac{1}{2} \sum_{x,y \in \mathbb{Z}^d,\, |x - y| = 1} (\phi_x - \phi_y)^2 + \sum_{x \in \mathbb{Z}^d} V(\phi_x) \qquad [7]$$

The on-site potential increases sufficiently rapidly, so as
to make large field values unlikely. The $\phi$-field evolves
according to the set of stochastic differential equations

$$d\phi_x(t) = -\frac{\partial H}{\partial \phi_x}(\phi(t))\, dt + \sqrt{2/\beta}\, dW_x(t), \quad x \in \mathbb{Z}^d \qquad [8]$$

where $\{W_x(t), x \in \mathbb{Z}^d\}$ is a collection of independent
Brownian motions. If $V(\phi_x) = \phi_x^2$, then $\phi(t)$ is
a Gaussian field theory. To have an Ising-type phase
transition, one would have to choose $V(\phi_x) = \lambda \phi_x^2 + \phi_x^4$.

It is rather simple to modify [8] so as to incorporate
a conservation law. To each directed bond $(x, y)$,
$|x - y| = 1$, one associates the current $j_{xy} = -j_{yx}$. If
$e$ is a unit vector, $|e| = 1$, then

$$d\phi_x(t) + \sum_{e,\, |e| = 1} j_{x\,x+e}(t)\, dt = 0, \quad x \in \mathbb{Z}^d \qquad [9]$$

The current has both a deterministic part, given
through the gradient of a chemical potential, and a
random part:

$$j_{xy}(t)\, dt = -\Big( \frac{\partial H}{\partial \phi_y} - \frac{\partial H}{\partial \phi_x} \Big)(\phi(t))\, dt + \sqrt{2/\beta}\, dW_{xy}(t) \qquad [10]$$

where $W_{xy}(t) = -W_{yx}(t)$ is a collection of independent
Brownian motions labeled by nearest-neighbor
bonds. The conserved quantity is $\sum_x \phi_x$. Again, the
dynamics has a one-parameter family of stationary
measures labeled by the "magnetic field". Since in
[8] and [10] the drift is the gradient of a potential,
Ginzburg-Landau models are reversible.
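
The conservation law built into the bond currents survives time discretization exactly, which gives a convenient sanity check for a simulation. The sketch below is an Euler-Maruyama caricature under assumptions of ours: a one-dimensional ring, the energy $H = \frac{1}{2}\sum_x (\phi_x - \phi_{x+1})^2 + \sum_x \phi_x^4$ with one term per bond, and a bond noise of strength $\sqrt{2/\beta}$ chosen to match the nonconserved dynamics. The time step and number of sites are arbitrary.

```python
import math
import random

def dH(phi, x):
    """dH/dphi_x for H = 1/2 sum_x (phi_x - phi_{x+1})^2 + sum_x phi_x^4 on a ring."""
    n = len(phi)
    return (2.0 * phi[x] - phi[(x + 1) % n] - phi[x - 1]) + 4.0 * phi[x] ** 3

def conservative_step(phi, beta, dt, rng):
    """One Euler-Maruyama step of the conserved dynamics.

    For each bond (x, x+1) the amount
        j dt = -(dH/dphi_{x+1} - dH/dphi_x) dt + sqrt(2/beta) dW
    is removed from site x and added to site x+1.
    """
    n = len(phi)
    grad = [dH(phi, x) for x in range(n)]
    new_phi = list(phi)
    for x in range(n):
        y = (x + 1) % n
        noise = math.sqrt(2.0 * dt / beta) * rng.gauss(0.0, 1.0)
        transfer = -(grad[y] - grad[x]) * dt + noise
        new_phi[x] -= transfer
        new_phi[y] += transfer
    return new_phi

rng = random.Random(0)
phi = [rng.uniform(-0.5, 0.5) for _ in range(32)]
total_before = sum(phi)
for _ in range(500):
    phi = conservative_step(phi, beta=1.0, dt=1e-3, rng=rng)
total_after = sum(phi)
```

Since every transfer is subtracted at one end of a bond and added at the other, `total_after` agrees with `total_before` up to rounding, whatever the noise realization.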

5. Interface dynamics. The scalar field $\phi$ describes
the location of an interface. The energy of an
interface does not depend on its absolute displacements.
Thus, interface models are special Ginzburg-Landau
models, which have an energy $H(\phi)$
invariant under the global shift $\phi_x \to \phi_x + a$ for all
$x \in \mathbb{Z}^d$. An example is

$$H(\phi) = \sum_{x,y \in \mathbb{Z}^d,\, |x - y| = 1} V(\phi_x - \phi_y) \qquad [11]$$

with even $V$. Note that in order to have a normalizable
equilibrium measure, the interface must be
pinned somewhere.

6. Several components. For lattice gases, there may 
be several components. In a Ginzburg-Landau theory 


instead of a scalar, Ising-like field, one could consider a 
vector-valued, Heisenberg-like, field and require the 
energy to be invariant under global rotations of the field 
variables. The construction is as before and we do not 
have to repeat it. 

7. Constrained, glassy dynamics. The constraint is 
enforced by setting some of the rates equal to zero. 
For example, in the case of standard Glauber 
dynamics, one could allow for a spin-flip only if at 
least two neighboring spins have the opposite sign. 
The Gibbs measure is still invariant, but the approach 
to equilibrium will be slowed down due to the 
constraint. It may even happen that the configuration 
space splits into several invariant subsets. 

After this long and still incomplete list, let us turn 
to the nonreversible models. 


Nonreversible Models 


Mathematically, one merely has to drop the condition 
of detailed balance. To have a more concrete example, 
let $L_i$ be the generator for the Glauber dynamics
satisfying detailed balance with inverse temperature
$\beta_i$, $i = 1, 2$. Then $L = L_1 + L_2$ generates a nonreversible
dynamics provided $\beta_1 \ne \beta_2$. Physically, it corresponds
to coupling the spins to two bulk thermal reservoirs of 
different temperatures. Our example leads to a general 
point which should be noted: While reversible models 
have a wide range of physical applicability, for 
nonreversible models nonequilibrium conditions have 
to be maintained over sufficiently long time spans, 
which poses considerable difficulties experimentally. 
Thus on a theoretical level, the efforts go into exploring 
properties of, say, semirealistic models. 

Very roughly there are two broad classes of 
nonreversible models. 


Boundary-driven models We consider a finite
volume $\Lambda$. Inside $\Lambda$ the dynamics is reversible as
explained before. At the boundary $\partial\Lambda$ the system is
coupled to particle, resp. energy, reservoirs. In case the
boundary chemical potential, resp. temperature, is not
uniform, the dynamics is nonreversible. To be more
concrete let us reconsider the lattice gas discussed in
item (2) (see the discussion following eqn [2]). Inside
$\Lambda$ the generator $L_\Lambda$ is given by [3] and satisfies
detailed balance [4]. The boundary generator is

$$L_{\partial\Lambda} f(\eta) = \sum_{x \in \partial\Lambda} c_x(\eta)\,[f(\eta^x) - f(\eta)] \qquad [12]$$

where the notation is as in [1] with $\{-1,1\}$
substituted by $\{0,1\}$. $c_x(\eta)$ satisfies [2] with the
same $\beta$ as in the bulk, but a chemical potential $\mu_x$
depending on $x \in \partial\Lambda$. $\mu_x$ controls the
injection/absorption of particles at $x$. The generator for the
nonreversible dynamics is then

$$L = L_\Lambda + L_{\partial\Lambda} \qquad [13]$$


Bulk-driven models A prototype is the two- 
temperature model mentioned above. More widely 
studied is a nonconservative force acting globally. 
Here the standard example is particles moving in $\Lambda$
with periodic boundary conditions and subject to an
additional uniform force field of strength F, which 
clearly cannot be written as the gradient of a 
potential. In the case of Brownian particles, by 
changing to a comoving frame of reference, one 
would be back to the reversible case F=0. For 
lattice gases the lattice provides a fixed frame and 
the driven model has properties very different from 
the undriven one. This leads us to: 

8. Driven lattice gases. The generator $L$ is still
given by [3]. Formally, we insert in [4] instead of $H$
the Hamiltonian $H(\eta) - \sum_x (F \cdot x)\eta_x$. The exchange
rates then satisfy the condition of “local” detailed
balance as

$$c_{xy}(\eta) = c_{xy}(\eta^{xy})\, e^{-\beta(H(\eta^{xy}) - H(\eta))}\, e^{-\beta F \cdot (x-y)(\eta_x - \eta_y)} \qquad [14]$$

This means that particles preferentially jump in the
direction of $F$. On the infinite lattice the dynamics
admits two classes of stationary measures. First,
there is the Gibbs measure with particles piling up
along $F$ and formally given by

$$Z^{-1} e^{-\beta(H(\eta) - \sum_x (F \cdot x)\eta_x)} \qquad [15]$$


With respect to this measure the dynamics is 
reversible. Second, there are translation invariant 
measures with nonzero steady-state current. This 
cannot happen for reversible models. A very widely 
studied particular case is the asymmetric simple 
exclusion process for which $d = 1$, $H(\eta) = 0$, and
jumps are only to nearest-neighbor sites. 
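
The existence of a translation invariant stationary measure with nonzero current can be checked by hand on a small ring. The following is only a sketch under stated assumptions: the jump rates `p` (to the right) and `q` (to the left) stand in for the driven exchange rates with $H = 0$, and all function names are hypothetical. It enumerates every configuration of `N` particles on `L` sites, verifies that the uniform measure is stationary (inflow equals outflow for each configuration), and evaluates the mean current across one bond.

```python
from itertools import combinations

def asep_configs(L, N):
    """All configurations of N particles on a ring of L sites, as 0/1 tuples."""
    return [tuple(1 if i in occ else 0 for i in range(L))
            for occ in combinations(range(L), N)]

def transitions(eta, p, q):
    """Yield (target, rate) for every allowed nearest-neighbor jump out of eta."""
    L = len(eta)
    for x in range(L):
        y = (x + 1) % L
        if eta[x] == 1 and eta[y] == 0:        # particle jumps x -> x+1 at rate p
            new = list(eta)
            new[x], new[y] = 0, 1
            yield tuple(new), p
        if eta[x] == 0 and eta[y] == 1:        # particle jumps x+1 -> x at rate q
            new = list(eta)
            new[x], new[y] = 1, 0
            yield tuple(new), q

def stationarity_defect(L, N, p, q):
    """Largest |inflow - outflow| over all configurations, for the uniform measure."""
    configs = asep_configs(L, N)
    outflow = {eta: 0.0 for eta in configs}
    inflow = {eta: 0.0 for eta in configs}
    for eta in configs:
        for target, rate in transitions(eta, p, q):
            outflow[eta] += rate
            inflow[target] += rate
    return max(abs(inflow[e] - outflow[e]) for e in configs)

def mean_current(L, N, p, q):
    """Average particle current across the bond (0, 1) under the uniform measure."""
    configs = asep_configs(L, N)
    total = sum(p * eta[0] * (1 - eta[1]) - q * (1 - eta[0]) * eta[1]
                for eta in configs)
    return total / len(configs)

if __name__ == "__main__":
    print(stationarity_defect(6, 3, 1.0, 0.5))
    print(mean_current(6, 3, 1.0, 0.5))
```

With `p = q` the same measure is stationary but the current vanishes, which is the reversible case; any asymmetry gives a nonzero current without changing the measure.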


Items of Interest 


As there are thousands of research papers in 
mathematical physics alone, it is literally impossible 
to provide any sort of summary. On the other hand, 
the type of questions investigated are generic. Thus, 
we just explain what one would like to understand 
without paying much attention to the fractal 
boundary between “proven” and “unproven.” For 
the construction of the stochastic processes listed 
above, there is a well-developed probabilistic theory 
available. Thus, the main focus is on “qualitative
properties” of the stochastic particle system. As in
the previous section, we distinguish between reversible
and nonreversible models.


Reversible Models 


1. Equilibrium state. The most basic question 
concerns the classification of invariant measures in 
infinite volume. By construction, they are the Gibbs 
measures for the Hamiltonian appearing in the condi- 
tion of detailed balance. In principle there could be 
more, which so far has been excluded only in dimension 
1 or 2. Properties of the invariant measure belong to the 
domain of equilibrium statistical mechanics. 

Thus we can turn directly to: 

2. Spectral analysis of the generator $L$. We fix
some extreme Gibbs measure $\mu$ stationary for $L$.
By detailed balance, $e^{Lt}$ is a symmetric Markov
semigroup in $L^2(\Omega,\mu)$. Hence, $L$ is self-adjoint and
$L \le 0$. Furthermore, it has a nondegenerate eigenvalue 0.
The rate of approach to equilibrium is
determined by the spectral gap of $L$. Related are
log-Sobolev inequalities which serve as a stronger
notion. For models with a conservation law, there
is no spectral gap. Thus, the more appropriate
question is to study how fast the gap vanishes as
the volume $\Lambda$ increases. In the case of independent
components, the spectral subspaces for L are 
organized as single excitation, double excitation 
etc. Such a structure persists as the interaction is 
turned on which, on a mathematical level, is similar 
to the particle spectrum of a quantum field theory. 

Physically more directly relevant are: 

3. Spacetime correlations. To be concrete, let us
consider a Ginzburg-Landau field theory $\phi_x(t)$
starting with a translation invariant Gibbs measure
$\mu$. Then $\phi_x(t)$ is a spacetime stationary process. The
two-point correlation function is the covariance

$$\langle \phi_x(t)\phi_0(0) \rangle - \langle \phi_0(0) \rangle^2 \qquad [16]$$


Its Fourier transform is directly linked to energy- 
momentum resolved scattering intensity from a probe 
which is modeled by the respective Ginzburg-Landau 
theory. For t=0, the expression [16] is the static 
correlation, again belonging to the domain of equili- 
brium statistical mechanics. The time decay depends 
on whether the field is dynamically conserved or not. 
Correlation functions do not always capture the
physics of the system well. This is certainly true for:

4. Dynamics at low temperatures. Let us consider
the Glauber dynamics for the ferromagnetic Ising
model in the finite but large volume $\Lambda$. Then there is
a very high free energy barrier between configurations
typical for the $+$ phase and those typical for the $-$
phase. If one starts the spin system in the $+$ phase, one
may study through which configurations the system
moves to the $-$ phase and how much time such a
process will take. If the two phases are symmetric with
the external magnetic field $h = 0$, the spin system
tunnels, while for $h < 0$ and small the $+$ phase is
metastable. Another widely studied situation, also
experimentally, is the quenching from high to low 
temperatures. In our context this means that the initial 
measure is Bernoulli, while the Glauber dynamics runs 
at low temperatures. Then spin clusters coarsen as time 
proceeds developing well-defined interfaces which are 
governed through motion by mean curvature. 

Close to a point of second-order phase transition, 
one has to deal with: 

5. Critical dynamics. The usual Glauber dynamics 
becomes very slow at the critical point and reliable 
equilibrium is hard to achieve. It is thus a challenge 
to design faster algorithms. One proposal is the 
Swendsen-Wang algorithm which is based on the 
Fortuin-Kasteleyn representation and flips a whole 
cluster of spins simultaneously. 

So far we concentrated on statistical properties. 
Researchers have been fascinated by the observation 
that for stochastic particle systems, the transition to a 
deterministic macroscopic evolution can be handled 
with full rigor. Such a program has been baptized: 

6. Hydrodynamic limit, which is meaningful only 
for particle systems with one or several conservation 
laws. Let us discuss then a reversible lattice gas with 
Hamiltonian $H$. We start the dynamics with a state
of local equilibrium which is Gibbs with a slowly
varying chemical potential, that is,

$$Z^{-1} \exp\Big[-\beta\Big(H(\eta) - \sum_x \mu(\epsilon x)\eta_x\Big)\Big], \quad \epsilon \ll 1 \qquad [17]$$

Such a measure is almost time invariant. For small $\epsilon$,
at least approximately, such a structure should
persist in the course of time at the expense of
properly regulating the chemical potential. For our
example, the correct timescale is $\epsilon^{-2}t$ in microscopic
units, and the evolution equation for the density $\rho$,
related thermodynamically to the chemical potential,
is a nonlinear diffusion equation of the form

$$\frac{\partial}{\partial t}\rho = \nabla \cdot D(\rho) \nabla \rho \qquad [18]$$
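
To make the structure of a nonlinear diffusion equation of this type concrete, here is a minimal numerical sketch in one dimension with periodic boundary conditions. It is an illustration only: the diffusivity `D(rho) = 1 + rho`, the grid, and the function names are assumptions chosen for the example, not quantities derived from any particular lattice gas. The scheme is written in flux form, so the total mass is conserved exactly, mirroring the conservation law of the particle system.

```python
import math

def diffusion_step(rho, D, dx, dt):
    """One explicit conservative step of d_t rho = d_x( D(rho) d_x rho ), periodic grid."""
    n = len(rho)
    flux = []
    for i in range(n):
        left, right = rho[i], rho[(i + 1) % n]
        d_face = D(0.5 * (left + right))             # diffusivity at the cell face
        flux.append(-d_face * (right - left) / dx)   # Fick flux through face i+1/2
    # each cell changes by the difference of the fluxes through its two faces;
    # flux[i - 1] wraps around for i = 0, which implements periodicity
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

def evolve(rho, D, dx, dt, steps):
    for _ in range(steps):
        rho = diffusion_step(rho, D, dx, dt)
    return rho

n = 64
dx = 1.0 / n
rho0 = [0.5 + 0.3 * math.sin(2 * math.pi * i * dx) for i in range(n)]
D = lambda r: 1.0 + r        # hypothetical diffusivity, for illustration only
dt = 0.1 * dx * dx           # inside the explicit stability limit for D <= 2
rho1 = evolve(rho0, D, dx, dt, 200)
```

The profile flattens while its integral stays fixed; a density-dependent `D` only changes how fast different density levels relax.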


We turn to the nonreversible models. 


Nonreversible Models 


While for reversible models the study of the 
stationary Gibbs measure is its own field of inquiry, 
here the first entry must be: 


7. Nonequilibrium steady state. This steady state is 
determined through the dynamics, since the stationary
measure $\mu$ has to satisfy $\mu(Lf) = 0$ for a sufficiently
large class of functions $f$. As in equilibrium, phase
transitions may occur. In the nonconservative case it 
would mean that the infinitely extended system has 
several extreme stationary measures. In the conserva- 
tive case, say with the density as locally conserved field, 
it would mean that there is an interval of densities for 
which there is no extreme stationary measure. Given 
the nonequilibrium steady state, one may wonder 
about its typical fluctuations and large deviations. In 
contrast to thermal equilibrium, weak long-range 
correlations are the rule. 

8. Spacetime correlations in the steady state.
Through the bulk drive the power-law decay of time
correlations may change. For example for the symmetric
and asymmetric exclusion process, the steady
states are Bernoulli with density $\rho$, denoted by $\langle \cdot \rangle_\rho$. For
the on-site density-density correlation, one finds, for
large $t$,

$$\langle \eta_0(t)\eta_0(0) \rangle_\rho - \rho^2 \cong \begin{cases} t^{-1/2} & \text{for } F = 0 \\ t^{-2/3} & \text{for } F \neq 0 \end{cases} \quad \text{in } d = 1 \qquad [19]$$

9. Hydrodynamic limit. The concept of slowly
varying conserved fields remains valid; only local
equilibrium must be replaced by local stationarity.
Generically, there are nonzero currents in the steady
state. Therefore, the macroscopic fields change on
the timescale $\epsilon^{-1}t$ (cf. item (6)) and are governed by
a hyperbolic conservation law of the form

$$\frac{\partial}{\partial t}\rho + \nabla \cdot j(\rho) = 0 \qquad [20]$$

in the case of a single conservation law. Here, $j(\rho)$ is
the average steady-state current in the stationary measure at
density $\rho$. Several conservation laws have an intriguingly
rich variety of solutions. Even on the level of
continuum partial differential equations, such systems
of hyperbolic conservation laws still pose
unresolved basic problems.
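
A single hyperbolic conservation law of this kind can be explored numerically with a few lines of code. The sketch below is an assumption-laden illustration, not part of the theory above: it takes the current $j(\rho) = \rho(1-\rho)$, which is the steady-state current of the totally asymmetric simple exclusion process with unit jump rate, and advances the density with the Lax-Friedrichs scheme on a periodic grid. The function names and the initial profile are hypothetical.

```python
def current(rho):
    """Steady-state current j(rho) = rho (1 - rho) of the totally asymmetric exclusion process."""
    return rho * (1.0 - rho)

def lax_friedrichs_step(rho, dx, dt):
    """One Lax-Friedrichs step for d_t rho + d_x j(rho) = 0 on a periodic grid."""
    n = len(rho)
    new = []
    for i in range(n):
        left, right = rho[i - 1], rho[(i + 1) % n]   # rho[-1] wraps around
        new.append(0.5 * (left + right)
                   - dt / (2.0 * dx) * (current(right) - current(left)))
    return new

n = 100
dx = 1.0 / n
dt = 0.5 * dx    # CFL condition: |j'(rho)| = |1 - 2 rho| <= 1 for densities in [0, 1]
rho0 = [0.8 if 0.25 <= i * dx < 0.75 else 0.2 for i in range(n)]
rho = rho0
for _ in range(100):
    rho = lax_friedrichs_step(rho, dx, dt)
```

Starting from a block profile, one edge of the block sharpens into a shock and the other spreads into a rarefaction fan, while the total mass stays constant and the density remains between its initial extremes.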


See also: Ginzburg-Landau Equation; Glassy Disordered 
Systems: Dynamical Evolution; Interacting Particle 
Systems and Hydrodynamic Equations; Macroscopic 
Fluctuations and Thermodynamic Functionals; Stochastic 
Differential Equations.


Introduction


Many important industrial problems involve flows 
with multiple constitutive components. Examples 
include extractors, separators, reactors, sprays, poly- 
mer blends, and microfluidic applications such as DNA 
analysis, and protein crystallization. Due to inherent 
nonlinearities, topological changes, and the complexity 
of dealing with unknown, active, and moving surfaces, 
multiphase flows are challenging. Much effort has been 
put into studying such flows through analysis, asymptotics,
and numerical simulation. Here, we review studies of
multicomponent fluids using
continuum numerical methods.

There are many ways to characterize moving 
interfaces. The two main approaches to simulating 
multiphase and multicomponent flows are interface 
tracking and interface capturing. In interface-tracking 
methods (examples include boundary-integral, 
volume-of-fluid, front-tracking, immersed-boundary, 
and immersed-interface methods), Lagrangian (or 
semi-Lagrangian) particles are used to track the 
interfaces. In boundary-integral methods (BIMs), the flow equations are mapped
from the immiscible fluid domains to the sharp 
interfaces separating them thus reducing the dimen- 
sionality of the problem (the computational mesh 
discretizes only the interface). In interface-capturing 
methods such as level-set and phase-field methods, 
the interface is implicitly captured by a contour of a 
particular scalar function. 

The equations governing the motion of an 
unsteady, viscous, incompressible, immiscible two- 
fluid system are the Navier-Stokes equations (the 
subscript i denotes the ith flow component): 


$$\rho_i\Big(\frac{\partial u_i}{\partial t} + u_i \cdot \nabla u_i\Big) = \nabla \cdot \sigma_i + \rho_i g, \quad i = 1,2 \qquad [1]$$

$$\sigma_i = -p_i I + 2\eta_i D_i \qquad [2]$$

where $\rho_i$ is the density, $u_i$ is the fluid velocity, $p_i$ is
the pressure, $\eta_i$ is the viscosity, and $g$ is the
gravitational acceleration vector. In eqn [2], $\sigma_i$ is
the stress tensor, $I$ is the identity matrix, and $D_i$ is
the rate of deformation tensor, defined as
$D_i = (1/2)(\nabla u_i + \nabla u_i^T)$. The velocity field is subject
to the incompressibility constraint,

$$\nabla \cdot u_i = 0 \qquad [3]$$


We let $\Gamma$ denote the fluid interface. The effect of
surface tension is to balance the jump of the normal
stress along the fluid interface. This gives rise to a
Laplace-Young condition for the discontinuity of
the normal stress across $\Gamma$:

$$[\sigma n]_\Gamma = \tau \kappa n \qquad [4]$$

where $[\sigma]_\Gamma$ denotes the jump $\sigma_2 - \sigma_1$ across $\Gamma$, $\kappa$ is
the curvature of $\Gamma$ (positive for a spherical interface),
$\tau$ is the surface tension coefficient which is assumed
to be constant, and $n$ is the unit normal vector along
$\Gamma$ directed toward fluid 2. The fluid velocity is
continuous across $\Gamma$.

In order to circumvent the problems associated 
with implementing the Laplace-Young calculation 
at the exact interface boundary, Brackbill and 
collaborators developed a method referred to as 
the continuum surface force (CSF) method. See the 
review by Scardovelli and Zaleski (1999). In this 
method, the surface tension jump condition is 
converted into an equivalent singular volume force 
that is added to the Navier-Stokes equations. 
Typically, the singular force is smoothed and acts 
only in a finite transition region across the interface. 
The system of equations [1]-[2] and the boundary
condition, eqn [4], can be combined into the
following distribution formulation that holds in
both phases:

$$\rho(u_t + u \cdot \nabla u) = -\nabla p + \nabla \cdot (2\eta D) + \rho g + F_{\rm sing}, \qquad \nabla \cdot u = 0 \qquad [5]$$

where the subscript $i$ is dropped (i.e., it is understood
that $u = u_i$ in fluid $i$, etc.) and $F_{\rm sing}$ is the singular
surface tension force, given by $F_{\rm sing} = -\tau\kappa\delta_\Gamma n$,
where $\delta_\Gamma$ is the surface delta-function.


Numerical Methods for Multicomponent 
Fluid Flows 


Interface-Tracking Methods 


Boundary-integral methods (BIMs) BIMs can be 
highly accurate for modeling free surface flows 
with relatively regular interface topologies. The 
BIM was apparently first used by Rosenhead in 
1932 to study vortex sheet roll-up. In this 
approach, the interface is explicitly tracked, but 
the flow solution in the entire domain is deduced 
solely from information possessed by discrete 
points along the interface. 

BIMs have been used for both inviscid and Stokes 
flows. For a review of Stokes flow computations, see 
Pozrikidis (2001), and for a review of computations 
of inviscid flows, see Hou et al. (2001). For flows 
with both inertia and viscosity, volume integrals 
must be incorporated into the formulation. 

When inertial forces are negligible (the left-hand side
of eqn [1] is dropped), the velocity $u(x_0)$ at a
given point $x_0$ on the interface can be obtained by
means of the boundary-integral formulation,

$$(\lambda + 1)\,u(x_0) = 2u_\infty(x_0) - \frac{1}{4\pi} \int_\Gamma f(x)\, G(x_0,x) \cdot n(x)\, ds(x) \qquad [6]$$

$$\qquad\qquad - \frac{\lambda - 1}{4\pi} \int_\Gamma u(x) \cdot T(x_0,x) \cdot n(x)\, ds(x) \qquad [7]$$

where $\lambda$ is the viscosity ratio, $u_\infty$ is an imposed
velocity prevailing in the absence of the interfaces, and
$f(x)$ is the capillary force function $f = \tau\kappa$. The tensors
$G$ and $T$ are the Stokeslet and stresslet, respectively:

$$G(x_0,x) = \frac{I}{r} + \frac{\hat{x}\hat{x}}{r^3}, \qquad T(x_0,x) = -6\,\frac{\hat{x}\hat{x}\hat{x}}{r^5} \qquad [8]$$

$$\text{where } \hat{x} = x - x_0, \quad r = |\hat{x}| \qquad [9]$$
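
The two kernels are simple to evaluate pointwise, which is what a boundary-integral code does for every pair of collocation and quadrature points. The sketch below assumes the three-dimensional forms $G = I/r + \hat{x}\hat{x}/r^3$ and $T = -6\,\hat{x}\hat{x}\hat{x}/r^5$ with $\hat{x} = x - x_0$; the function names are hypothetical, and the sign convention of the stresslet is the common one rather than something fixed by the text.

```python
import math

def stokeslet(x0, x):
    """G_ij = delta_ij / r + xh_i xh_j / r^3, with xh = x - x0 (3D free-space Stokeslet)."""
    xh = [a - b for a, b in zip(x, x0)]
    r = math.sqrt(sum(c * c for c in xh))
    return [[(1.0 if i == j else 0.0) / r + xh[i] * xh[j] / r**3
             for j in range(3)] for i in range(3)]

def stresslet(x0, x):
    """T_ijk = -6 xh_i xh_j xh_k / r^5, with xh = x - x0 (3D free-space stresslet)."""
    xh = [a - b for a, b in zip(x, x0)]
    r = math.sqrt(sum(c * c for c in xh))
    return [[[-6.0 * xh[i] * xh[j] * xh[k] / r**5
              for k in range(3)] for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    G = stokeslet((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
    T = stresslet((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
    print(G[0][0], G[1][1], T[0][0][0])
```

Both kernels blow up as $x \to x_0$ (like $1/r$ and $1/r^2$ respectively), which is why the quadrature of these integrals needs special care near the evaluation point.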


The boundary conditions at the interface, that is, the 
stress balance equation [4] and continuity of the 
velocity across the interface, are automatically 
satisfied by the boundary-integral formulation. 

The normal velocity of the interface $\Gamma(x,t)$ is
given by

$$\frac{dx}{dt} \cdot n(x) = u(x,t) \cdot n(x) \qquad [10]$$


The shape of the interface does not depend on the 
tangential velocity and there are many possible 
choices that can be taken, see Hou et al. (2001). 

The principal advantages gained by using BIMs 
are the reduction of the flow problem by one 
dimension since the formulation involves quantities 
defined on the interface only and the potential for 
highly accurate solutions if the flow has topologi- 
cally regular interfaces. In addition, highly efficient 
adaptive surface mesh refinement algorithms have 
recently been developed to improve the performance 
and accuracy of the methods (Cristini et al. 2001). 
The main disadvantages are the need to develop
accurate quadratures for integrals with singular
kernels (particularly in 3D) and the need for local
surgery of the interface in the event of topological
changes.

BIMs have been successfully used for simulations 
of complex multiphase flows: drop deformation and 
breakup; jets; capillary waves; mixing; drop-to-drop 
interaction; suspension of liquid drops in viscous 
flow (e.g., see Cristini et al. (2001), Hou et al. 
(2001), and Pozrikidis (2001) and the references 
therein). 


Volume-of-fluid (VOF) method In the VOF
method (see Scardovelli and Zaleski (1999) for a
recent review), the location of the interface is
determined by the volume fraction $c_{ij}$ of fluid 1 in
the computational cell $\Omega_{ij}$. In cells containing the
interface $0 < c_{ij} < 1$, $c_{ij} = 1$ in cells containing fluid 1,
and $c_{ij} = 0$ in cells containing fluid 2, as shown in
Figure 1b.

A VOF algorithm is divided into two parts: a
reconstruction step and a propagation step. A
typical interface reconstruction is shown in
Figure 1c. In the piecewise linear interface construction
(PLIC) method, the true interface, as shown in
Figure 1a, is approximated by a straight line
perpendicular to an interface normal vector $n_{ij}$ in
each cell $\Omega_{ij}$. The normal vector $n_{ij}$ is determined
from the volume fraction gradient using data from
neighboring cells. Given a volume fraction $c_{ij}$
and a normal vector $n_{ij}$, the interface is given by the
straight line with normal $n_{ij}$ such that the area beneath
the line in cell $\Omega_{ij}$ is equal to $c_{ij}$. More recently,
parabolic reconstructions of the interface have been
used to gain higher-order accuracy for the surface
tension force (e.g., the “parabolic reconstruction of
surface tension” or PROST algorithm).

Figure 1 VOF representation of an interface: (a) actual
interface, (b) volume fraction, and (c) an approximation to the
interface produced using an interface reconstruction method
such as piecewise linear approximation, as shown.
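
The piecewise linear reconstruction step amounts to solving a small geometric problem in every interface cell: given the normal and the volume fraction, find the line constant. The following sketch does this for a unit square cell under simplifying assumptions that are not part of the text: the normal components `m1`, `m2` are taken nonnegative (other orientations follow by reflecting the cell), fluid 1 is taken to lie on the side `m1*x + m2*y <= alpha`, and the constant is found by bisection rather than by a closed-form inversion. All names are hypothetical.

```python
def area_below(alpha, m1, m2):
    """Area of {(x, y) in [0,1]^2 : m1*x + m2*y <= alpha}, for m1, m2 >= 0, not both zero."""
    if alpha <= 0.0:
        return 0.0
    if alpha >= m1 + m2:
        return 1.0
    if m1 == 0.0:
        return alpha / m2
    if m2 == 0.0:
        return alpha / m1
    # triangle with legs alpha/m1 and alpha/m2, minus the parts outside the cell
    a = alpha * alpha
    a -= max(0.0, alpha - m1) ** 2
    a -= max(0.0, alpha - m2) ** 2
    return a / (2.0 * m1 * m2)

def plic_line_constant(c, m1, m2, iterations=60):
    """Bisection for alpha such that area_below(alpha, m1, m2) equals the volume fraction c."""
    lo, hi = 0.0, m1 + m2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if area_below(mid, m1, m2) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(plic_line_constant(0.5, 1.0, 1.0))
    print(plic_line_constant(0.3, 0.0, 1.0))
```

Because the enclosed area is a monotone function of the line constant, bisection always converges; production codes usually invert the piecewise quadratic relation analytically instead.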

Once the interface has been reconstructed, its 
motion by the underlying flow field must be 
modeled by a suitable advection algorithm. The 
key here is that the explicit interface reconstruction 
enables fluxes to be developed that exactly conserve 
mass and do not diffuse the interface. 

Capillary effects may be represented by the
continuous surface stress (Scardovelli and Zaleski
1999):

$$T = -\tau (I - n \otimes n)|\nabla \tilde{c}|, \qquad F_{\rm sing} = -\nabla \cdot T \qquad [11]$$

where $\tilde{c}$ is a smoothed version of the volume
fraction. For the flows in which the capillary force 
is the dominant physical mechanism, the PROST 
algorithm discussed above can be used to signifi- 
cantly reduce spurious currents due to inaccurate 
representation of surface tension terms and asso- 
ciated pressure jump in normal stress. 

The distribution form of the fluid equations [5] is 
typically solved using a variant of the projection 
method for incompressible single phase flows. 

VOF methods are popular and have been used in 
commercial multiphase flow codes, in models of 
inkjet printers, flows with surfactants and in many 
other applications (e.g., see Scardovelli and Zaleski 
(1999) and James and Lowengrub (2004) and the 
references therein). The principal advantage of VOF 
methods is their inherent volume-conserving prop- 
erty. Nevertheless, spurious bubbles and drops may 
be created. The reconstruction of the interface from 
the volume fractions and the computation of 
geometric quantities such as curvature are typically 
less accurate than other methods discussed here
since the curvature and normal vectors are obtained
by differentiating a nearly discontinuous function
(volume fraction).


Front-tracking methods The basic idea behind the 
original front-tracking method is the use of two 
grids as illustrated in Figure 2. One is a standard, 
Eulerian finite difference mesh that is used to solve 
the fluid equations. The other is a discretized 
interface mesh that is used to explicitly track the 
interface and compute surface tension force which is 
then transferred to the finite difference mesh via a 
discrete delta-function. Front tracking was first 
proposed by Richtmyer and Morton and further 
developed by Glimm and co-workers. 

A similar approach was taken by Unverdi and 
Tryggvason (see Tryggvason et al. (2001) and Peskin 
(2002) for recent reviews), who combined a moving 
grid description of the interface with flow computa- 
tions on a fixed grid. In this immersed-boundary 
approach, all the fluid phases are treated together by 
solving a single set of governing equations. This 
method has its roots in the original marker-and-cell 
(MAC) method, where marker particles are used to 
identify each fluid and the immersed-boundary 
method of Peskin and McQueen, that was designed 
to track moving elastic boundaries in homogeneous 
fluids. 

The interface is represented discretely by Lagran- 
gian markers that are connected to form a front 
which lies within and moves through a stationary 
Eulerian mesh. 

In Tryggvason's original implementation, the 
basic structural unit is a line segment. Since the 
interface moves and deforms during the computa- 
tion, interface elements must occasionally be added 
or deleted to maintain regularity and stability. In the 
event of merging/breakup, elements must be relinked 
to effect a change in topology. 

The interface is represented using an ordered list
of marker particles $x_k = ((x_1)_k, (x_2)_k)$, $1 \le k \le N$.

Figure 2 (a) The basic idea in the front-tracking method is to use two grids: a stationary finite difference mesh and a moving
Lagrangian mesh, which is used to track the interface. (b) Blow-up of the subgrid control volume in (a). (c) Control volume for the
Eulerian mesh.

The first step in this algorithm is the advection of the
marker particles. A simple bilinear interpolation is
used to find the velocity inside each grid cell (indicated
in Figure 2c). The marker particles are then advected in
a Lagrangian manner. Once the points have been
advected, a list of connected polynomials $(p_k^1(s), p_k^2(s))$
is constructed using the marker particles. This gives a
parametric representation of the interface, with $s$
typically an approximation of the arclength. Both
lists are ordered and thus identify the topology of the
interface. In later works, higher-order polynomials
have been used (e.g., cubic splines) and semi-Lagrangian
evolutions have been implemented where other
tangential velocities have been used.
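
The advection step can be sketched in a few lines. The example below is a simplified illustration under assumptions that go beyond the text: velocities are stored at the nodes of a uniform collocated grid (a staggered grid would need one interpolation per component), markers are assumed to lie inside the grid, and time stepping is forward Euler. The function names are hypothetical.

```python
def bilinear_velocity(u_nodes, v_nodes, h, x, y):
    """Interpolate nodal velocities, stored at (i*h, j*h), to the point (x, y)."""
    i = min(int(x / h), len(u_nodes) - 2)      # index of the cell containing the point
    j = min(int(y / h), len(u_nodes[0]) - 2)
    s, t = x / h - i, y / h - j                # local coordinates in [0, 1]

    def interp(f):
        return ((1 - s) * (1 - t) * f[i][j] + s * (1 - t) * f[i + 1][j]
                + (1 - s) * t * f[i][j + 1] + s * t * f[i + 1][j + 1])

    return interp(u_nodes), interp(v_nodes)

def advect_markers(markers, u_nodes, v_nodes, h, dt):
    """Forward-Euler Lagrangian update of an ordered list of marker positions."""
    moved = []
    for (x, y) in markers:
        u, v = bilinear_velocity(u_nodes, v_nodes, h, x, y)
        moved.append((x + dt * u, y + dt * v))
    return moved

if __name__ == "__main__":
    n, h = 5, 0.25
    u_nodes = [[j * h for j in range(n)] for i in range(n)]   # simple shear: u = y
    v_nodes = [[0.0] * n for _ in range(n)]
    print(advect_markers([(0.3, 0.6)], u_nodes, v_nodes, h, 0.1))
```

Since the marker list keeps its order under this update, the connectivity of the front is preserved automatically; only the spacing of the markers changes.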

As the interface evolves, the markers drift along 
the interface following tangential velocities and 
more markers may be needed if the interface is 
stretched by the flow. Typically, the markers are 
redistributed along the interface to maintain an 
accurate interface representation. 

Next, we compute the surface tension force,

$$F_{\rm sing}(x,t) = \int_{\Gamma(t)} \tau \kappa_f\, \delta(x - x_f(s))\, n_f\, ds \qquad [12]$$

where the subscript $f$ means values evaluated at the
interface $\Gamma(t)$ and $s$ is arclength. The discrete
numerical implementation of this distribution onto
the fixed grid is in the form of a sum over interface
elements, $x_{f,k}$:

$$F(x) = \sum_k f_k\, \delta(x - x_{f,k})\, \Delta s_k \qquad [13]$$

where $\Delta s_k$ is the average of the straight line
distances from the point $x_{f,k}$ to the two neighboring
points $x_{f,k+1}$ and $x_{f,k-1}$, as indicated by the subgrid
control volume shown in Figures 2a and 2b. The
delta-function is typically taken to be Peskin’s
discrete Dirac delta-function:

$$\delta(x - x_{f,k}) = \begin{cases} \prod_{i=1}^{2} \dfrac{1}{4h}\Big[1 + \cos\dfrac{\pi (x_i - (x_i)_{f,k})}{2h}\Big] & \text{if } |x - x_{f,k}| \le 2h \\ 0 & \text{otherwise} \end{cases} \qquad [14]$$

where $h$ is the spacing of the Eulerian grid.

Other higher-order alternative forms of the regular- 
ized delta-function using the product formula have 
recently been proposed. 

Using the Frenet relation, the surface tension force
on a short segment of the front is given by

$$f_k = \int_A^B \tau \kappa_f n_f\, ds = \tau \int_A^B \frac{\partial t_f}{\partial s}\, ds = \tau (t_B - t_A) \qquad [15]$$

where $A$ and $B$ are the segment endpoints that lie
on the boundary of the subgrid control volume
(Figures 2a and 2b), and $t_f$ is a tangent vector
computed by fitting a polynomial to the endpoints
of each element.
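
The tangent-difference form of the force is easy to try out on a closed polygonal front. In the sketch below the tangents at the element endpoints are approximated by the unit chord directions between neighboring markers, which is the lowest-order choice and an assumption made here for brevity; `tau` is the surface tension coefficient and the function names are hypothetical.

```python
import math

def element_forces(markers, tau):
    """f_k = tau * (t_B - t_A) for every marker of a closed, ordered front."""
    n = len(markers)

    def unit_tangent(a, b):
        dx, dy = b[0] - a[0], b[1] - a[1]
        length = math.hypot(dx, dy)
        return dx / length, dy / length

    forces = []
    for k in range(n):
        prev_pt, pt, next_pt = markers[k - 1], markers[k], markers[(k + 1) % n]
        t_a = unit_tangent(prev_pt, pt)    # tangent on the side of the previous marker
        t_b = unit_tangent(pt, next_pt)    # tangent on the side of the next marker
        forces.append((tau * (t_b[0] - t_a[0]), tau * (t_b[1] - t_a[1])))
    return forces

if __name__ == "__main__":
    N, R = 40, 0.5
    circle = [(R * math.cos(2 * math.pi * k / N), R * math.sin(2 * math.pi * k / N))
              for k in range(N)]
    forces = element_forces(circle, tau=1.0)
    print(sum(f[0] for f in forces), sum(f[1] for f in forces))
```

Because the contributions telescope around a closed curve, the net surface tension force vanishes identically, a conservation property that curvature-based discretizations do not guarantee.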

In the case of flows with varying density and/or
viscosity between the fluid components, there is a
need to calculate the phase indicator function $I(x,t)$
(defined by interface geometry and position), which
has the value 0 in fluid 1 and 1 in fluid 2. The
indicator function can be determined via the
solution of the equation

$$\nabla^2 I(x,t) = \nabla \cdot \int_{\Gamma(t)} n_f\, \delta(x - x_f(s,t))\, ds \qquad [16]$$

This equation is discretized on the Eulerian mesh
and a discrete delta-function (e.g., eqn [14]) is used.
The fluid properties such as density and viscosity are
determined via the indicator function, that is,
$\rho(x,t) = \rho_1 + (\rho_2 - \rho_1) I(x,t)$, etc.

As in the volume-of-fluid algorithm, the distribution
form of the Navier-Stokes equations [5] is
typically solved using a version of Chorin's projection
method.

An alternative flow solver that can be used to 
integrate the flow equations in the presence of an 
interface is the immersed-interface method (IIM). 
The IIM was developed by LeVeque and Li (see the
review Li 2003), and can be used together with 
front-tracking as well as level-set methods. 

The IIM directly incorporates jump conditions for 
the normal stress into the finite difference stencil. The 
key idea of this method is to use the jump conditions 
in Taylor series expansions of pressure and velocity 
near interfaces to derive difference equations that 
achieve pointwise second-order accuracy. 

The principal advantage of front-tracking algo- 
rithms is their inherent accuracy, due in part to the 
ability to use a large number of grid points on the 
interface. Front-tracking methods can be compli- 
cated to implement, particularly in 3D, but give the 
precise location and geometry of the interface. In 
addition, explicit front tracking permits more than 
one interface to be present in a single computational 
cell without coalescence, which can be important in 
dense bubbly flows, emulsions, etc. One of the major
handicaps of front-tracking methods is the difficulty
of modeling topological changes of the interface,
such as breakup and coalescence, without ad hoc
cut-and-connect procedures for reconnecting the
parameterized interface (a difficulty that is
particularly severe in 3D).


Interface-Capturing Methods 


Level-set method Level-set methods, introduced by
Osher and Sethian (see the recent review papers
(Osher and Fedkiw 2001, Sethian and Smereka
2003) and the recent texts (Osher and Fedkiw
2002, Sethian 1999)), are popular computational
techniques for tracking moving interfaces. These
methods rely on an implicit representation of the
interface as the zero set of an auxiliary function
(level-set function). The application of these methods
to incompressible, multiphase flows started with
the work of Osher, Merriman, Sussman, Smereka,
Hou, and their collaborators.

Figure 3 (a) Zero contour of $\phi$ representing the interface $\Gamma$.
(b) Surface of $\phi$ with zero contour.

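In the level-set method, the level-set function
$\phi(x,t)$ is defined as follows (see Figure 3):

$$\phi(x,t) \begin{cases} > 0 & \text{if } x \in \text{fluid 1} \\ = 0 & \text{if } x \in \Gamma \text{ (the interface between fluids)} \\ < 0 & \text{if } x \in \text{fluid 2} \end{cases}$$

and the evolution of $\phi$ is given by

$$\phi_t + u \cdot \nabla \phi = 0 \qquad [17]$$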


which means that the interface moves with the fluid.
To keep the interface geometry well resolved, the
level-set function $\phi$ should be a distance function near
the interface. However, under the evolution [17] it
will not necessarily remain as such. We note that
special velocity extensions $v$ off the interface (i.e.,
$v = u$ at the interface, $v \neq u$ away from the interface)
have been recently developed to better maintain $\phi$ as
a distance function (e.g., Sethian and Smereka (2003)
and Macklin and Lowengrub (2005)). Typically, a
reinitialization step (solving a Hamilton-Jacobi type
equation, eqn [18] below) is performed to keep $\phi$ as
a distance function near the interface while keeping
the original zero-level set unchanged. More specifically,
given a level-set function, $\phi$, at time $t$, the contours
are redistributed according to the steady-state solution
of the equation

$$\frac{\partial d}{\partial \tau} = S_\epsilon(\phi)(1 - |\nabla d|), \qquad d(x,0) = \phi(x) \qquad [18]$$

where $\tau$ here denotes a pseudo-time and $S_\epsilon$ is the
smoothed sign function defined as

$$S_\epsilon(\phi) = \frac{\phi}{\sqrt{\phi^2 + \epsilon^2}} \qquad [19]$$

where $\epsilon$ is usually one or two grid lengths. After
solving eqn [18] to steady state, $\phi(x,t)$ is
replaced by $d(x, \tau_{\rm steady})$. Note that $d(x, \tau_{\rm steady})$ is
typically a good approximation of the signed
distance function.
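
The reinitialization idea can be demonstrated in one dimension, where the signed distance function is known in closed form. The sketch below is a minimal illustration under several assumptions of its own: a smoothed sign of the form $\phi/\sqrt{\phi^2+\epsilon^2}$, first-order Godunov upwinding for $|\nabla d|$, forward Euler in pseudo-time, and one-sided differences copied across the domain ends. Function and parameter names are hypothetical.

```python
import math

def reinitialize_1d(phi0, dx, eps, dtau, steps):
    """Relax d toward a signed distance function with the same zero as phi0."""
    n = len(phi0)
    sign = [p / math.sqrt(p * p + eps * eps) for p in phi0]   # smoothed sign of phi0
    d = list(phi0)
    for _ in range(steps):
        new = []
        for i in range(n):
            # backward (a) and forward (b) differences, copied at the domain ends
            a = (d[i] - d[i - 1]) / dx if i > 0 else (d[1] - d[0]) / dx
            b = (d[i + 1] - d[i]) / dx if i < n - 1 else a
            if sign[i] > 0:   # upwind direction points toward smaller d
                grad = math.sqrt(max(max(a, 0.0) ** 2, min(b, 0.0) ** 2))
            else:             # upwind direction points toward larger d
                grad = math.sqrt(max(min(a, 0.0) ** 2, max(b, 0.0) ** 2))
            new.append(d[i] + dtau * sign[i] * (1.0 - grad))
        d = new
    return d

dx = 0.01
xs = [(i - 50) * dx for i in range(101)]
phi0 = [3.0 * x for x in xs]          # correct zero at x = 0, but slope 3 instead of 1
d = reinitialize_1d(phi0, dx, eps=dx, dtau=0.5 * dx, steps=400)
```

In this example the zero crossing stays at the same grid node while the slope relaxes from 3 to 1, starting at the interface and spreading outward, which is why only a few pseudo-time steps are needed when the distance property matters only near the interface.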

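The density and viscosity are defined as

$$\rho(\phi) = \rho_2 + (\rho_1 - \rho_2) H_\epsilon(\phi)$$

and

$$\eta(\phi) = \eta_2 + (\eta_1 - \eta_2) H_\epsilon(\phi) \qquad [20]$$

where $H_\epsilon(\phi)$ is the smoothed Heaviside function
given by

$$H_\epsilon(\phi) = \begin{cases} 0 & \text{if } \phi < -\epsilon \\ \dfrac{1}{2}\Big[1 + \dfrac{\phi}{\epsilon} + \dfrac{1}{\pi}\sin\Big(\dfrac{\pi\phi}{\epsilon}\Big)\Big] & \text{if } |\phi| \le \epsilon \\ 1 & \text{if } \phi > \epsilon \end{cases}$$

The mollified delta-function is $\delta_\epsilon(\phi) = dH_\epsilon/d\phi$. The
surface tension force is given as

$$F_{\rm sing} = -\tau\, \nabla \cdot \Big(\frac{\nabla\phi}{|\nabla\phi|}\Big)\, \delta_\epsilon(\phi)\, \frac{\nabla\phi}{|\nabla\phi|} \qquad [21]$$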
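
The smoothed Heaviside and delta-functions translate directly into code. The sketch below follows the sine-regularized form with half-width `eps` and uses it to blend a material property between the two fluids, with fluid 1 on the side where the level-set function is positive; the function names and the sample property values are hypothetical.

```python
import math

def heaviside_eps(phi, eps):
    """Smoothed Heaviside function: 0 below -eps, 1 above eps, smooth in between."""
    if phi < -eps:
        return 0.0
    if phi > eps:
        return 1.0
    return 0.5 * (1.0 + phi / eps + math.sin(math.pi * phi / eps) / math.pi)

def delta_eps(phi, eps):
    """Mollified delta-function, the derivative of heaviside_eps with respect to phi."""
    if abs(phi) > eps:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * phi / eps)) / eps

def blend(value1, value2, phi, eps):
    """Property equal to value1 where phi > eps (fluid 1) and value2 where phi < -eps (fluid 2)."""
    return value2 + (value1 - value2) * heaviside_eps(phi, eps)

if __name__ == "__main__":
    eps = 0.1
    for phi in (-0.5, -0.05, 0.0, 0.05, 0.5):
        print(phi, heaviside_eps(phi, eps), delta_eps(phi, eps), blend(1000.0, 1.0, phi, eps))
```

The mollified delta-function integrates to one across the transition layer, so the smoothed surface tension force carries the same total strength as the sharp one.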


The fluid equations [5] are solved using projection 
methods, the IIM or the ghost-fluid (GF) method 
(e.g., Osher and Fedkiw (2001, 2002) and Fedkiw 
et al. (2003)). The GF method is similar to the IIM
in that jump discontinuities are incorporated in the 
finite difference stencil. In the GF algorithm, subcell 
resolution is used to mark the interface position and 
the values of discontinuous quantities are artificially 
extended to grid points neighboring the interface via 
extrapolation. A fully second order accurate GF 
method for moving interfaces has recently been 
developed (Macklin and Lowengrub 2005). 

Applications of the level-set method include 
multiphase flows, viscoelastic fluid flows and fluid- 
structure interactions (e.g., see the reviews Osher and 
Fedkiw (2001, 2002), Sethian (1999), and Sethian 
and Smereka (2003)). 

Advantages of the level-set algorithm include the 
simplicity with which it can be implemented, the 
ability to capture merging and breakup of interfaces 
automatically, and the ease with which the interface 
geometry can be described using the level-set 
function. A disadvantage of the level-set method is 
that mass is not conserved. 

Accurate numerical simulations of multiphase
flow and topology transitions require the computational
mesh to resolve both the macroscales (e.g.,
droplet size, flow geometry) and the microscales to
accurately capture local interface geometries near
the contact region, van der Waals forces, surfactant
distribution, and Marangoni stresses. Adaptive mesh
algorithms have recently been used to greatly
increase accuracy and computational efficiency in
level-set methods. Typically, the methods involve
Cartesian adaptive mesh refinement. Problems
tackled using this approach include droplet formation
in inkjet printers and wake development behind
a ship. Another approach, recently developed, is to
use adaptive unstructured mesh refinement (Zheng
et al. 2005), as shown in Figure 4, in which the
impact of a drop onto a fluid interface is captured.

Figure 4 Each of the first three figures has a boxed region that is magnified in the next figure. The magnification factors are 5, 10, and
40/3, respectively. The meshes in the figure are used to simulate the drop-impacting interface problem. Source: Zheng X, Anderson A,
Lowengrub JS, and Cristini V (unpublished).


Hybrid Methods 


More recently, a number of hybrid methods, which 
combine good features of each algorithm, have been 
developed. These include coupled level-set volume-
of-fluid (CLSVOF) algorithms, particle level-set
methods, marker-VOF methods and level-contour 
front-tracking methods. 

Level-set and VOF methods have recently been 
combined. The volume fraction is used to maintain 
volume conservation, while the level-set function is 
used to describe the interface geometry. After every 
time step, the volume-fraction function and level-set 
function are made compatible. The coupling 
between the level-set function φ and the VOF
function c occurs through the normal of the 
reconstructed interface and through the fact that 
the level-set function is reset to the exact signed 
normal distance to the reconstructed interface 
(where the area below the reconstructed interface is 
given by the volume-fraction function). 

In the particle level-set method, Lagrangian 
disconnected marker particles are randomly posi- 
tioned near the interface and are passively advected 
by the flow in order to rebuild the level-set function 
in under-resolved zones, such as high-curvature 
regions and near filaments. In these regions, the
standard nonadaptive level-set method excessively
regularizes the interface structure, and mass is lost.
The use of marker particles significantly ameliorates 
these difficulties. 


Recently, a hybrid method has been developed, 
which uses both marker particles, to reconstruct and 
move the interface, and the volume-fraction function 
to conserve volume. In this approach, a smooth
motion of the interface, typical of marker methods, is
obtained together with volume conservation, as in
standard VOF methods. This work improves both 
the accuracy of interface tracking, when compared 
to standard VOF methods, and the conservation of 
mass, with respect to the original marker method. 

Finally, a hybrid method that combines a level 
contour reconstruction technique with front-tracking 
methods has recently been developed to auto- 
matically model the merging and breakup of inter- 
faces in three-dimensional flows. 


Phase-Field Method 


Phase-field, or diffuse-interface, models are an 
increasingly popular choice for modeling the motion 
of multiphase fluids (see Anderson et al. (1998) for a 
recent review). In the phase-field model, sharp fluid 
interfaces are replaced by thin but nonzero thickness 
transition regions where the interfacial forces are 
smoothly distributed. The basic idea is to introduce 
a conserved order parameter (e.g., mass concentra- 
tion) that varies continuously over thin interfacial 
layers and is mostly uniform in the bulk phases (see 
Figure 5). 

For density-matched binary liquids (let ρ = 1
for simplicity), the coupling of the convective 
Cahn-Hilliard equation for the mass concentration 
with a modified momentum equation that includes a 
phase-field-dependent surface force is known as 
Model H (Hohenberg and Halperin 1977). In the 
case of fluids with different densities a phase-field 
model has been proposed by Lowengrub and 
Truskinovsky. Complex flow morphologies and 
topological transitions such as coalescence and 
interface breakup can be captured naturally and in 
a mass-conservative and energy-dissipative fashion 
since there is an associated free energy functional. 


Figure 5 A concentration profile across an interface with
interface thickness ξ.


The phase field is governed by the following
advective Cahn-Hilliard equation:


$$\frac{\partial c}{\partial t} + \mathbf{u} \cdot \nabla c = \nabla \cdot \left( M(c) \nabla \mu \right)$$ [22]


$$\mu = F'(c) - \epsilon^2 \Delta c$$ [23]


where M(c) = c(1 − c) is the mobility, F(c) =
(1/4)c²(1 − c)² is a Helmholtz free energy that
describes the coexistence of immiscible phases, and
ε is a measure of interface thickness, with ε ~ ξ (see
Figure 5). It can be shown that in the sharp-interface
limit ε → 0, the classical Navier-Stokes
equations and jump conditions are recovered.
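
As a consistency check on the chemical potential in eqn [23], the following Python sketch verifies numerically that a planar tanh profile of the kind used as initial data in Figure 6 makes the chemical potential vanish, so that it is an equilibrium of the Cahn-Hilliard equation with no flow. It assumes the standard double-well form F(c) = c²(1 − c)²/4; the function names, the finite-difference step and the sampling window are illustrative choices, not part of the source.

```python
import math


def dF(c):
    """Derivative F'(c) of the assumed double-well energy F(c) = c^2 (1 - c)^2 / 4."""
    return 0.5 * c * (1.0 - c) * (1.0 - 2.0 * c)


def tanh_profile(x, eps):
    """Planar profile c(x) = 0.5 * (1 - tanh(x / (2 * sqrt(2) * eps)))."""
    return 0.5 * (1.0 - math.tanh(x / (2.0 * math.sqrt(2.0) * eps)))


def chemical_potential(x, eps, h=1.0e-3):
    """mu = F'(c) - eps^2 c'' in one dimension; c'' comes from a central difference."""
    c_minus = tanh_profile(x - h, eps)
    c_mid = tanh_profile(x, eps)
    c_plus = tanh_profile(x + h, eps)
    c_xx = (c_plus - 2.0 * c_mid + c_minus) / (h * h)
    return dF(c_mid) - eps * eps * c_xx


def max_residual(eps=0.1, half_width=1.0, samples=201):
    """Largest |mu| on a symmetric window centered on the interface."""
    xs = [-half_width + 2.0 * half_width * i / (samples - 1) for i in range(samples)]
    return max(abs(chemical_potential(x, eps)) for x in xs)
```

The residual returned by `max_residual` is limited only by the finite-difference error, which illustrates why the transition layer has a width proportional to ε.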

The singular surface tension force is F_sing =
−6√2 τ ε ∇·(∇c ⊗ ∇c), where τ is the surface tension
coefficient. An alternative surface tension force
formulation based on the CSF is F_sing = −6√2 τ ε ∇·
(∇c/|∇c|)|∇c|∇c.

Recently, very efficient nonlinear multigrid methods
have been developed to solve implicit discretizations
of the Cahn-Hilliard equation (e.g., Kim et al.
(2004)). These schemes have been combined with 
projection methods to solve the Navier-Stokes 
equations to perform simulations of multiphase 
flows. 

An example of a simulation of liquid thread breakup
using a phase-field method is shown in Figure 6.
A long cylindrical thread of a viscous fluid 1 is in an
infinite mass of another viscous fluid 2. If the thread
becomes varicose with wavelength λ, the equilibrium
of the column is unstable, provided λ exceeds the
circumference of the cylinder. This is the Rayleigh
capillary instability that results in surface-tension-driven
breakup of the thread.
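
The stability criterion just stated is simple enough to write down directly. The short Python sketch below encodes it; the function name is illustrative, and the example values (thread radius 0.5 and a cos(z) perturbation, hence wavelength 2π) are read off the initial condition quoted in the caption of Figure 6.

```python
import math


def is_rayleigh_unstable(wavelength, radius):
    """True when a varicose perturbation is longer than the thread circumference."""
    return wavelength > 2.0 * math.pi * radius


# Values taken from the Figure 6 initial condition: radius 0.5, perturbation cos(z).
figure6_unstable = is_rayleigh_unstable(wavelength=2.0 * math.pi, radius=0.5)
```

For these values the wavelength is twice the circumference, so the thread is expected to break up, in agreement with the pinch-offs reported for that simulation.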

Figure 6 Time evolution leading to multiple pinch-offs. The
evolution is from top to bottom and left to right. The domain is
axisymmetric, the initial velocities are zero everywhere, and the
concentration field is given by c(r, z) = 0.5(1 − tanh((r − 0.5 −
0.05 cos(z))/(2√2 ε))) on Ω = (0, π) × (0, 2π). Densities are
matched and the viscosity ratio is 0.5.

An advantage of the phase-field approach is that it
is straightforward to include more complex physical
effects. For example, the binary model can be
straightforwardly extended to describe three-component
flows as follows.

Consider a ternary mixture and denote the 
composition of components 1, 2, and 3, expressed 
as mass fractions, by c1,c2, and c3, respectively. 


Therefore,


$$\sum_{i=1}^{3} c_i = 1, \qquad 0 \le c_i \le 1$$ [24]


The composition of a ternary mixture (A, B, and C) 
can be mapped onto an equilateral triangle (the 
Gibbs triangle (Porter and Easterling 1993)) whose 
corners represent 100% concentration of A, B, or C 
as shown in Figure 7a. Mixtures with components 
lying on lines parallel to BC contain the same 
percentage of A, those with lines parallel to AC have 
the same percentage of B concentration, and 
analogously for the C concentration. In Figure 7a, 
the mixture at the position marked 'o' contains 60% A,
10% B, and 30% C. Because the concentrations sum
to unity, only two of them need to be determined,
say c1 and c2.
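
The mapping from a composition to a point of the Gibbs triangle is a barycentric combination of the three corners. The Python sketch below illustrates it; the corner coordinates (a unit-side triangle with A at the top) and the function names are assumptions made for the example, since the source does not fix a coordinate system.

```python
import math

# Assumed corner placement: unit-side equilateral triangle with A at the apex.
VERTEX_A = (0.5, math.sqrt(3.0) / 2.0)
VERTEX_B = (0.0, 0.0)
VERTEX_C = (1.0, 0.0)


def gibbs_point(c_a, c_b, c_c, tol=1.0e-12):
    """Cartesian position of the mixture (c_a, c_b, c_c) on the Gibbs triangle."""
    if abs(c_a + c_b + c_c - 1.0) > tol:
        raise ValueError("mass fractions must sum to one")
    if min(c_a, c_b, c_c) < -tol:
        raise ValueError("mass fractions must be non-negative")
    x = c_a * VERTEX_A[0] + c_b * VERTEX_B[0] + c_c * VERTEX_C[0]
    y = c_a * VERTEX_A[1] + c_b * VERTEX_B[1] + c_c * VERTEX_C[1]
    return x, y


def fraction_of_a(point):
    """Recover the A fraction from the height of the point above side BC."""
    return point[1] / VERTEX_A[1]


# The mixture quoted in the text: 60% A, 10% B, 30% C.
marked_point = gibbs_point(0.6, 0.1, 0.3)
```

Because the height above side BC depends only on the A fraction, all mixtures on a line parallel to BC share the same percentage of A, as stated above.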

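Figure 7 (a) Gibbs triangle. (b) Contour plot of the free energy
F(c1, c2) on the Gibbs triangle.

The evolution of c1 and c2 is governed by the
following advective ternary Cahn-Hilliard equation:


$$\frac{\partial c_1}{\partial t} + \mathbf{u} \cdot \nabla c_1 = \nabla \cdot \left( M(c_1, c_2) \nabla \mu_1 \right)$$ [25]

$$\frac{\partial c_2}{\partial t} + \mathbf{u} \cdot \nabla c_2 = \nabla \cdot \left( M(c_1, c_2) \nabla \mu_2 \right)$$ [26]

$$\mu_1 = \frac{\partial F(c_1, c_2)}{\partial c_1} - \epsilon^2 \Delta c_1 - 0.5\, \epsilon^2 \Delta c_2$$ [27]

$$\mu_2 = \frac{\partial F(c_1, c_2)}{\partial c_2} - 0.5\, \epsilon^2 \Delta c_1 - \epsilon^2 \Delta c_2$$ [28]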


where M(c1, c2) = Σ_{i<j} c_i c_j is the mobility and
F(c1, c2) is the Helmholtz free energy that can be
used to model the miscibility of the components. An
example of a free energy (used in the simulation
shown in Figure 8 below) for which fluids 1 and 3
are immiscible and fluid 2 is preferentially miscible
with fluid 3 is:


$$F(c_1, c_2) = 2 c_1^2 (1 - c_1 - c_2)^2 + (c_1 + 0.2)(c_2 - 0.2)^2 + (1.2 - c_1 - c_2)(c_2 - 0.4)^2$$


The contours of F on the Gibbs triangle are shown 
in Figure 7b. 

The singular surface tension force is F_sing =
−6√2 ε Σ_{i=1}^{3} τ_i ∇·(∇c_i ⊗ ∇c_i), where the physical
surface tension coefficients τ_ij between two fluids i
and j are decomposed into the phase-specific surface
tensions τ_i such that τ_ij = τ_i + τ_j.
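
For three phases, the decomposition of the pairwise surface tensions into phase-specific ones is a linear system of three equations in three unknowns, and it can be inverted in closed form. The Python sketch below does so; the function name and the numerical values in the example are illustrative, not taken from the source.

```python
def phase_specific_tensions(tau_12, tau_13, tau_23):
    """Solve tau_ij = tau_i + tau_j for the three phase-specific tensions."""
    tau_1 = 0.5 * (tau_12 + tau_13 - tau_23)
    tau_2 = 0.5 * (tau_12 + tau_23 - tau_13)
    tau_3 = 0.5 * (tau_13 + tau_23 - tau_12)
    return tau_1, tau_2, tau_3


# Hypothetical pairwise coefficients, chosen only to exercise the formula.
tau_1, tau_2, tau_3 = phase_specific_tensions(1.0, 0.8, 0.6)
```

Note that a phase-specific tension can come out negative when one pairwise coefficient exceeds the sum of the other two; the sketch does not guard against that case.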


As a demonstration of the evolution possible in 
partially miscible liquid systems, we present an 
example in which there is a  gravity-driven 
(Rayleigh-Taylor) instability that enhances the 
transfer of a preferentially miscible contaminant 
from one immiscible fluid to another in 2D. In this 
system, the ternary Cahn-Hilliard system is solved 
using nonlinear multigrid methods and a projection 
method (Kim and Lowengrub (in press)) is used to 
solve the flow equations [5]. 

In Figure 8 (first column), the top half of the domain 
initially consists of a mixture of fluids 1 and 2, 
and the bottom half consists of fluid 3, which is 
immiscible with fluid 1. The contours of c1, c2, and c3
are visualized in gray-scale where darker regions
denote larger values of c1, c2, and c3, respectively.
In the top row, the contours of fluid 1 are shown; the
middle and bottom rows correspond to fluids 2 and 3,
respectively.

Fluid 2 is preferentially miscible with fluid 3. 
Fluid 1 is assumed to be the lightest and fluid 2 the 
heaviest. The 1/2 mixture is denser
than fluid 3, so the density gradient induces
the Rayleigh-Taylor instability.

The evolution of the three phases is shown in
Figure 8. As the simulation begins, the 1/2 mixture
falls and fluid 2 diffuses into fluid 3. A characteristic
Rayleigh-Taylor (inverted) mushroom forms, the
surface area of the 1/3 interface increases, and
vorticity is generated and shed into the bulk.
As fluid 2 diffuses out of fluid 1, the pure fluid
1 rises to the top as shown in Figure 8. If
fluid 2 is regarded as a contaminant in fluid 1, this
configuration provides an efficient means of cleansing
fluid 1, since the buoyancy-driven flow enhances
the diffusional transfer of fluid 2 from fluid 1 to
fluid 3.

Figure 8 Evolution of concentration of fluid 1 (top row), 2 (middle row), and 3 (bottom row). The contours of c1, c2, and c3 are
visualized in gray-scale where darker regions denote larger values of c1, c2, and c3, respectively.

The advantages of the phase-field method are:
(1) topology changes are automatically described;
(2) the composition field c has a physical meaning
not only near the interface but also in the bulk phases; and
(3) complex physics can easily be incorporated into
the framework: the methods can be straightforwardly
extended to multicomponent systems, and miscible,
immiscible, partially miscible, and lamellar phases
can be modeled.

Associated with diffuse interfaces is a small scale
ε, proportional to the width of the interface. In real
physical systems describing immiscible fluids, ε can
be vanishingly small. However, for numerical
accuracy ε must be at least a few grid lengths in
size. This can make computations expensive. One
way of ameliorating this problem is to adaptively 
refine the grid only near the transition layer. Such 
methods are under development by various research 
groups. 

Phase-field methods have been used to model
viscoelastic flow, thermocapillary flow, spinodal
decomposition, mixing and interfacial stretching
in a shear flow, droplet breakup,
wave-breaking and sloshing, the fluid motion near
a moving contact line, and the nucleation and
annihilation of an equilibrium droplet (see the
references in the review paper Anderson et al.
(1998)).


Conclusions and Future Directions 


In this paper we have reviewed the basic ideas of 
interface-tracking and interface-capturing methods 
that are critical in simulating the motion of inter- 
faces in multicomponent fluid flows. The differences 
between these various formulations lie in the 
representation and the reconstruction of interfaces. 
The advantages and disadvantages of the algorithms 
have been discussed. While there has been much 
progress on the development of robust multifluid 
solvers, there is much more work to be done. 
Promising future directions for research include the 
incorporation of adaptive mesh refinement into the 
algorithms and the development of efficient hybrid
schemes that combine the best features of individual
methods. 


See also: Breaking Water Waves; Capillary Surfaces; 
Fluid Mechanics: Numerical Methods; Incompressible 
Euler Equations: Mathematical Theory; Inviscid 
Flows; Non-Newtonian Fluids; Partial Differential 
Equations: Some Examples; Viscous Incompressible 
Fluids: Mathematical Theory; Vortex Dynamics.


Introduction


Intermittency has several meanings in turbulence. 
The oldest one, now most often labeled “external” 
or “large-scale” intermittency, refers to the coex- 
istence of turbulent and laminar regions in inho- 
mogeneous turbulent flows, such as in boundary 
layers or in free shear layers. In those cases, the 
interface between laminar irrotational flow and 
turbulent vortical fluid is typically sharp and 
corrugated. An observer sitting near the edge of 
the layer is immersed in turbulent fluid only part of 
the time. 

Zheng X, Lowengrub J, Anderson A, and Cristini V (2005)
Adaptive unstructured volume remeshing II. Application to
two- and three-dimensional level-set simulations of multiphase
flow. Journal of Computational Physics 208: 626-650.


Figure 1 Sketch of a turbulent boundary layer, and of the
associated intermittency factor. An observer such as A, at a
distance y from the wall, only sees turbulent flow for a fraction γ
of the time.


The intermittency coefficient γ measures the
fraction of turbulent fluid over the sampling
universe over which the statistics are taken. For
example, in a boundary layer such as that in
Figure 1, the intermittency coefficient as a function
of wall distance measures the fraction of turbulent
fluid at a given distance from the wall. External
intermittency is important in any attempt to model
realistic turbulent flows, which are almost always
inhomogeneous. Consider, for example, the classical
homogeneous relation in eqn [1] between the mean
kinetic energy K of the turbulent fluctuations and
the energy dissipation rate ε:


$$\epsilon = C \frac{K^{3/2}}{L}$$ [1]


where L is the length scale of the largest eddies, and
C ≈ 0.1 is an experimentally determined constant.
Such relations are often implicit in turbulence models,
and they have to be modified to account for
intermittency. Equation [1] only holds within the
turbulent regions, where the energy and the dissipation
rates are K_T and ε_T, while the overall mean
values used in the modeling conservation equations
are K = γK_T and ε = γε_T. The true overall relation
should therefore be


$$\epsilon = C \gamma^{-1/2} \frac{K^{3/2}}{L}$$ [2]


which may differ substantially from eqn [1],
especially near the edge of the layer. Experimental
values and rough theoretical estimates for the
distribution of the intermittency coefficient are
available for most practical turbulent flows.
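
The passage from the homogeneous relation [1] to the intermittent relation [2] can be retraced step by step. The Python sketch below applies eqn [1] inside the turbulent fraction and then averages over the whole sampling universe; the function names and the default C = 0.1 are illustrative choices based on the values quoted in the text.

```python
def dissipation_homogeneous(K, L, C=0.1):
    """Eqn [1]: dissipation rate implied by kinetic energy K and eddy size L."""
    return C * K ** 1.5 / L


def dissipation_intermittent(K, L, gamma, C=0.1):
    """Overall dissipation when only a fraction gamma of the fluid is turbulent."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    K_T = K / gamma                                # energy inside the turbulent fluid
    eps_T = dissipation_homogeneous(K_T, L, C)     # eqn [1] holds there
    return gamma * eps_T                           # overall mean dissipation


# With gamma = 0.25 the correction factor gamma**(-1/2) equals 2.
correction = dissipation_intermittent(1.0, 1.0, 0.25) / dissipation_homogeneous(1.0, 1.0)
```

The ratio of the two estimates is γ^(−1/2) whatever the values of K, L and C, which is why the discrepancy grows toward the edge of the layer, where γ is small.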


Internal Intermittency 


While the external intermittency just described is 
probably the most important one from the point of 
view of applications, it is not the most interesting 
from the theoretical point of view. Turbulence is a 
multiscale phenomenon which is inhomogeneous 
at all length scales, from the largest ones to the 
inner viscous cutoff (see Turbulence Theories). 
Moreover, this inhomogeneity goes beyond what 
could be expected just from the statistics of a 
random process. Consider, for example, the velocity
difference Δu between two points separated
by a distance r. The original Kolmogorov formulation
of the energy cascade assumes that the
probability density function (PDF), p(Δu), is a
universal function in the inertial range of scales,
whose only parameter is a velocity scale depending
on r. It then follows from Kolmogorov's analysis
that


$$p(\Delta u) = F\left[ \Delta u / (\langle \epsilon \rangle r)^{1/3} \right]$$ [3]


where ⟨ε⟩ is the average energy transfer rate across
scales per unit mass, and the average ⟨ ⟩ is taken
either over the whole flow or over a suitably designed
ensemble of experiments. In an equilibrium system,
global energy conservation implies that ⟨ε⟩ is equal to
the average viscous dissipation per unit mass:


$$\langle \epsilon \rangle = \nu \langle |\nabla u|^2 \rangle$$ [4]


In eqn [4], the kinematic viscosity of the fluid is ν, and
|∇u| is the L2-norm of the velocity gradient tensor.
Equation [3] is valid as long as the separation r is
much larger than the Kolmogorov viscous cutoff
η = (ν³/⟨ε⟩)^{1/4}, and much smaller than the integral
scale of the largest eddies L_ε = u'³/⟨ε⟩, where u' is the
root-mean-square value of the fluctuations of one
velocity component. The extent of this inertial range
is a function of the Reynolds number Re_L = u'L_ε/ν:


$$L_\epsilon / \eta = Re_L^{3/4}$$ [5]
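
Equation [5] follows from the definitions of the viscous cutoff and of the integral scale, and it is easy to evaluate. The Python sketch below computes the scale ratio from the Reynolds number and inverts the relation; the function names are illustrative, and the Reynolds number is assumed to be the one based on the integral scale and on the root-mean-square velocity fluctuation.

```python
def scale_ratio(reynolds):
    """Ratio of integral scale to Kolmogorov scale, eqn [5]: Re**(3/4)."""
    return reynolds ** 0.75


def reynolds_from_ratio(ratio):
    """Inverse of eqn [5]: Reynolds number needed for a given scale ratio."""
    return ratio ** (4.0 / 3.0)


# A flow with Re = 10**4 has about three decades between the two scales.
decades_example = scale_ratio(1.0e4)
```

The slow growth of the ratio is worth noting: each additional decade of scale separation costs a factor of about 21.5 in Reynolds number.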


The strict similarity hypothesis in eqn [3] is not well
satisfied by experiments. While the velocity distribution
at a given point is approximately Gaussian,
Figure 2a shows that the velocity increments become
increasingly non-Gaussian as the spatial separation
is made much smaller than L_ε. It was also soon
noted that the dependence of eqn [3] on a single
parameter such as ⟨ε⟩ was theoretically suspect, since
it is difficult to see how the PDFs of a whole set of
local properties, such as the Δu for different
intervals, could depend only on a single global
property. Kolmogorov himself sought to bypass that
difficulty by replacing eqn [3] with a "refined
similarity" hypothesis,


$$p(\Delta u) = F\left[ \Delta u / (\epsilon_r r)^{1/3} \right]$$ [6]


where ε_r is no longer a global average, but the mean
value of the dissipation over a ball of radius of order
r centered at the midpoint of the interval. This
refined similarity is better satisfied by experiments
(see Figure 2b), although, from the practical point of
view, it just transfers the problem of characterizing
Δu to that of characterizing the statistics of ε_r.

It has become customary to measure the behavior
of p(Δu) in terms of its structure functions,


$$S(n) = \int_{-\infty}^{\infty} \Delta u^n \, p(\Delta u) \, d\Delta u$$ [7]


which can be normalized as generalized flatness
factors,


$$\sigma(n) = S(n) / S(2)^{n/2}$$ [8]

It follows from the strict similarity hypothesis [3] that

$$S(n) \sim r^{n/3}$$ [9]


and that all the σ(n) should be independent of the
separation.
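
The definitions of the structure functions and of the generalized flatness factors can be illustrated on a distribution whose moments are known. The Python sketch below evaluates them by trapezoidal quadrature for a Gaussian PDF; the function names, the integration window and the number of samples are illustrative choices, and the order n is assumed to be a positive integer.

```python
import math


def gaussian_pdf(x, sigma=1.0):
    """Zero-mean Gaussian probability density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def structure_function(pdf, n, x_max=12.0, samples=24001):
    """Trapezoidal estimate of the integral of x**n * pdf(x) over [-x_max, x_max]."""
    h = 2.0 * x_max / (samples - 1)
    total = 0.0
    for i in range(samples):
        x = -x_max + i * h
        weight = 0.5 if i in (0, samples - 1) else 1.0
        total += weight * x ** n * pdf(x)
    return total * h


def flatness(pdf, n, **kwargs):
    """Generalized flatness factor S(n) / S(2)**(n/2)."""
    s_n = structure_function(pdf, n, **kwargs)
    s_2 = structure_function(pdf, 2, **kwargs)
    return s_n / s_2 ** (0.5 * n)
```

For the Gaussian the sketch returns a fourth-order flatness of 3 and a sixth-order flatness of 15, independently of the width of the distribution. That independence is the property that strict similarity would impose on the velocity differences at every separation.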

For example, the fourth-order flatness of a
Gaussian distribution is σ(4) = 3. Figure 3 shows
that this is not true. The flatness increases as the
separation decreases, and it only levels off at lengths
of the order of the Kolmogorov viscous scale. For
separations in that viscous range the flow is smooth,
Δu ≈ (∂_x u) r, and


$$\sigma(n) \approx \langle (\partial_x u)^n \rangle / \langle (\partial_x u)^2 \rangle^{n/2}$$ [10]


It follows from eqn [10] and from Figure 3 that the
velocity gradients become increasingly non-Gaussian
as L_ε and η separate at high Reynolds numbers. The
velocity differences across intervals which are large
with respect to η also become very non-Gaussian
when r ≪ L_ε.

Figure 2 PDFs of the differences of the velocity component in the direction of the separation (for separations in the inertial range of
scales). r/L_ε = 0.02-0.36, increasing by factors of 2; equivalent to r/η = 180-3000. Nominally isotropic turbulence at Reynolds
number Re, = 10?.. (a) Δu is normalized with the global energy dissipation rate ⟨ε⟩; distributions are wider as the separation decreases.
(b) Δu is scaled with the locally averaged dissipation over the separation interval. Data courtesy of H Willaime and P Tabeling.

Figure 3 Fourth-order flatness of the differences of the
velocity component in the direction of the separation, for
separations in the inertial range of scales, r/L_ε = 0.5 to
r/η = 2.. The Reynolds numbers of the different flows range
from Re, = 1800 to 108.. Data in part courtesy of H Willaime,
P Tabeling, and R A Antonia.

Because the velocity difference between two
points which are not too close to each other can be
expressed as the sum of velocity differences over
subintervals, a loose application of the central limit
theorem would suggest that its PDF should be
roughly Gaussian. The key conditions for that to
happen are that the summands should be mutually
independent, that their magnitudes should be comparable,
and that each of them has a probability
distribution with a finite variance. The first of those
three conditions is probably a good approximation
if the separation is much longer than the viscous
cutoff, but the second one depends on the structure
of the flow. The experimental non-Gaussian behavior
suggests the existence of occasional very strong
velocity jumps. In the viscous range of scales, those
structures have been identified both experimentally
and numerically as very strong linear vortices, in
whose neighborhoods the strongest gradients are
generated. An example of a tangle of such structures
is shown in Figure 4.

In another example, the vorticity in decaying 
two-dimensional turbulence concentrates very 
quickly into relatively few strong compact vortices, 
which are stable except when they interact with 
each other. The velocity field is dominated by them, 
and the flatness of the velocity increments reaches 
values of the order of σ(4) ≈ 50-100, even at
moderate Reynolds numbers. That case is interest- 
ing because something can be said about the 
probability distribution of the velocity gradients. 
We have noted that the PDF of a sum of mutually 
comparable independent random variables with 
finite variances tends to Gaussian when the number 
of summands is large. This well-known theorem is a 
particular case of a more general result about sums 
of random variables whose incomplete second 
moments diverge as 


$$m(s) = \int_{-s}^{s} x^2 p(x) \, dx \sim s^{2-\alpha} \quad \text{when } s \to \infty$$ [11]


When 0 < α < 2, the sums of such variables tend
to a family of "stable" distributions parametrized by
α. The Gaussian case is the limit of that family when
α → 2. In the case of two-dimensional vortices with
very small cores, the velocity gradients at a distance
R from the center of the vortex behave as 1/R². If
we take s in eqn [11] to be one of those velocity
derivatives, its probability distribution is proportional
to the area covered by gradients with a given
magnitude, and


$$m(s) \sim \int R^{-4} \, R \, dR \sim s^{1}$$ [12]


Figure 4 Intense vortex tangle in the logarithmic layer of a
turbulent channel. The vortex diameters are of the order of 10η,
and the size of the bounding box is of the order of the channel
width. Reproduced with permission of J C del Alamo.


The velocity derivatives at any point, which are
the sums of the velocity derivatives induced by all
the randomly distributed neighboring vortices,
should therefore be distributed according to the
stable distribution with α = 1, which is Cauchy's


$$p(s) = \frac{c}{\pi (c^2 + s^2)}$$ [13]


This distribution has no moments for n ≥ 1. Its
tails decay as s^{-2}, and the distribution of the
gradients essentially reflects the properties of the
closest vortex. In real two-dimensional turbulent
flows, the distribution [13] is followed fairly well,
but its extreme tails only reach to the maximum
values of the velocity gradient found within the
viscous vortex cores, which are not exactly point
vortices.
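
The link between the Cauchy distribution and the stable family of index 1 can be checked on its incomplete second moment, which is available in closed form. The Python sketch below evaluates it and measures its logarithmic slope; it relies on the standard result that the incomplete second moment of a stable law of index α grows like s^(2−α), and the function names are illustrative.

```python
import math


def cauchy_pdf(s, c=1.0):
    """Cauchy density c / (pi * (c**2 + s**2))."""
    return c / (math.pi * (c * c + s * s))


def incomplete_second_moment(s, c=1.0):
    """Closed form of the integral of x**2 * cauchy_pdf(x, c) over [-s, s]."""
    return (2.0 * c / math.pi) * (s - c * math.atan(s / c))


def local_exponent(s, c=1.0, factor=2.0):
    """Logarithmic slope of the incomplete second moment between s and factor * s."""
    m_low = incomplete_second_moment(s, c)
    m_high = incomplete_second_moment(factor * s, c)
    return math.log(m_high / m_low) / math.log(factor)
```

For large s the slope tends to one, which corresponds to index 1 in that family, while the unbounded growth of the moment itself reflects the absence of a finite variance.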


Other similar general results can be derived that 
link the behavior of the structure functions with the 
properties of the stable distributions corresponding 
to the type of flow singularities expected in the limit 
of infinite Reynolds number. 

The common feature of the two cases just 
described is the presence of strong structures that 
live for long times because viscosity stabilizes 
them. They are therefore more common than 
what could be expected on purely statistical 
grounds. They are responsible for the tails of the 
probability distributions of the velocity derivatives, 
but they are not the only intermittent features of 
turbulent flows. The increase of the flatness in
Figure 3 below r ≈ 50η is clearly connected with
the presence of the coherent vortices, but even for
larger separations there is a smooth evolution of
σ(4) that suggests that the formation of intense
structures is a gradual process that takes place 
across the inertial range. Much less is known 
about those hypothetical inertial structures than 
about the viscous ones. 

We can now recast the problem of intermittency 
in Navier-Stokes turbulence into geometric terms. 
The defining empirical observation for that system is 
that the energy dissipation given by eqn [4] does not 
vanish even in the infinite Reynolds number limit in 
which ν → 0. This means that the flow has to
become singular as |∇u|L_ε/u' ~ Re_L^{1/2}. The strict
similarity approximation assumes that those singu- 
larities are uniformly distributed across the flow, but 
the experimental evidence just discussed shows that 
this is not true. The singularities are distributed 
inhomogeneously, and the inhomogeneity develops 
across the inertial cascade. The problem of inter- 
mittency is to characterize the geometry of the 
support of the flow singularities in the limit of 
infinite Reynolds number. 

In the absence of detailed physical mechanisms 
for the dynamics of the inertial range, most 
intermittency models are based on plausible pro- 
cesses compatible with the invariances of the 
inviscid Euler equations. The precise power law 
given in eqn [9] for the structure functions depends 
on the strict similarity hypothesis [3], but the fact 
that it is a power law only depends on the scaling 
invariances of the equations of motion. The 
energies and sizes of the eddies in the inertial 
range are too small for the integral scales of the 
flow to be relevant, and too large for the viscosity 
to be important. They therefore have no intrinsic 
velocity or length scales. Under those conditions, 
any function of the velocity which depends on 
a length has to be a power. Consider a quantity 
with dimensions of velocity, such as u(r) =S(n)!/", 
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which is a function of a distance such as r. 
On dimensional grounds we should be able to 
write it as 


u(r) = UF(p) (14] 


where p=r/L, and L and U(L) are arbitrary length 
and velocity scales. The value of u(r) should not 
depend on the choice of units, and we can 
differentiate eqn [14] with respect to L to give 


Lu =(dU/dL)F(p) — UpL (dF/dp) 20 X [15] 
which can only be satisfied if 


dF À 

— = F ~ p° 6 
i (F>F ~ p [16] 
and ¢=L(dU/dL)/U is constant. This suggests 
generalizing eqn [9] to 


S(n)~rS™ [17] 


where the exponents are empirically adjusted. Only
ζ(3) = 1 can be derived directly from the Navier-Stokes
equations. Equation [17] implies that σ(n)
satisfies a power law with exponent ζ(n) − nζ(2)/2.
In Figure 3, for example, the flatness follows a
reasonably good power law outside the viscous
range, consistent with ζ(4) − 2ζ(2) ≈ −0.12. The
anomalous behavior near the viscous limit, and
similar limitations at the largest scales, mean that
only very high Reynolds number flows can be used
to measure the scaling exponents, and that the range
over which they are measured is never very large.
Moreover, the integrand of the higher-order structure
functions peaks at the extreme tails of the
probability distributions of the velocity differences,
which implies that very long experimental samples
have to be used to accumulate enough statistics to
measure the high-order exponents. For these and for
other reasons, the scaling exponents above n ≈ 8-10
are poorly known. This is unfortunate because we
will see later that some of the most interesting
intermittency properties of the velocity field, such as
the nature of the flow singularities in the infinite
Reynolds number limit, depend on the behavior of
the ζ(n) for large n.

Experimental values for the scaling exponents are 
given in Table 1. They are generally smaller than the 
ones predicted by the strict similarity approxima- 
tion, implying that the moments of the velocity 
differences decrease with the separation more slowly 
than they would if they were self-similar, and 
suggesting that new stronger structures become 
important as the scale decreases. 
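
The departure of the measured exponents from strict similarity can be tabulated directly. The Python sketch below uses the mean experimental values of Table 1 (uncertainties are left out) and compares them with the strict-similarity value n/3; the dictionary and function names are illustrative.

```python
# Mean experimental longitudinal exponents transcribed from Table 1.
EXPERIMENTAL_ZETA = {2: 0.70, 3: 1.00, 4: 1.30, 5: 1.56, 6: 1.79, 7: 1.99, 8: 2.22}


def strict_similarity_zeta(n):
    """Scaling exponent predicted by strict similarity: n / 3."""
    return n / 3.0


def anomaly(n):
    """Measured exponent minus the strict-similarity prediction."""
    return EXPERIMENTAL_ZETA[n] - strict_similarity_zeta(n)


def flatness_exponent(n):
    """Exponent of the flatness S(n) / S(2)**(n/2) when S(n) scales as r**zeta(n)."""
    return EXPERIMENTAL_ZETA[n] - 0.5 * n * EXPERIMENTAL_ZETA[2]
```

With these values the anomaly is negative for every order above three and grows in magnitude with the order. The fourth-order flatness exponent obtained from the table is −0.10, close to the value of about −0.12 quoted above for the data of Figure 3.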

Table 1 Longitudinal scaling exponents

Order   Experimental    Strict similarity
2       0.70 ± 0.01     0.667
3       1.00            1
4       1.30 ± 0.03     1.333
5       1.56 ± 0.04     1.667
6       1.79 ± ..63     2.000
7       1.99 ± 0.10     2.333
8       2.22 ± 0.05     2.667

The values in the second column are averages from different
experiments, and the standard deviations reflect scatter among
experiments. The third column is the value from the strict
similarity equation [9].


Note that we have included in the table values for
odd-order powers. Up to now we have not specified
which velocity component is being analyzed, but
most experiments refer to the one in the direction
of the separation. That is the easiest case to
measure, especially if time is used as a surrogate
for distance, and those PDFs are not symmetric
even in isotropic turbulence. Negative increments 
are more common than positive ones because of the 
extra energy required to stretch a vortex, and the 
effect is clearly visible in the distributions in 
Figure 2. Those longitudinal odd-order structure 
functions do not vanish, and their scaling expo- 
nents are the ones given in the table. The transverse 
structure functions are those in which the velocity 
component is normal to the separation, and their 
odd-order moments vanish by symmetry in iso- 
tropic turbulence. There has been a lot of discus- 
sion about whether the longitudinal scaling 
exponents of even orders differ from the transverse 
ones. Early results suggested that the latter are 
lower than the former, undermining the case for 
intermittency theories based on similarity argu- 
ments, and suggesting that a more mechanistic 
approach was needed. The present consensus 
seems to be that both sets of exponents are 
equal, but that there are residual effects of low 
Reynolds numbers and of flow anisotropy that are 
difficult to avoid experimentally. The question is 
still open. 


Multiplicative Models 


The most successful phenomenological models for
the geometry of intermittency are based on the
concept of a multiplicative cascade. Consider some
flow property v_k, such as the locally averaged
energy transfer rate by eddies of size r_k, which
cascades into smaller eddies of size r_{k+1}, which is
some fraction of r_k. Denote by p_k(v_k) the
probability distribution of the value of v at the
step k of the cascade.

Assume that the cascade is Markovian in the sense
that the probability distribution of v_{k+1} depends only
on its value in the previous step,


$$p_{k+1}(v_{k+1}) = \int p_t(v_{k+1} | v_k) \, p_k(v_k) \, dv_k$$ [18]


This is in contrast to some more complicated
functional dependence, such as on the values of v_k
in some extended spatial neighborhood, or on
several previous cascade stages. This assumption
intuitively implies that v_{k+1} evolves faster, or on a
smaller scale, than v_k, and that it is in some kind of
equilibrium with its precursor. If the cascade is
deterministic in that sense, v_k can be represented as
a product


$$v_k / v_0 = q_k q_{k-1} \cdots q_1$$ [19]


in which the factors q_k = v_k/v_{k−1} are statistically
independent of each other.

If the underlying process is invariant to scaling 
transformations, the transition probability density 
function has to have the form 


$$p_t(v_{k+1} | v_k) = v_k^{-1} \, w(v_{k+1}/v_k; k)$$ [20]


The multiplicative model works most naturally 
for positive variables, and we will assume that 
to be the case in the following, but most results 
can be generalized to arbitrary distributions. We 
will also assume for simplicity that all the 
cascade steps are equivalent, so that the distribu- 
tion w(q) of the multiplicative factors is indepen- 
dent of k, and depends only on our choice for 
r_{k+1}/r_k.

Local deterministic self-similar cascades lead
naturally to intermittent distributions, in the sense
that the high-order flatness factors for $v_k$ become
arbitrarily large as $k$ increases. It follows from eqns
[18]-[20] that the $n$th-order moment for $p_k$ can be
written as


$$S_k(n) = \int \xi^n\, p_k(\xi)\, \mathrm{d}\xi = S_0(n)\, S_w^k(n) \qquad [21]$$


where $S_w(n)$ is the $n$th-order moment of the
multiplicative factor $q$, and $n$ is any real number
for which the integral exists. If we define flatness
factors as in eqn [7], we can rewrite eqn [21] as


$$\sigma_k(n) = \sigma_0(n)\, \sigma_w^k(n) \qquad [22]$$

It follows from Chebyshev's inequality that


$$S(n) \ge S(n-2)\, S(2) \ge S(n-4)\, S^2(2) \ge \cdots \qquad [23]$$


from where

$$1 \le \sigma(4) \le \sigma(6) \le \cdots \qquad [24]$$


which is true for any distribution of positive
numbers. Equality only holds for trivial distributions
concentrated on a single value. The product in eqn
[22] therefore increases without bound with the
number of cascade steps, and the flatness factors
diverge.
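
The growth of the flatness with the number of cascade steps can be checked on a toy example. The Python sketch below assumes a two-valued multiplier ($q = 1/2$ or $3/2$ with equal probability, values chosen only for illustration) and assumes that the flatness factor of eqn [7] has the usual form $\sigma(n) = S(n)/S(2)^{n/2}$; the function names are hypothetical and none of these choices comes from the text.

```python
from fractions import Fraction

def moment(values, probs, n):
    """n-th moment S_w(n) of a discrete multiplier distribution."""
    return sum(p * q**n for q, p in zip(values, probs))

def flatness(values, probs, n):
    """Assumed flatness factor sigma_w(n) = S_w(n) / S_w(2)**(n/2), for even n."""
    return moment(values, probs, n) / moment(values, probs, 2) ** (n // 2)

def cascade_flatness(values, probs, n, k, sigma0=Fraction(1)):
    """Flatness after k equivalent cascade steps, as in eqn [22]."""
    return sigma0 * flatness(values, probs, n) ** k

# Illustrative two-valued multiplier (an assumption, not taken from the text).
values = [Fraction(1, 2), Fraction(3, 2)]
probs = [Fraction(1, 2), Fraction(1, 2)]

sigma_w4 = flatness(values, probs, 4)   # fourth-order flatness of one step
sigma_w6 = flatness(values, probs, 6)   # sixth-order flatness of one step
sigma_10 = cascade_flatness(values, probs, 4, 10)  # after ten steps
```

Because the one-step flatness exceeds 1 for any nontrivial multiplier distribution, raising it to the power $k$ makes the flatness grow geometrically with the number of steps.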

It is tempting to substitute k in [21] by a 
continuous variable, in which case the PDFs form 
a continuous semigroup generated by infinitesimal 
scaling steps. This leads to beautiful theoretical 
developments, but it is not necessarily a good idea 
from the physical point of view. For example, while 
it might be reasonable to assume that the properties 
of an eddy of size r depend only on those of the 
eddy of size 2r from which it derives, the same 
argument is weaker when applied to eddies of 
almost equal sizes. We will restrict ourselves here 
to the discrete case. 


Limiting Distributions 


The multiplicative process just described can be
summarized as a family of distributions $p_k(v_k)$ such
that the probability density for the product of two
variables is


$$p(v_{k_1} v_{k_2}) = p_{k_1+k_2}(v_{k_1+k_2}) \qquad [25]$$


and it is natural to ask whether there is a limiting 
distribution for large k. We know that, in the case of 
sums, rather than products, such distributions tend 
to be Gaussian under fairly general conditions, and 
the first attempt to analyze [25] was to reduce it to a 
sum by defining 


$$z = k^{-1} \log(v_k/v_0) \qquad [26]$$


The argument was that z would tend to a Gaussian 
distribution, and that the limiting distribution for $v_k$
would be lognormal. This was soon shown to be 
incorrect. The central part of the distribution 
approaches lognormality, but the tails do not, 
because the central limit theorem says nothing 
about their behavior. The family of lognormal 
distributions is a fixed point of eqn [25], but it is 
unstable, and it is only attained if the individual 
generating distributions are themselves lognormal. 
The lognormal distribution has moments 


$$S_w(n) = \exp(an + bn^2) \qquad [27]$$


which are conserved under [21], so that the product 
of lognormally distributed variables stays lognormal. 
The moments in eqn [27] are generated by the
recursive relation


$$Q_w(n) = \frac{S_w(n+3)\, S_w^3(n+1)}{S_w(n)\, S_w^3(n+2)} = 1 \qquad [28]$$


with suitable conditions for $n < 2$. Under [21],
$Q_k(n) = Q_w^k(n)$, and it is clear that only when all
the $Q_w(n)$ are exactly equal to 1 do they continue to
be so under multiplication. Otherwise, any $Q_w$
initially larger than 1 tends to infinity after enough
cascade steps, while any one initially smaller than 1
tends to 0. Only an exactly lognormal distribution
of the generating factors results in a lognormal
limiting distribution, and even small errors lead to
very different patterns of moments. This contrasts
with the situation for sums of random variables, in
which the Gaussian distribution is not only a fixed
point, but also has a very large basin of attraction.
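
The instability of the lognormal fixed point can be illustrated numerically. In the Python sketch below, the logarithm of the ratio $Q_w(n)$ is the third finite difference of $\log S_w(n)$, which vanishes for the quadratic exponent of eqn [27]. The comparison case, multipliers uniformly distributed on $[0, 1]$ with $S_w(n) = 1/(n+1)$, is an assumption made only for illustration, as are the function names and the parameter values.

```python
import math

def log_q_ratio(log_moment, n):
    """log Q_w(n): third finite difference of log S_w(n)."""
    return (log_moment(n + 3) - 3 * log_moment(n + 2)
            + 3 * log_moment(n + 1) - log_moment(n))

def lognormal_log_moment(n, a=0.1, b=0.05):
    """log S_w(n) = a n + b n^2, as in eqn [27]; a and b are arbitrary."""
    return a * n + b * n**2

def uniform_log_moment(n):
    """Multipliers uniform on [0, 1] (illustrative): S_w(n) = 1/(n+1)."""
    return -math.log(n + 1)

log_q_lognormal = log_q_ratio(lognormal_log_moment, 2)
log_q_uniform = log_q_ratio(uniform_log_moment, 2)

# After k equivalent steps the ratio becomes Q_w**k.
q_after_200_steps = math.exp(200 * log_q_uniform)
```

For the uniform multipliers the one-step ratio is slightly below 1, so its $k$th power drifts towards 0, while the lognormal ratio stays at 1 for every $k$.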


Multifractals 


The problem with using the transformation [26] to 
find the limiting distribution of a multiplicative 
process is not so much the technique of analyzing 
the statistics of products in terms of those of sums, 
but the inappropriate use of the central limit 
theorem. It can be bypassed by using instead the 
theory of large deviations of sums of random 
variables. The key result is obtained by expanding 
the characteristic function of $p_k$ when $k \gg 1$, and
states that


$$p_k(z) \approx \left(\frac{-k\,\phi''(z)}{2\pi}\right)^{1/2} \exp[k\,\phi(z)] \qquad [29]$$


where $z$ is defined as in [26] and $\phi$, which plays the
role of an entropy, is a smooth function of $z$. Primes
stand for derivatives with respect to $z$. Let us define
$z_n$ as the point where


$$\phi'(z_n) = -n \qquad [30]$$


which corresponds to the location of the maximum
of $\phi + nz$. The entropy $\phi$ can be computed from the
moments of the transition probability density. Using
Laplace's method to expand the $n$th moment of $p_k$,
we obtain


$$S_k(n) \sim \int e^{knz}\, p_k(z)\, \mathrm{d}z \approx \exp\{k[\phi(z_n) + n z_n]\} \qquad [31]$$

from where, using [21],

$$\lambda_n = \log S_w(n) = \phi(z_n) + n z_n \qquad [32]$$


The essence of Laplace’s approximation is that, for
$k \gg 1$, most of the contribution to the integral in
eqn [31] comes from the neighborhood of $z_n$, so that
it makes sense to consider each such neighborhood
as a separate “component” of the cascade.

The geometric interpretation of this classification 
into components as a multifractal was developed in 
the context of three-dimensional homogeneous 
turbulence. We have up to now assumed very little 
about the nature of each cascade step, but it is 
natural in turbulence to interpret it as the process in 
which eddies decay to a smaller geometric scale. The 
argument works for any variable for which scale 
similarity can be invoked, but we have seen that 
most experiments are done for the magnitude of the 
velocity increments across a distance $r$. If we
assume for simplicity that $r_k/r_{k+1} = e$, so that
$r_k/r_0 = \exp(-k)$, eqns [26] and [29] can be written as


$$v_k/v_0 = (r_k/r_0)^{-z}, \qquad p_k(z_n) \sim (r_k/r_0)^{-\phi(z_n)} \qquad [33]$$


The multifractal interpretation is that the “component”
indexed by $n$, whose velocity increments are
“singular” in terms of $r$ with exponent $z_n$, lies on a
fractal whose volume is proportional to its probability,
and which therefore has a dimension
$D(z_n) = 3 + \phi(z_n)$.

Note that eqn [32] implies that the scaling
exponents in eqn [17] can now be expressed as


$$\zeta(n) = -\log S_w(n) = -\lambda_n \qquad [34]$$


The scaling exponents, the multifractal spectrum, the
distribution $w(q)$ of the multipliers, and the limiting
distribution $p_k(v)$ are therefore equivalent descriptions:
they univocally determine each other. Note however that
different quantities have different scaling exponents.
For example, it follows from eqn [6] that, if the
scaling exponents for the local dissipation are
$\zeta_\varepsilon(n)$, the exponents for $\Delta u$ would be
$\zeta_{\Delta u}(n) = n/3 + \zeta_\varepsilon(n/3)$.

Some properties can be easily derived from the
previous discussion. If we assume, for example, that
the multiplicative factor $q$ is bounded above by $q_b$,
which is reasonable for many physical systems, eqn
[26] implies that $z_n \le \log q_b$. In fact, if the transition
probability behaves near $q_b$ as $w(q) \sim (q_b - q)^\beta$, the
scaling exponents tend to


$$\lambda_n = n \log q_b - (\beta + 1) \log n + O(1) \qquad [35]$$


for $n \gg 1$. In the case in which $w(q)$ has a
concentrated component at $q = q_b$, the $\log n$ term is
missing in eqn [35]. In all cases, the singularity
exponent of the set associated with $n \to \infty$ is
$z_\infty = \log q_b$, because the very high moments are
dominated by the largest possible multiplier. In the
case of a concentrated distribution the dimension of
this set approaches a finite limit, but otherwise


$$D(n) \approx -(\beta + 1) \log n \qquad [36]$$


which becomes infinitely negative. This should not 
be considered a flaw. The set of events which only 
happen at isolated points and at isolated instants has 
dimension $D = -1$ in three-dimensional space, and
those which only happen at isolated instants, and 
only under certain circumstances, have still lower 
negative dimensions. Sets with very negative dimen- 
sions are however extremely sparse, and are difficult 
to characterize experimentally. 
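
The asymptotic form [35] can be verified on a distribution whose moments are known in closed form. The Python sketch below assumes $w(q) = (\beta+1)(q_b - q)^\beta / q_b^{\beta+1}$ on $[0, q_b]$, for which $S_w(n) = q_b^n\, \Gamma(n+1)\Gamma(\beta+2)/\Gamma(n+\beta+2)$; this particular $w(q)$, the parameter values and the function names are illustrative assumptions, not taken from the text.

```python
import math

def log_moment(n, beta, q_b):
    """lambda_n = log S_w(n) for w(q) proportional to (q_b - q)**beta on [0, q_b]."""
    return (n * math.log(q_b) + math.lgamma(n + 1) + math.lgamma(beta + 2)
            - math.lgamma(n + beta + 2))

def leading_terms(n, beta, q_b):
    """The two leading terms of eqn [35]."""
    return n * math.log(q_b) - (beta + 1) * math.log(n)

beta, q_b = 1.0, 2.0
# The remainder should approach a constant, here log Gamma(beta + 2), as n grows.
remainder_small = log_moment(1e2, beta, q_b) - leading_terms(1e2, beta, q_b)
remainder_large = log_moment(1e6, beta, q_b) - leading_terms(1e6, beta, q_b)
```

For this family the $O(1)$ term of eqn [35] is $\log \Gamma(\beta + 2)$, which the remainder approaches as $n$ increases.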

The multifractal spectrum of the velocity differ- 
ences in three-dimensional Navier-Stokes turbulence 
has been measured for several flows in terms of the 
scaling exponents, and appears to be universal. The 
probability distribution w(q) of the multipliers has 
also been measured directly, and agrees well with 
the values implied by the exponents. It is also 
approximately independent of r, although not 
completely, perhaps due to the same experimental 
problems of anisotropy and limited Reynolds 
number which plague the measurement of the 
scaling exponents. There has been extensive theoretical
work on the consequences of imposing various
physical constraints on the multipliers, especially the
requirement that the average value of
the dissipation be conserved across each
cascade step. Several simple models have been
proposed for the transition distribution which 
approximate the experimental exponents well, but 
the relation lacks specificity. Models that are very 
different give very similar results, and it is impos- 
sible to choose among them using the available data. 

Multiplicative cascades and the resulting inter- 
mittency are not limited to Navier-Stokes turbu- 
lence. The equations of motion have only entered 
the discussion in this section through the assumption 
of scaling invariance. Multifractal models have in 
fact been proposed for many chaotic systems, from 
social sciences to economics, although the geometric 
interpretation is hard to justify in most of them. It is 
also important to realize that the fact that a given 
process can in principle be described as a cascade 
does not necessarily mean that such a description is 
a good one. Neither does a cascade imply a 
multiplicative process. For each particular case, we 
need to provide a dynamical mechanism that 
implements both the cascade and the transition 
multipliers. In three-dimensional Navier-Stokes
turbulence, the basic transport of energy to smaller 
scales and to higher gradients is vortex stretching. 
The differential strengthening and weakening of the 
vorticity under axial stretching and compression 
also provide a natural way of introducing the self- 
similar transition probabilities of the local dissipation. 

Examples of nonintermittent cascades abound. 
We have already mentioned that the vorticity in
decaying two-dimensional turbulence gets concentrated
into stable vortex cores which eventually
block the decay. The resulting enstrophy distribu- 
tion is highly intermittent, but it is not well 
described by a multifractal. Conversely, forced 
two-dimensional turbulence is dominated by an 
inverse energy cascade to larger scales, which is not 
intermittent. 

In addition, the intermittency of some systems is 
not a small-scale effect. Turbulent mixing of a 
passive scalar, which is the key process in 
turbulent heat transfer and in the atmospheric 
dispersion of pollutants, is an extremely intermit- 
tent phenomenon. The gradients of the scalar tend 
to be very localized, but they concentrate in sheets, 
narrow in thickness but otherwise extended. Some 
progress has recently been made on a simplified 
model due to Kraichnan for this problem, which is 
the linear stirring of a passive scalar by a random 
noise with delta correlation. Its statistics have been 
computed analytically, but the constraints of 
linearity and of uncorrelated forcing are strong, 
and the same methods do not appear to be 
extensible to mixing by real turbulence (see 
Lagrangian Dispersion (Passive Scalar)). Another 
problem in which intermittency is confined to 
large-scale surfaces is the motion of a three- 
dimensional pressureless gas, which has been used 
as a model for hypersonic turbulence and for the 
large-scale evolution of dark matter in the early 
universe. 

In summary, intermittency is a fascinating property
of many random systems, including three-dimensional
Navier-Stokes turbulence, which interferes, sometimes
strongly, with their description by simple cascade
models. Significant advances have been made in its
quantitative kinematic analysis. In some cases we also
have a qualitative understanding of its roots. But in very
few cases do we understand it well enough to make
quantitative predictions.


See also: Ergodic Theory; Incompressible Euler
Equations: Mathematical Theory; Lagrangian Dispersion
(Passive Scalar); Turbulence Theories; Vortex Dynamics;
Wavelets: Applications.


Intersection Theory


Introduction


Intersection theory is the theory that governs the
rigorous definition of intersections of cycles. This
can take place in a variety of mathematical contexts,
for instance, the intersections of two cycles on an
oriented manifold in algebraic topology, of two
currents on a differentiable manifold in differential
geometry, or of two subvarieties on a nonsingular
algebraic variety in algebraic geometry.

In algebraic geometry the theory is especially well
developed (Fulton 1998). A cycle on an algebraic
variety (or scheme) is a formal linear combination of
irreducible closed subvarieties. These are subject to
an equivalence relation called rational equivalence.
For every rational function on every subvariety, its
zero set is deemed rationally equivalent to its poles
(with appropriate multiplicities).

As an example, in the complex projective plane
$\mathbb{CP}^2$, any two lines are rationally equivalent since
the ratio of two linear forms will vanish on one line
and have a pole along the other. Similarly, a curve
of degree $d$ is rationally equivalent to $d$ lines. Any
two points in $\mathbb{CP}^2$ can be joined by a line (a copy of
$\mathbb{CP}^1$), and a rational function on $\mathbb{CP}^1$ can be chosen
to vanish at one point and have a pole at the other.
The groups of cycles modulo rational equivalence,
known as Chow groups, are


$\mathrm{CH}_2(\mathbb{CP}^2) \cong \mathbb{Z}$, generated by the fundamental class $[\mathbb{CP}^2]$

$\mathrm{CH}_1(\mathbb{CP}^2) \cong \mathbb{Z}$, generated by the class of a line

$\mathrm{CH}_0(\mathbb{CP}^2) \cong \mathbb{Z}$, generated by the class of a point


Two distinct lines $\ell_1$ and $\ell_2$ meeting at a point $p$ have
this point as their intersection-theoretic product:


$$[\ell_1] \cdot [\ell_2] = [p] \qquad [1]$$

Intersection theory must also provide a self-intersection
$[\ell_1] \cdot [\ell_1]$. Because $\ell_1$ and $\ell_2$ are rationally equivalent,
this must also be the class of a point, but symmetry
precludes the choice of a distinguished point on $\ell_1$.
Instead, $[\ell_1] \cdot [\ell_1]$ is declared to be the rational
equivalence class of a point on $\ell_1$, an element of
$\mathrm{CH}_0(\ell_1)$ rather than a specific cycle. This example
illustrates that intersections cannot generally be defined
on the level of cycles.


Algebraic Intersection Products 
Refined Intersections 


For a general nonsingular variety $X$, say of dimension
$m$, if $U$ and $V$ are subvarieties of $X$ of respective
dimensions $c$ and $d$, then there is a refined
intersection product


$$[U] \cdot [V] \in \mathrm{CH}_{c+d-m}(U \cap V) \qquad [2]$$


The traditional definition of the intersection
product is based on two ideas. First, given two 
cycles that intersect properly, which by definition 
means that no component of their intersection has 
codimension less than the sum of the codimensions 
of the given cycles, the intersection product should 
be a formal sum of these components, each with a 
multiplicity that correctly reflects the geometry of 
the intersection. Second, given two arbitrary cycles, 
it should be possible to replace one of them by a 
rationally equivalent cycle which intersects the other 
properly. 

While these ideas are simple, it took several 
decades for them to be carried out successfully. 
The case of curves on a surface meeting at a point 
was understood in the nineteenth century. General- 
izing the classically understood canonical divisor 
class on a variety, work in the 1930s by Severi, 
Todd, and others showed that there are groups of 
equivalence classes of cycles in which canonical 


invariants of higher degrees can be defined (in 
modern language, higher Chern classes of the 
tangent bundle). Weil’s foundations for algebraic 
geometry of the 1940s included a study of intersec- 
tions of cycles. It was not until the 1950s that the 
notion of Chow groups was formalized and inter- 
section theory was properly developed in this 
context. Chevalley, Chow, Samuel, Severi, and 
others contributed essential components of the 
theory. In an interesting parallel development, an 
intersection theory based on intersection multipli- 
cities in algebraic topology was put forth by 
Alexander and Lefschetz in the 1920s, a decade 
before the introduction of the cup product in 
cohomology. 


Deformation to the Normal Cone 


In the 1970s, Fulton and MacPherson established a 
construction of the intersection product in algebraic 
intersection theory that does not require moving 
cycles into general position. To accomplish this, they 
used an elegant geometric construction known as 
deformation to the normal cone. 

Let $i: X \to Y$ be an embedding of codimension $d$
of nonsingular varieties. Let $V$ be a subvariety of $Y$
of dimension $k$ whose intersection with $X$ is of
interest. We may view $X$ as the zero set of a section $s$
of some algebraic vector bundle $E$ on $Y$. By


$$(y, \lambda) \mapsto (\lambda^{-1} s(y), \lambda)$$


we have a map of the product of $Y$ with the
punctured affine line, $Y \times (\mathbb{A}^1 \setminus \{0\})$, into $E \times \mathbb{A}^1$.
We denote the closure of the image by $M^\circ_X Y$. An
alternative, more intrinsic description is in terms of
the blowup construction of algebraic geometry:


$$M^\circ_X Y = \mathrm{Bl}_{X \times \{0\}}(Y \times \mathbb{A}^1)$$


Geometrically, $M^\circ_X Y$ has a copy of $Y$ over each $\lambda \neq 0$
and a copy of the normal bundle $N_X Y$ over $\lambda = 0$. This
is the key construction that Fulton and MacPherson
make use of. The same construction applied to $V$, that
is, the closure of $V \times (\mathbb{A}^1 \setminus \{0\})$ in $M^\circ_X Y$, has over 0 a
sort of singular normal bundle known as the normal
cone


$$C_{X \cap V} V \subset N_X Y|_{X \cap V}$$


One of the properties of Chow groups is that they
are unchanged upon pullback to the total space of a
vector bundle (apart from the obvious dimension
shift). The refined intersection of $V$ with $X$, denoted
$i^![V]$, is defined to be the unique element of
$\mathrm{CH}_{k-d}(X \cap V)$ whose pullback to $N_X Y|_{X \cap V}$ is equal to
$[C_{X \cap V} V]$.


This single construction encompasses and interpolates
between two extreme cases of intersections:


$$i^![V] = [X \cap V] \quad \text{when } X \text{ and } V \text{ meet transversely} \qquad [3]$$


$$i^![V] = c_d(N_X Y) \cap [V] \quad \text{when } V \subset X \qquad [4]$$


Equation [3] makes reference to transverse intersection,
a notion that is stronger than proper
intersection. In situations when it applies, for
example, in eqn [1], it signifies that intersection
operations behave as one might expect. Equation
[4] includes the self-intersection formula which says
that $[X] \cdot [X]$ is equal to the top Chern class of
$N_X Y$.

With this construction, which is well documented 
in Fulton (1998), the general refined intersection in 
eqn [2] is obtained by reduction to the diagonal. Let
$\Delta_X$ denote the diagonal inclusion $X \to X \times X$ of the
nonsingular variety $X$. For subvarieties $U$ and $V$ of
$X$, we define


$$[U] \cdot [V] = \Delta_X^![U \times V] \qquad [5]$$


Equation [5] makes the Chow groups of $X$ into a
ring, the Chow ring $\mathrm{CH}^*(X)$, which is graded by
codimension by setting


$$\mathrm{CH}^k(X) = \mathrm{CH}_{m-k}(X)$$
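
As a small worked instance of this ring structure, the Chow ring of projective space can be modeled as a truncated polynomial ring. The Python sketch below relies on the standard fact, not stated above, that $\mathrm{CH}^*(\mathbb{CP}^m) \cong \mathbb{Z}[h]/(h^{m+1})$ with $h$ the class of a hyperplane; classes are stored as coefficient lists indexed by codimension, and the function names are hypothetical.

```python
def chow_product(a, b, m=2):
    """Multiply two classes in Z[h]/(h^(m+1)).

    a and b are lists of integer coefficients indexed by codimension;
    terms of codimension greater than m are discarded.
    """
    out = [0] * (m + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= m:
                out[i + j] += ai * bj
    return out

def curve_class(d):
    """Class of a plane curve of degree d: d times the class of a line."""
    return [0, d, 0]

line_squared = chow_product(curve_class(1), curve_class(1))
conic_times_cubic = chow_product(curve_class(2), curve_class(3))
```

The self-intersection of a line comes out as one times the class of a point, in agreement with the discussion of eqn [1], and a conic and a cubic meet in a class of degree 6, as Bezout's theorem predicts.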


Links with Topology 
Cycle Map to Homology 


For algebraic varieties over the complex numbers,
there is a cycle map which links the Chow groups
with a topological homology group. If $X$ is an
algebraic variety over $\mathbb{C}$, then let $H_*(X)$ denote the
Borel-Moore homology of $X$, that is, the homology
of locally finite singular chains on $X$ (viewed as a
topological space with the classical topology). If $X$ is
embedded as a closed subset of an oriented
differentiable manifold $M$, then there are
identifications


$$H_i(X) \cong H^{n-i}(M, M \setminus X) \qquad [6]$$


where $n$ is the dimension of $M$. There is a cycle class
map


$$\mathrm{CH}_k(X) \to H_{2k}(X)$$


which sends the class of each irreducible subvariety
$Z$ of dimension $k$ in $X$ to its fundamental class
$[Z] \in H_{2k}(X)$.

Let $M$ be an oriented differentiable manifold of
dimension $n$ and let $X$ and $Y$ be closed subsets of $M$.
Then the cup product $H^i(M, M \setminus X) \otimes H^j(M, M \setminus Y) \to H^{i+j}(M, M \setminus (X \cap Y))$
induces, via eqn [6], an
intersection product


$$H_i(X) \otimes H_j(Y) \to H_{i+j-n}(X \cap Y)$$


which is the topological analog of the refined 
intersection product of eqn [5]. The products are 
compatible via the cycle class map. The topology of 
complex algebraic varieties and the compatibilities 
between algebraic and topological intersections are 
discussed in Fulton (1998). An interesting applica- 
tion of this interplay of intersection theories is the 
convolution product in Borel-Moore homology, 
which is important in geometric representation 
theory (see Chriss and Ginzburg (1997)). 


Riemann-Roch Theorems 


The classical Riemann-Roch theorem relates the 
dimensions of linear systems on an algebraic curve 
(algebraic quantities) with their degrees and 
the curve's genus (topological quantities). The 
Hirzebruch-Riemann-Roch theorem states that on
a nonsingular projective variety $X$, if $E$ is an
algebraic vector bundle on $X$ and $\chi(E)$ denotes its
Euler characteristic (the alternating sum of the ranks
of the sheaf-theoretic cohomology groups), then


$$\chi(E) = \int_X \mathrm{ch}(E) \cdot \mathrm{td}(T_X) \qquad [7]$$


where $\int_X$ denotes the degree of the zero-dimensional
component of the quantity that follows, and the Chern
character $\mathrm{ch}(E)$ and Todd class $\mathrm{td}(T_X)$ are certain
standard universal polynomials of Chern classes.
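
Equation [7] can be checked by hand in the simplest case, a line bundle on projective space. The Python sketch below uses standard facts that are not stated above and should be read as assumptions of the example: on $\mathbb{CP}^n$ with hyperplane class $h$, $\mathrm{ch}(\mathcal{O}(d)) = e^{dh}$, $\mathrm{td}(T_{\mathbb{CP}^n}) = (h/(1 - e^{-h}))^{n+1}$, and the degree of $h^n$ is 1. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def mul(a, b, n):
    """Product of two power series in h, truncated after h**n."""
    out = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= n:
                out[i + j] += ai * bj
    return out

def inverse(a, n):
    """Multiplicative inverse of a power series with a[0] != 0, truncated."""
    inv = [Fraction(0)] * (n + 1)
    inv[0] = 1 / a[0]
    for k in range(1, n + 1):
        inv[k] = -sum(a[i] * inv[k - i] for i in range(1, k + 1)) / a[0]
    return inv

def euler_characteristic(n, d):
    """Right-hand side of eqn [7] for the line bundle O(d) on CP^n."""
    # Chern character exp(d h).
    ch = [Fraction(d) ** k / factorial(k) for k in range(n + 1)]
    # (1 - exp(-h)) / h = sum_k (-1)^k h^k / (k+1)!
    g = [Fraction((-1) ** k, factorial(k + 1)) for k in range(n + 1)]
    todd_factor = inverse(g, n)          # h / (1 - exp(-h))
    td = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(n + 1):               # raise to the power n + 1
        td = mul(td, todd_factor, n)
    # The integral picks out the coefficient of h**n.
    return mul(ch, td, n)[n]

chi_plane_cubics = euler_characteristic(2, 3)
```

For $d \ge 0$ the result is the binomial coefficient $\binom{n+d}{n}$, the number of monomials of degree $d$ in $n+1$ variables, which is the dimension of the space of sections of $\mathcal{O}(d)$.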

Grothendieck had the inspired idea that eqn [7]
could be generalized to a covariance property for the
Chern character times the Todd class. If $X$ and $Y$ are
nonsingular varieties and $f: X \to Y$ is a projective
morphism (or, more generally, a proper morphism),
then there is a well-defined push-forward $f_*$ on
Chow groups. There is also a kind of push-forward
for vector bundles. The Grothendieck group of
vector bundles on $X$, denoted $K^0(X)$, is the group
of formal linear combinations of vector bundles,
modulo the relations $[E] = [E'] + [E'']$ whenever $E'$ is
a sub-bundle of $E$ with quotient bundle $E''$. Every
coherent sheaf $\mathcal{F}$ has a well-defined class in $K^0(X)$,
namely, the alternating sum of $[E_i]$ where $E_\bullet$ is any
finite resolution of $\mathcal{F}$ by vector bundles (locally free
sheaves). The push-forward $f_*[E]$ is defined as the
alternating sum of the classes in $K^0(Y)$ of the higher
direct images $R^i f_* E$. The Grothendieck-Riemann-Roch
theorem states that


$$\mathrm{ch}(f_*[E]) \cdot \mathrm{td}(T_Y) = f_*(\mathrm{ch}(E) \cdot \mathrm{td}(T_X)) \qquad [8]$$

in $\mathrm{CH}_*(Y) \otimes \mathbb{Q}$. Notice that eqn [7] represents the
case that $Y$ is a point.

There is an even more general formulation valid
for singular varieties. It is necessary to work with a
homology version of the Grothendieck group,
namely, the Grothendieck group $K_0(X)$ of coherent
sheaves on $X$. The Baum-Fulton-MacPherson version
of the Grothendieck-Riemann-Roch theorem
prescribes transformations


$$\tau_X : K_0(X) \to \mathrm{CH}_*(X) \otimes \mathbb{Q} \qquad [9]$$


which are covariant for proper morphisms. When
$X$ is nonsingular, $\tau_X$ is given by the “Chern
character” times the “Todd class”, and covariance
becomes eqn [8].

In the case of varieties over the complex numbers,
there is also a transformation from the algebraic
Grothendieck group $K_0(X)$ to a topological analog,
satisfying various compatibilities. The composition
with the homology Chern character gives Riemann-Roch
transformations $K_0(X) \to H_*(X; \mathbb{Q})$ satisfying
properties akin to those of eqn [9].


The Analytic Setting 


The Atiyah-Singer index theorem stands as 
an important generalization of the Hirzebruch- 
Riemann-Roch theorem. The index of an elliptic 
differential operator on a differentiable manifold 
plays the role of the Euler characteristic, and is 
equated with a topological quantity. One of the 
consequences of the index theorem is the validity of 
eqn [7] for general compact complex manifolds. 

More in the domain of pure analysis is the 
question of intersecting two currents on a differenti- 
able manifold. Currents arise naturally out of 
Chern-Weil theory. To each current is associated a 
wave front, a subset of the cotangent bundle that 
reflects the geometry of the singular set of the 
current. A current can be pulled back to an
embedded submanifold whenever the embedding is 
transverse to the wave front. By reduction to the 
diagonal, this gives an intersection of two currents 
with transverse wave fronts which reduces to the 
usual wedge product in the case of smooth differential
forms (see Hörmander (1990)).


Applications of Intersection Theory 
Enumerative Geometry 


Intersection theory has proved to be a useful tool in 
diverse areas such as enumerative geometry, singular- 
ity theory, and moduli problems. Enumerative pro- 
blems have intrigued generations of geometers. 
Chasles, Maillard, Schubert, and Zeuthen are among 
the geometers of the second half of the nineteenth 


century who solved an impressive array of problems, 
including, as a notable example, Steiner's five conics 
problem to determine the number of plane conics 
tangent to five given conics in general position. 

In modern terms, the successful solution to an 
enumerative problem involves setting up a space which 
parametrizes the geometric objects being counted, 
suitably compactified, and carrying out an intersec- 
tion-theoretic computation on this space. Steiner's 
problem illustrates how “excess intersection” can 
occur and cause difficulty. Inside the $\mathbb{CP}^5$ of plane
conics, including degenerate conics, those tangent to a
given conic constitute a sextic hypersurface. So
$6^5 = 7776$ would appear plausible; this was, in fact,
the originally proposed solution. However, the most 
degenerate conics, the double lines, all appear as limits 
of families of conics tangent to any given conic. The 
refined intersection of five of these sextics has a cycle 
class of degree 4512 supported on the Veronese 
surface of double lines. This leaves 3264, the correct 
answer given by Chasles in 1864. The issue of 
providing rigorous foundations for these kinds of 
calculations was recognized by Hilbert, who set it as 
the 15th of his 23 major mathematical problems 
outlined in 1900. A good survey of early and modern 
efforts in enumerative geometry can be found in 
Kleiman and Thorup (1987). 


Singularity Theory and Degeneracy Loci 


In any situation where a geometric object is
described by parameters, there will be values of the 
parameter at which the geometry changes qualita- 
tively. The significance of this is evident in the space 
of conics above. Singularity theory is concerned with 
the loci in parameter spaces on which these 
transitions can occur. Let $\pi: Y \to P$ be a map of
differential manifolds, or of nonsingular algebraic 
varieties, which is generally (but not everywhere) 
submersive, so that there are singular fibers. Let d 
denote the dimension of P, which can be considered 
as a parameter space, and let c be the dimension of 
Y. Consider the loci 


$$S_k(\pi) = \{\, y \in Y \mid \operatorname{rk}(T_y Y \to T_{\pi(y)} P) \le d - k \,\}$$


of singularity theory. Thom made an influential
study of these in the 1950s, and Porteous in 1971
gave the following formula, now called the Thom-Porteous
formula:


$$[S_k(\pi)] = s_{(k+c-d)^k}(\pi^* T_P - T_Y) \qquad [10]$$


The symbol on the right is shorthand for
$s_{(k+c-d, \ldots, k+c-d)}$, the case $a_1 = \cdots = a_k = k + c - d$ of
the Schur determinant $s_{(a_1, \ldots, a_k)} = \det(s_{a_i + j - i})_{1 \le i, j \le k}$,
and for vector bundles $E$ and $F$ the $s_i(F - E)$ are
defined by the formula $s(F - E) = \sum_i (-1)^i c_i(E) / \sum_i (-1)^i c_i(F)$.
In algebraic intersection theory, eqn
[10] has the precise meaning that when $S_k(\pi)$ has the
expected codimension $k(k + c - d)$ in $Y$ (or is empty),
its cycle class is equal to the given polynomial in 
Chern classes. The Thom-Porteous formula applies 
to the degeneracy loci of arbitrary maps of vector 
bundles E — F. Degeneracy loci constitute an active 
area of research in intersection theory, and there are 
generalizations, for example, to cases where there 
are more bundles or bundle maps with symmetry 
(see Fulton and Pragacz (1998)). 


Moduli Spaces 


The parameter spaces that have appeared often admit 
interpretations as moduli spaces. Moduli problems 
start with geometric objects to be classified, and ask for 
families of these objects over an arbitrary base space to 
be represented as faithfully as possible by maps from 
the base space to some space called a moduli space. For 
enumerative applications it is most useful for the 
moduli space to be compact. One of the principal 
examples is the moduli of algebraic curves of given 
genus $g$: for $g \ge 2$, the moduli space of smooth curves
$M_g$ has a compactification $\overline{M}_g$ by stable curves, as
defined and studied by Deligne and Mumford. While
the $\overline{M}_g$ are singular, the singularities are mild enough
to permit the definition of an intersection theory for
$M_g$ and $\overline{M}_g$, as was done by Mumford in the 1980s.
More generally, if $X$ is a complex projective variety,
Kontsevich's spaces of stable maps $\overline{M}_{g,n}(X, \beta)$ compactify
the moduli of genus $g$ curves with $n$ marked
points together with algebraic maps to $X$ having image
in homology class $\beta \in H_2(X)$. These spaces, and some
high-powered intersection theory that takes place on
them, are vitally important in Gromov-Witten theory.
K-theory also provides an alternative approach to 
intersection products in algebraic geometry. 


Extensions and Related Theories 
Motives and Higher Chow Groups 


Intersection theory has evolved into a mature theory 
with numerous extensions and offshoots. Many of 
these are a result of endeavors to forge links with 
other branches of mathematics. One of the extensions,
higher Chow groups, has its roots in a basic
property of intersection theory, the excision property,
which states that if $X$ is a variety and $U \subset X$ an
open subvariety, with $Z = X \setminus U$, then the inclusion
and restriction maps fit into a right exact sequence


$$\mathrm{CH}_k Z \to \mathrm{CH}_k X \to \mathrm{CH}_k U \to 0$$


This is reminiscent of the long exact homology 
sequence of a pair in algebraic topology. Indeed,
there is a corresponding long exact sequence of
Borel-Moore homology groups, but the elementary
algebraic theory lacks such a long exact sequence.
Bloch introduced higher Chow groups in the 1980s
to fill this gap. The theory, which is quite
complicated, provides groups $\mathrm{CH}_k(X, j)$ with
$\mathrm{CH}_k(X, 0) = \mathrm{CH}_k X$, such that there is a long exact
sequence

$$\cdots \to \mathrm{CH}_k(U, j+1) \to \mathrm{CH}_k(Z, j) \to \mathrm{CH}_k(X, j) \to \mathrm{CH}_k(U, j) \to \cdots$$

These groups are closely connected to algebraic K- 
theory and also to a related theory called motivic 
cohomology. 

Motives, a sort of universal cohomology theory 
envisaged by Grothendieck, conjecturally form a 
category which can be extended to a bigger category 
of mixed motives that reflects mixed structures in 
cohomology, such as mixed Hodge structures. 
Recently, Voevodsky et al. (2000) have introduced 
motivic cohomology groups which form an integral 
part of a homotopy theory for algebraic varieties. 
Voevodsky’s work, including a proof of the Milnor 
conjecture of K-theory, earned him a Fields Medal 
in 2002. 


Arithmetic Intersection Theory 


There is an arithmetic version of intersection theory
which applies to an arithmetic scheme $X$, which is,
informally, a scheme defined over every prime field
(all finite fields $\mathbb{F}_p$ and also $\mathbb{Q}$) in a consistent way.
This means that $X$ can be base-extended to any
field. In situations where the complex variety $X(\mathbb{C})$
is nonsingular, there is an arithmetic Chow ring
$\widehat{\mathrm{CH}}^*(X)$, introduced by Gillet and Soulé in 1990.
Elements of $\widehat{\mathrm{CH}}^*(X)$ are equivalence classes of pairs
$(Z, g)$ where $Z$ is an algebraic cycle on $X$ and $g$ is
known as a Green current for $Z$, a current on $X(\mathbb{C})$
satisfying the relation


$$dd^c g + \delta_{Z(\mathbb{C})} = \omega \qquad [11]$$


for some smooth differential form $\omega$ satisfying some
conditions. Here, $\delta_{Z(\mathbb{C})}$ denotes the current of
integration along $Z(\mathbb{C})$. The point to notice is that
eqn [11] relates analysis (the Green current) and
algebra (the cycle) on $X$ on one side with topology
on the other, as $\omega$ will be a closed form whose class
in de Rham cohomology is Poincaré dual to $[Z(\mathbb{C})]$.

Arithmetic intersection theory is used to define 
arithmetic height functions. Height functions have 
important applications to Diophantine problems, and 
were an essential component of the proof by Faltings of 
the Mordell conjecture, which earned him a Fields 
Medal in 1986. Arithmetic intersection theory grew
out of an earlier theory of Arakelov, in which $X(\mathbb{C})$ is
endowed with a Kähler metric, and the form $\omega$ in eqn
[11] is required to be harmonic. The Arakelov Chow
group is only a ring when harmonic forms are closed 
under wedge product, which is not the case generally 
but which is true in some interesting cases, for example, 
for Grassmannian varieties. Arakelov treated the case 
of arithmetic surfaces, that is, the case when X(C) is an 
algebraic curve (*surface" refers to a second dimension 
in the arithmetic direction), and introduced a pairing of 
arithmetic divisors, in analogy with the usual pairing of 
divisors on an algebraic surface. Arakelov's work, its 
subsequent generalizations, and more recent develop- 
ments are covered in Faltings (1992). 


Equivariant Theories and Stacks 


Moduli problems such as those mentioned previously 
are often best represented not by traditional varieties, 
but by a more sophisticated sort of object called a 
stack. Taking inspiration from Mumford’s intersection 
theory on $M_g$, intersection theory on algebraic 
stacks has grown into a mature theory in its own 
right. Examples of stacks include orbifolds, for which 
there is the Chen-Ruan (orbifold) cohomology theory 
as well as an algebraic analog due to Abramovich, 
Graber, and Vistoli (see Abramovich, et al. (2002)). 
Another class of examples are quotient stacks of a 
variety by the action of an algebraic group. In these 
cases the Chow groups of the stack are equivariant 
Chow groups, part of a rich theory modeled on 
equivariant cohomology in algebraic topology. 
Behrend (2002) provides a nice survey of stacks, 
equivariant intersection theory, and their uses in 
Gromov-Witten theory. The Bott residue formula is 
an important tool in equivariant intersection theory 
which is particularly well suited to making concrete 
calculations, for example, in enumerative geometry. 
A description with nice examples can be found in 
Ellingsrud and Strømme (1996). 


See also: Cohomology Theories; Hamiltonian Group 
Actions; Index Theorems; K-Theory; Moduli Spaces: 
An Introduction. 


Formulation of the Problem 


Consider the Newton equation 


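$$\ddot{x} = F(x), \quad F(x) = -\nabla v(x), \quad x \in \mathbb{R}^d$$   [1] 

where 

$$v \in C^2(\mathbb{R}^d, \mathbb{R}), \qquad |\partial_x^j v(x)| \le c_{|j|}\,(1 + |x|)^{-(\alpha + |j|)}$$   [2] 

for $x \in \mathbb{R}^d$, $|j| \le 2$, and some $\alpha > 1$, $c_{|j|} > 0$ 
(where j is the multi-index $j \in (\mathbb{N} \cup \{0\})^d$, 
$|j| = \sum_{n=1}^{d} j_n$). In classical mechanics, eqn [1] describes 
the dynamics of a particle with the mass m = 1 in the 
force field F with the potential v. For eqn [1] the 
energy $E = \frac{1}{2}|\dot{x}(t)|^2 + v(x(t))$ is an integral of 
motion. 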

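Under the assumptions [2], it follows that (Reed 
and Simon 1979): for any $(p_-, x_-) \in \mathbb{R}^d \times \mathbb{R}^d$, $p_- \ne 0$, 
eqn [1] has a unique solution $x \in C^2(\mathbb{R}, \mathbb{R}^d)$ such 
that 

$$x(t) = p_- t + x_- + y_-(t)$$   [3] 

where $\dot{y}_-(t) \to 0$, $y_-(t) \to 0$ as $t \to -\infty$; 
in addition, for almost any $(p_-, x_-)$ 

$$x(t) = a(p_-, x_-)\,t + b(p_-, x_-) + y_+(t)$$   [4] 

where $a(p_-, x_-) \ne 0$, $\dot{y}_+(t) \to 0$, $y_+(t) \to 0$ as $t \to +\infty$; 
furthermore, the set D of all $(p_-, x_-) \in \mathbb{R}^d \times \mathbb{R}^d$, $p_- \ne 0$, 
for which [4] holds for fixed v, is an open subset of 
$\mathbb{R}^d \times \mathbb{R}^d$ and $\mathrm{Mes}((\mathbb{R}^d \times \mathbb{R}^d) \setminus D) = 0$. 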

We say that a, b arising in [4] (and defined on D) are 
the scattering data for eqn [1]. In addition, the scattering 
data a, b at fixed energy E > 0 means a, b on 
$\{(p_-, x_-) \in D \mid |p_-|^2/2 = E\}$. Roughly speaking, for a particle moving 
according to [1], the functions a, b relate the free motion 
at time $t \to -\infty$ with the free motion at time $t \to +\infty$. 

Note that 


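$$a(p_-, x_- + t_0 p_-) = a(p_-, x_-)$$ 
$$b(p_-, x_- + t_0 p_-) = b(p_-, x_-) + t_0\, a(p_-, x_-)$$   [5] 
$$(p_-, x_-) \in D, \quad t_0 \in \mathbb{R}$$ 


Formulas [5] imply that a, b on D are uniquely 
determined by a, b on $\{(p_-, x_-) \in D \mid p_- \cdot x_- = 0\}$, 
where $p_- \cdot x_-$ is the scalar product of $p_-$ and $x_-$. 

If $v(x) \equiv 0$, then $a(p_-, x_-) = p_-$, $b(p_-, x_-) = x_-$, 
$(p_-, x_-) \in \mathbb{R}^d \times \mathbb{R}^d$, $p_- \ne 0$. Therefore, it is convenient 
to use for a, b the following representation: 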


$$a(p_-, x_-) = p_- + a_{sc}(p_-, x_-)$$ 
$$b(p_-, x_-) = x_- + b_{sc}(p_-, x_-), \quad (p_-, x_-) \in D$$   [6] 


where the subscript sc is an abbreviation of the word 
“scattering.” 

The direct scattering problem for eqn [1], under 
the assumptions [2], consists in the following: given 
v, find a, b. 

The inverse-scattering problem for eqn [1], under 
the assumptions [2], consists in the following: given 
a,b (or some partial information about a, b), find v. 

In the present article, we discuss, mainly, the 
aforementioned inverse-scattering problem. 


Abel's Result of 1826 


Consider the Newton equation [1] in dimension 
d = 1 for $x \in\, ]-\infty, x_1]$, $x_1 > 0$, where 

$$v \in C^2(]-\infty, x_1], \mathbb{R}), \qquad v(x) = 0 \ \text{for } x \le 0, \qquad \frac{dv(x)}{dx} > 0 \ \text{for } 0 < x \le x_1$$   [7] 

Under the assumptions [7], for any $p_- > 0$, where 
$E = p_-^2/2 < v(x_1)$, eqn [1] has a unique solution 
$x \in C^2(\mathbb{R}, ]-\infty, x_1])$ such that 

$$x(t) = p_- t \quad \text{for } t \le 0$$   [8] 

in addition, 

$$x(t) = -p_- t + b(p_-) \quad \text{as } t \to +\infty$$   [9] 

Let 

$$T(E) = \frac{b(\sqrt{2E})}{\sqrt{2E}}, \qquad 0 < E < v(x_1), \quad \sqrt{2E} > 0$$   [10] 

(T(E) is the time during which a particle starting 
at x = 0 with the impulse $p_- = \sqrt{2E}$ returns to 
x = 0). 

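Let $x(v)$, $v \in [0, v(x_1)]$, be the inverse function to 
$v(x)$, $x \in [0, x_1]$. Then (under the assumptions [7]), 

$$T(E) = \sqrt{2} \int_0^E (E - v)^{-1/2}\, \frac{dx(v)}{dv}\, dv, \qquad 0 < E < v(x_1)$$   [11] 

and 

$$x(v) = \frac{1}{\sqrt{2}\,\pi} \int_0^v (v - E)^{-1/2}\, T(E)\, dE, \qquad 0 < v < v(x_1)$$   [12] 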


Actually, the formulas [11], [12] relating the travel 
time T and the potential v are the results from 
Abel (1826) (see also Keller (1976) for a discus- 
sion of this result). Formula [11] is a result on 
direct scattering, whereas [12] is a result on 
inverse scattering. In addition, if $T(E)$, $0 < E < v(x_1)$, 
is given, then [11] is the Abel integral 
equation for $x(v)$, $0 < v < v(x_1)$, and [12] solves 
this equation. 
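
The pair [11], [12] can be checked numerically. The following is a minimal sketch, not part of Abel's or Keller's presentation: it assumes the sample potential v(x) = x^2 for x > 0 (so that x(v) = sqrt(v)), and the helper names `travel_time` and `recover_x` are hypothetical. The substitution v = E sin^2(theta) removes the square-root singularity at the endpoint, and a midpoint rule is used for the quadrature.

```python
import math

def travel_time(dx_dv, E, n=200):
    """T(E) from formula [11], with v = E*sin(theta)**2 to remove the endpoint singularity."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h                      # midpoint rule, never hits theta = 0
        v = E * math.sin(theta) ** 2
        total += dx_dv(v) * 2.0 * math.sqrt(E) * math.sin(theta)
    return math.sqrt(2.0) * total * h

def recover_x(T, v, n=200):
    """x(v) from formula [12], with E = v*sin(theta)**2 as integration variable."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        E = v * math.sin(theta) ** 2
        total += T(E) * 2.0 * math.sqrt(v) * math.sin(theta)
    return total * h / (math.sqrt(2.0) * math.pi)

# Assumed sample potential: v(x) = x**2 for x > 0, hence x(v) = sqrt(v).
dx_dv = lambda v: 0.5 / math.sqrt(v)
T = lambda E: travel_time(dx_dv, E)

print(T(1.0), math.pi / math.sqrt(2.0))   # direct problem: travel time
print(recover_x(T, 0.49))                 # inverse problem: should be close to 0.7
```

For this potential the travel time does not depend on E, as expected for a half-oscillation in a quadratic well, and applying [12] to the computed T returns the turning point sqrt(v).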

Concerning further results on inverse scattering 
for the one-dimensional Newton equation, see Keller 
(1976) and Astaburuaga et al. (1991). Note that for 
the one-dimensional case the scattering data a, b do 
not in general determine v uniquely. 

The Abel integral equation and the Abel 
formula solving this equation were used also, in 
particular, by Firsov (1953) and Keller et al. 
(1956), where inverse scattering was considered 
for the three-dimensional Newton equation at 
fixed energy for the case of a spherically symmetric 
potential that is monotonically decreasing in |x|. 

Note also that the Abel method for solving the 
integral equation [11] was used by Radon (1917) 
for finding the inversion formula for the Radon 
transformation. In the next section, we reduce the 
inverse-scattering problem for the Newton equation 
[1] in dimension $d \ge 2$, under the assumptions 
[2], to the inversion problem for the X-ray 
transformation (i.e., the Radon transformation 
along straight lines). 


Inverse Scattering for the 
Multidimensional Newton Equation 


Consider 

$$T\mathbb{S}^{d-1} = \{(\theta, x) \in \mathbb{S}^{d-1} \times \mathbb{R}^d \mid \theta \cdot x = 0\}$$   [13] 

Consider the X-ray transformation P defined by the 
formula 

$$Pf(\theta, x) = \int_{-\infty}^{+\infty} f(t\theta + x)\, dt, \qquad (\theta, x) \in T\mathbb{S}^{d-1}$$   [14] 

where 

$$f \in C(\mathbb{R}^d, \mathbb{R}^m), \qquad f(x) = O(|x|^{-\beta}) \ \text{as } |x| \to \infty \ \text{for some } \beta > 1$$   [15] 

Consider the functions $a_{sc}$, $b_{sc}$ of [6]. 


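Theorem 1 (Novikov 1999). For the Newton 
equation [1], under the assumptions [2], the following 
formulas hold: 

$$PF(\theta, x) = \lim_{s \to +\infty} s\, a_{sc}(s\theta, x), \qquad (\theta, x) \in T\mathbb{S}^{d-1}$$   [16] 

$$-Pv(\theta, x) = \lim_{s \to +\infty} s^2\, \theta \cdot b_{sc}(s\theta, x), \qquad (\theta, x) \in T\mathbb{S}^{d-1}$$   [17] 

in addition, 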


|PF(0, x) — sas (s0, x)| 
d3 23244 3 


afar D VE GIA 
|Pv(0, x) — s*Ob,.(s0, x)| 
q3 232044 E 49 


C A 
~ ala — 1)*(1 + |x|/ V2) (s/ V2 — 1) 


for $(\theta, x) \in T\mathbb{S}^{d-1}$, $s \ge z(d, c, \alpha, |x|)$, where $\theta \cdot b_{sc}$ is the 
scalar product of $\theta$ and $b_{sc}$, z is the root of the equation 


d? c2? z2 
(a —1)0 + [xl/ V2) (/v2 -1 
$z \in\, ]\sqrt{2}, +\infty[$   [20] 


$c = \max(c_1, c_2)$ (and $\alpha$, $c_1$, $c_2$ are the constants of [2]). 


Theorem 1 gives a method for finding PF and Pv 
from $a_{sc}$ and $b_{sc}$ at high energies. It has been proved in 
Novikov (1999) by means of analysis of the following 
nonlinear integral equation for the function $y_-$ of [3]: 

$$y_-(t) = A_{p_-, x_-}(y_-)(t)$$ 

$$A_{p_-, x_-}(u)(t) = \int_{-\infty}^{t} \int_{-\infty}^{\tau} F(p_- s + x_- + u(s))\, ds\, d\tau$$ 

In dimension $d \ge 2$, Theorem 1 and methods for 
the reconstruction of f from Pf (Gelfand et al. 1980, 
Natterer 1986, Novikov 1999) give a method for the 
reconstruction of F and v from the scattering data a, 
b at high energies. Note that for d=1 Theorem 1 
is valid but f cannot be uniquely reconstructed 
from Pf. 
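
Formula [16] can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes d = 2 and the sample potential v(x) = exp(-|x|^2) (an assumption, chosen because PF is then available in closed form), integrates eqn [1] with a classical fourth-order Runge-Kutta scheme, and compares $s\,a_{sc}(s\theta, x)$ with $PF(\theta, x)$ for $\theta = (1, 0)$, $x = (0, h)$. The function names and the cutoff L are hypothetical choices.

```python
import math

def force(x, y):
    """F = -grad v for the assumed sample potential v(x, y) = exp(-(x^2 + y^2))."""
    g = 2.0 * math.exp(-(x * x + y * y))
    return g * x, g * y

def scattered_velocity(s, h, L=8.0, steps=4000):
    """Integrate x'' = F(x) by RK4 with p_- = (s, 0), x_- = (0, h); return the final velocity a."""
    dt = 2.0 * L / s / steps                     # free flight time across [-L, L]

    def deriv(state):
        x, y, vx, vy = state
        fx, fy = force(x, y)
        return (vx, vy, fx, fy)

    state = (-L, h, s, 0.0)                      # the potential is negligible at |x| = L
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(state, k1)))
        k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(state, k2)))
        k4 = deriv(tuple(a + dt * b for a, b in zip(state, k3)))
        state = tuple(a + dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
                      for a, b1, b2, b3, b4 in zip(state, k1, k2, k3, k4))
    return state[2], state[3]

def xray_force(h):
    """PF(theta, x) for theta = (1, 0), x = (0, h), in closed form for the Gaussian potential."""
    return 0.0, 2.0 * h * math.exp(-h * h) * math.sqrt(math.pi)

h = 0.5
for s in (5.0, 20.0, 40.0):
    ax, ay = scattered_velocity(s, h)
    # a_sc = a - p_-, with p_- = (s, 0)
    print(s, (s * (ax - s), s * ay), xray_force(h))
```

As s grows, the rescaled velocity change approaches the X-ray transform of the force, which is the content of [16]; the same run also conserves |a| = |p_-| up to integration error, in agreement with energy conservation.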

Theorem 1 is an analog of the Born formula for 
the Schrödinger equation at high energies (see, e.g., 
Faddeev (1956), Enss and Weder (1995), and 
Novikov (1998) as regards this Born formula and 
its variations). On the other hand, Theorem 1 was 
preceded by a result of Gerver and Nadirashvili 
(1983) on the high-energy asymptotics for the travel 
time between boundary points for the Newton 
equation in a bounded strictly convex domain with 
smooth boundary. There is a considerable similarity 
between this result and Theorem 1. 

We continue our review on inverse scattering for 
the multidimensional Newton equation, and make 
the following well-known observation. 


Observation 1 Suppose that $v(x) > E > 0$ for $x \in U$, 
where U is a compact subset of $\mathbb{R}^d$. Then the scattering 
data a, b for energies smaller than or equal to E contain 
no information about v(x) for $x \in U$. 


In addition to Theorem 1 and Observation 1, one 
has the following conjecture. 


Conjecture 1 (Novikov 1999). Suppose that v 
satisfies [2], $d \ge 2$, and the energy E is sufficiently 
large, E > E(v). Then the scattering data a, b at 
fixed energy E uniquely determine v. 


Gerver and Nadirashvili (1983) proved a result 
similar to Conjecture 1 for the case of the Newton 
equation in a bounded strictly convex domain G 
with smooth boundary. Their proof of this result 
contains no reconstruction method but does contain 
a stability estimate. It is based on the Maupertuis 
principle and the results of Muhometov and Roma- 
nov (1978), Beylkin (1979), and Bernstein and 
Gerver (1980). For the case $v \in C^2(\mathbb{R}^d, \mathbb{R})$, $\mathrm{supp}\, v \subset G$ 
(where G has the properties mentioned above), 
in Novikov (1999) a connection between the 
boundary-value data of Gerver and Nadirashvili 
(1983) and the scattering data a, b is given and it is 
shown that for $d \ge 2$ the scattering data a, b and the 
domain G uniquely determine v at fixed sufficiently 
large energy E > E(v, G). 

For more information concerning the results mentioned 
above, see Novikov (1999) and Gerver and 
Nadirashvili (1983). One can see from the review 
of this section that very few results on inverse 
scattering for the multidimensional Newton equation 
are given in the literature at present. It should 
be remarked that the inverse-scattering theory in 
multidimensions is much more developed for the 
Schrödinger equation than for the Newton 
equation. 


Inverse Scattering for the Schrödinger 
Equation in Multidimensions 


The inverse-scattering theory for the multidimensional 
Schrödinger equation has been developed by 
many authors (see, e.g., surveys given in Grinevich 
(2000) and Novikov (2001)). 

Quantum-mechanical analogs of Theorem 1 
appear, for example, in Faddeev (1956), Enss 
and Weder (1995), Novikov (1998) (see also 
references therein). Similarly, the quantum-mechan- 
ical analogs of Conjecture 1 have been proved, for 
example, in Novikov (1992, 1994) and Grinevich 
and Novikov (1995) (see also references therein). On 
the other hand, as a rule, classical-mechanical analogs 
of results of the works on inverse Schrödinger 
scattering in multidimensions are unknown. This 
leads to many open problems. For the one-dimen- 
sional case some results on finding classical limits of 
results on inverse Schrödinger scattering are given in 
Lax and Levermore (1983) and Bogdanov (1985). 
Note that inverse scattering for the two-dimensional 
Schrödinger equation at fixed energy (see Novikov 
(1992), Grinevich and Novikov (1995), and 
Grinevich (2000) and references therein) has con- 
siderable similarity with inverse scattering for the 
one-dimensional Schrödinger equation. Therefore, 
an interesting open problem consists in extending 
the aforementioned study of Lax and Levermore 
(1983) and Bogdanov (1985) to the case of inverse 
scattering for the two-dimensional Schrödinger 
equation at fixed energy. Perhaps, in this way one 
can find proper two-dimensional analogs of the Abel 
formulas [11] and [12]. 


Introduction 


The equations governing the motion of an ideal 
(inviscid) fluid were derived by Euler in 1755. They 
were, together with the equation of vibrating strings, 
the first partial differential equations introduced in the 
field of mathematical physics. While several partial 
differential equations, coming from the modeling of 
physical phenomena, have had a satisfactory mathe- 
matical solution, it is piquant to note that the old Euler 
equations remain essentially unsolved. Together with 
the Navier-Stokes equations of viscous fluids, the 
Euler equations play a central role in the modern 
analysis of partial differential equations. 

The mathematical difficulties encountered in the 
study of Euler equations seem to be deeply linked with 
the understanding of turbulence, which remains one of 
the great open problems in the field of macroscopic 
physics. 

The relevance of Euler equations as a model of 
fluid flow is rather subtle, and the discussion is far 
from closed. On the one hand, Euler equations have 
disturbing aspects, which, in their most visible form, 
yield paradoxes. On the other hand, the systematic 
recourse to some viscosity seems to put a serious 
obstacle to a proper understanding of turbulence. In 
this article we will try to give some insight into this 
issue. 

To be rigorous, every fluid has some compressi- 
bility, that is to say the density varies with the 
pressure. Compressibility gives rise to pressure 
waves, which propagate in the fluid with some finite 
speed. When the velocity of the fluid particles is 
slow relative to the speed of the pressure waves, it is 
legitimate to make the approximation that the flow 
is incompressible; it is the case for meteorological 
flows, for example. Then, there are no more 
pressure waves; nevertheless the motion can be 
very unstable and intricate (turbulent). Although 
very often in physical flows these two features 
coexist, following the tradition, we clearly separate 
the compressible and incompressible cases. 


The Equations of the Perfect Fluid 


To date, no rigorous derivation of the fluid 
equations from a system of interacting particles 
governed by Newton's laws is known. Thus, 
the mathematical models of fluid motion result 
from heuristic considerations. 

Let us specify some notation. 

The fluid motion is supposed to take place in 
some domain (not necessarily bounded) $\Omega$ of the 
physical space $\mathbb{R}^3$. 

We shall use the so-called Eulerian description of 
the fluid motion: $\rho(t, x)$ denotes the local density of 
the fluid at time t and position x, and u(t, x) the 
velocity of the fluid particle located at x at time t. 

The first equation (conservation equation) expresses 
the conservation of mass: 

$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho u) = 0$$   [1] 

The second equation (momentum equation) 
expresses Newton’s law (in the absence of internal 
friction): 


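$$\rho \left( \frac{\partial u}{\partial t} + (u \cdot \nabla) u \right) = -\nabla p$$   [2] 


where the scalar function p(t, x) is the pressure 
inside the fluid, and 

$$(u \cdot \nabla) u = \sum_i u_i \frac{\partial u}{\partial x_i}$$ 


With [1] and [2], we have five scalar unknown 
functions ($\rho$, $u_i$, p) and only four equations. To get a 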


closed set of equations, we need to add a supple- 
mentary relationship: 


$$\operatorname{div}(u) = 0, \quad \text{for the incompressible flows}$$   [3] 


In the case of compressible flows, eqns [1] and [2] must 
be completed by a thermodynamical description of the 
fluid, which yields a relationship between p, $\rho$, the 
internal energy, the specific entropy, etc. We will only 
consider here the simple case of an isentropic gas 
which is modeled by the relationship 


$$p = p(\rho)$$   [4] 

with $p(\rho) = c\rho^{\gamma}$ for a perfect gas ($c > 0$, $\gamma > 1$). 


Condition at the Boundary $\partial\Omega$ of the Domain 


In the case of a perfect fluid, we simply have to 
write that the velocities of the fluid particles at the 
boundary are tangent to the boundary, that is, 


$$u \cdot n = 0 \quad \text{on } \partial\Omega$$   [5] 


where n denotes the unit normal vector to the 
boundary (pointing outward). 


The Incompressible Perfect Fluid: Main 
Properties of Smooth Flows 


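We shall suppose $\rho = 1$. Equations [1]-[3] and [5] 
then yield the classical Euler system: 

$$\frac{\partial u}{\partial t} + (u \cdot \nabla) u = -\nabla p \quad \text{in } \Omega$$   [6] 
$$\operatorname{div} u = 0, \qquad u \cdot n = 0 \quad \text{on } \partial\Omega$$ 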


The Constants of the Motion 


Let us examine the constants of the motion of the 
dynamical system defined by [6], that is, the functionals 
which are conserved by the motion of the fluid. 

First we have the classical constants of motion 
associated with the natural symmetries by Noether's 
theorem. 

The time translational invariance of the system 
implies that the kinetic energy is conserved: 


$$E_c = \frac{1}{2} \int_\Omega u^2 \, dx$$ 


In the case $\Omega = \mathbb{R}^3$, the homogeneity of space implies 
the conservation of the impulsion: 

$$\int_\Omega u \, dx$$ 


The space isotropy, on the other hand, yields the 
conservation of the angular momentum: 


$$\int_\Omega x \wedge u \, dx$$ 

There is a more hidden constant of the motion, 
called helicity, which was discovered in 1961 by 
J J Moreau (1961) (see, e.g., Serre (1979)). 

Let us define the vorticity of the flow: 

$$\omega = \operatorname{curl} u$$ 

then the helicity is 

$$\int_\Omega \omega \cdot u \, dx$$ 

Of course, here, we suppose u to be vanishing at 
infinity in such a manner that the above integrals 
make sense. 

One may wonder about the existence of other 
constants of the motion of the form (first-order 
functionals): 

$$\int_\Omega F(x, u(x), \nabla u(x)) \, dx$$ 


The answer, due to Serre (1979), is that any 
functional of the above form which is conserved by 
the flow is a linear function of the energy, the 
impulsion, the angular momentum, the helicity plus 
a trivial term (i.e., taking the same value for any 
field u such that div u = 0). 


Beltrami Equation and Kelvin's Theorem 


Another important issue is to know how the vorticity 
field evolves in a regular flow. If we apply the operator 
curl to the equation [6] in order to eliminate the 
pressure term, we get: 


$$\frac{\partial \omega}{\partial t} + (u \cdot \nabla)\omega - (\omega \cdot \nabla) u = 0$$   [7] 


which is the Beltrami equation. 
To exploit the Beltrami equation, we need the 
Lagrangian flow $\varphi(t, x)$, associated with the field u, 
which is defined by the differential equation: 

$$\frac{\partial \varphi}{\partial t}(t, x) = u(t, \varphi(t, x)), \qquad \varphi(0, x) = x$$ 

Then we can state the following proposition. 

Proposition During the smooth motion of an 
incompressible perfect fluid, we have: 

$$\omega(t, \varphi(t, x)) = D\varphi(t, x)[\omega(0, x)], \quad \text{for all } t, x$$ 

where $D\varphi(t, x)$ denotes the derivative at the point x 
(t fixed) of the mapping $x \to \varphi(t, x)$. 


The first consequence of this result is to point 
out the class of irrotational flows, for which 
$\omega(t, x) = 0$. Indeed, if the vorticity vanishes initially, 
it follows from the proposition that it will 
vanish forever. 

Another consequence is the behavior of vortex 
lines. By definition, a vortex line is any integral 
curve of the vorticity field. More precisely, a 
vortex line at time t, C(s), is defined by the 
differential equation 

$$\frac{dC(s)}{ds} = \omega(t, C(s))$$ 

Now we can check that vortex lines are merely 
transported by the flow: if C(s) is a vortex line at 
time t = 0, $\varphi(t, C(s))$ is a vortex line at time t. 

We end this section with the famous Kelvin’s 
circulation theorem (1869) (see, e.g., Marchioro and 
Pulvirenti (1994)). 


Theorem Let L be a closed (oriented) contour drawn 
inside the fluid. We suppose that L is transported by 
the flow; $\varphi_t(L)$ denotes the contour at time t. Then the 
circulation of the velocity field u(t, x) along $\varphi_t(L)$ is 
independent of t. 


Stationary Solutions: D'Alembert's Paradox 


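Let us focus now on the flow around a bounded 
body $\Omega$, whose complement $\Omega^c$ will be supposed to 
be simply connected. 

A stationary solution u(x), p(x) satisfies: 

$$(u \cdot \nabla) u = -\nabla p$$ 
$$\operatorname{div} u = 0, \qquad u \cdot n = 0 \quad \text{on } \partial\Omega$$ 

But since $(u \cdot \nabla) u = \nabla(\tfrac{1}{2} u^2) + (\operatorname{curl} u) \wedge u$, any 
stationary field u(x) satisfying curl u = 0, div u = 0, 
$u \cdot n = 0$ on $\partial\Omega$, defines a stationary solution with 
associated pressure $p = -\tfrac{1}{2} u^2$. 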
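
The vector identity used above holds for any smooth field, divergence-free or not, and can be checked by finite differences. The sketch below is an illustration only: the sample field `u` is an arbitrary assumption, and the helper names are hypothetical. The gradient of u^2/2 is differenced directly, so the check is not a mere algebraic rearrangement of the same Jacobian entries.

```python
import math

def u(p):
    """Arbitrary smooth sample field (an assumption; any smooth field would do)."""
    x, y, z = p
    return (math.sin(y) * z, x * x + math.cos(z), math.exp(-x) * y)

def jacobian(field, p, h=1e-5):
    """J[i][j] = d field_i / d x_j, by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = field(plus), field(minus)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def grad_half_u2(p, h=1e-5):
    """Gradient of |u|^2 / 2, differenced directly."""
    g = []
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        ep = 0.5 * sum(c * c for c in u(plus))
        em = 0.5 * sum(c * c for c in u(minus))
        g.append((ep - em) / (2.0 * h))
    return g

def identity_residual(p):
    """Max norm of (u.grad)u - grad(u^2/2) - (curl u) x u at the point p."""
    v, J = u(p), jacobian(u, p)
    advect = [sum(v[j] * J[i][j] for j in range(3)) for i in range(3)]
    curl = (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])
    cross = (curl[1] * v[2] - curl[2] * v[1],
             curl[2] * v[0] - curl[0] * v[2],
             curl[0] * v[1] - curl[1] * v[0])
    g = grad_half_u2(p)
    return max(abs(advect[i] - g[i] - cross[i]) for i in range(3))

print(identity_residual((0.3, -0.7, 1.1)))   # small, limited by finite-difference error
```

When curl u = 0 the last term drops out, so the stationary momentum equation reduces to $\nabla(\tfrac{1}{2} u^2 + p) = 0$, which is how the pressure $p = -\tfrac{1}{2} u^2$ arises.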

We also need to specify a condition at infinity 
for the field u. We impose that the velocity is equal 
(at infinity) to some constant value U. Since $\Omega^c$ is 
simply connected, the condition curl u = 0 implies 
that the flow is potential, that is, there is a scalar 
function F(x) such that $u = U + \nabla F$. 

Thus, the determination of an irrotational flow 
around an obstacle amounts to solving the following 
exterior Neumann problem. 


Find F satisfying: 

$$\Delta F = 0 \quad \text{in } \Omega^c$$ 
$$\frac{\partial F}{\partial n} = -U \cdot n \quad \text{on } \partial\Omega$$ 
$$\nabla F \to 0 \quad \text{at infinity}$$ 


This problem is well known and has a unique 
solution, which satisfies, at infinity: 


$$F(x) = O(1/|x|^2), \qquad \nabla F(x) = O(1/|x|^3)$$ 


Then a classical calculation (integration by parts) gives 
the resulting force exerted by the flow on the body: 


$$R = -\int_{\partial\Omega} p\, n \, d\sigma = \int_{\partial\Omega} \frac{1}{2} u^2\, n \, d\sigma = 0$$ 


This property of inviscid potential flows was first 
noticed by Jean Le Rond d'Alembert (1717-1783). 
Furthermore, d'Alembert performed a series of 
experiments to measure the drag on a sphere in a 
flowing fluid and he expected that the force would 
go to zero as the viscosity of the fluid approached 
zero. But this was not the case: the drag seemed 
to converge toward a nonzero value. Hence, this 
property was called d'Alembert's paradox. 
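
For a sphere the vanishing of the pressure force can be verified directly. The sketch below is an illustration under stated assumptions: it takes the body to be a sphere of radius a (an assumption made here because the potential is then explicit), uses the classical solution $F(x) = \tfrac{a^3}{2} (U \cdot x)/|x|^3$ of the exterior Neumann problem, sets $p = -\tfrac{1}{2} u^2$ as in the text, and integrates $-p\,n$ over the surface with a midpoint rule. The function names are hypothetical.

```python
import math

def velocity(x, U, a=1.0):
    """u = U + grad F with F(x) = (a^3 / 2) (U . x) / |x|^3 (potential flow past a sphere)."""
    r = math.sqrt(sum(c * c for c in x))
    Ux = sum(ui * xi for ui, xi in zip(U, x))
    return tuple(ui + 0.5 * a ** 3 * (ui / r ** 3 - 3.0 * Ux * xi / r ** 5)
                 for ui, xi in zip(U, x))

def pressure_force(U, a=1.0, n_theta=200, n_phi=200):
    """R = -integral of p n over the sphere, with p = -u^2 / 2, by a midpoint rule."""
    dth, dph = math.pi / n_theta, 2.0 * math.pi / n_phi
    R = [0.0, 0.0, 0.0]
    for i in range(n_theta):
        th = (i + 0.5) * dth
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            n = (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))
            uu = velocity(tuple(a * c for c in n), U, a)
            p = -0.5 * sum(c * c for c in uu)
            dsigma = a * a * math.sin(th) * dth * dph
            for k in range(3):
                R[k] -= p * n[k] * dsigma
    return tuple(R)

U = (0.3, -0.2, 1.0)
n0 = (0.6, 0.0, 0.8)                                         # a point on the unit sphere
print(sum(c * d for c, d in zip(velocity(n0, U), n0)))       # u . n on the surface, close to 0
print(pressure_force(U))                                     # each component close to 0
```

On the sphere the speed squared is an even function of the normal n while n itself is odd, so the contributions of antipodal points cancel; this is the symmetry behind the zero resultant in this particular case.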

Of course, d'Alembert's paradox tells us that some- 
thing is going wrong: this model of flow around a body 
is not physically relevant. But it is not easy to 
identify precisely what is going wrong. 

Physics tells us that in a flow around a flying 
airplane, the viscous term (as measured by a 
dimensionless number called the Reynolds number) is 
very small. The main effect of the viscosity is then 
to alter the limit condition at the boundary of the 
body. The relevant boundary condition is no longer 
u-n=0, but the purely viscous condition u=0, 
or more realistically a condition of friction type 
(turbulent boundary condition). 

A common approach is to disqualify the perfect- 
fluid model in arguing that this modification of the 
boundary condition has important consequences on 
the flow near the body (giving rise to a turbulent 
boundary layer, for example). 

It seems to us that such a disqualification of the 
perfect-fluid model discards prematurely interesting 
issues. Indeed, we must notice first that the 
stationary solution on which d'Alembert's reasoning 
is based is highly unstable and not acceptable 
physically. Thus, a realistic solution would necessa- 
rily be either nonstationary or with some vorticity. 
On this basis, we can imagine other scenarios to 
explain the existence of a resulting force exerted 
on the body. For example, we may imagine a 
stationary solution with a discontinuous velocity 
field (i.e. with a vortex sheet). The process 
conducive to such a stationary solution is called 
Prandtl's scenario (Batchelor 1967). The mathema- 
tical proof that Prandtl's scenario does exist is a 
difficult (open) issue, which seems closely related to 
the (probable) nonuniqueness of weak solutions of 
the Cauchy problem. 


The Cauchy Problem for the 
Incompressible Perfect Fluid 


The Case $\Omega \subset \mathbb{R}^3$ 


In the Cauchy problem, given an initial velocity field 
$u_0(x)$, we want to determine the corresponding 
solution u(t, x) of [6] at each time t. 

The first significant result on the Cauchy problem 
for three-dimensional Euler equations was given by 
Kato (1975). 


Theorem For $u_0$ in the Sobolev space $H^s(\mathbb{R}^3)$, for 
$s > 5/2$, there is T > 0 and a unique classical solution 
(of the Cauchy problem) u(t, x) on $[0, T] \times \mathbb{R}^3$. u 
depends continuously on t in the space $H^s$. 


By a classical solution we mean that the field 
u(t, x) is differentiable in terms of the variables t, x and 
satisfies the equations in the usual sense. 

Here $H^s(\mathbb{R}^3)$ denotes the Sobolev space of the 
fields u, which are square integrable and with spatial 
derivatives up to order s (in the case where s is an 
integer) also square integrable. 


Remark These results have been generalized to some 
extent during the last few decades, but the following 
issues are still open: 


1. Do singularities occur at a finite time for such 
regular solutions? 

2. For a less regular initial datum, do weak solutions 
exist (in the sense of distributions)? 


The Case $\Omega \subset \mathbb{R}^2$ 


This case is better understood; the first mathematical 
results trace back to Lichtenstein (1925) and Wolibner 
(1933); they take a plain form with the famous theorem 
of Youdovitch (1963) (see, e.g., Chemin (1995)). 

In two dimensions, the vorticity $\omega = \operatorname{curl} u$ identifies 
with a scalar function, and the Beltrami equation 
becomes 

$$\frac{\partial \omega}{\partial t} + \operatorname{div}(\omega u) = 0$$   [8a] 
$$\operatorname{curl} u = \omega$$   [8b] 
$$\operatorname{div} u = 0, \qquad u \cdot n = 0 \quad \text{on } \partial\Omega$$   [8c] 


This formulation, which appears as a transport 
equation [8a] for w, coupled with the elliptic system 
[8b]-[8c], which determines u from w, is particularly 
convenient. 

The constants of motion associated with the usual 
symmetries, of course, persist; notice, however, that 
the helicity degenerates since, in two dimensions, 
$\omega \cdot u = 0$. But now from [8a] we see that $\omega$ is merely 
convected by the incompressible velocity field u. We 
deduce that, for any continuous function f, the 
functional 

$$\int_\Omega f(\omega(t, x)) \, dx$$ 

is a constant of motion. 

Thus, a specific feature of the two-dimensional case 
is to introduce an infinite set of constants of motion. 
By a skilful exploitation of this fact, Youdovitch 
succeeded in proving the following result. 


Theorem For a given $\omega_0$ in the space $L^\infty(\Omega)$, there 
is a unique weak solution $\omega(t, x)$ of [8], such that 
$\omega(t, x)$ is in $L^\infty(\Omega)$ for all t, and $\omega$ depends 
continuously on t in the space $L^p$, $1 \le p < \infty$. 


$L^p$ denotes, in a standard way, the Lebesgue space 
of the functions f such that $|f|^p$ is integrable over 
$\Omega$, and $L^\infty(\Omega)$ the space of measurable bounded 
functions on $\Omega$. 

Thus, if we limit ourselves to initial data with 
bounded scalar vorticity, the Cauchy problem for 
the two-dimensional incompressible perfect fluid is 
satisfactorily solved. The situation is much more 
intricate if we consider a less regular initial datum 
(e.g., if $\omega_0$ is a measure supported by a curve (vortex 
sheet)). 


Arnol'd's Work on Two-Dimensional Inviscid Flows 


Youdovitch's theorem implies that the incompressible 
Euler equations, with $\omega_0$ in $L^\infty(\Omega)$, are a satisfactory 
model of two-dimensional flows. It is therefore important 
to study further the properties of this model. 

A famous result due to Arnol'd (see Arnol'd and 
Khesin (1998) and Marchioro and Pulvirenti (1994)) 
deals with the nonlinear stability of the stationary 
solutions. 

Let us determine the smooth stationary solutions 
of the two-dimensional Euler equations in a bounded 
domain $\Omega$ of the plane. We have to solve: 

$$(u \cdot \nabla)\omega = 0$$   [9a] 
$$\operatorname{curl} u = \omega$$   [9b] 
$$\operatorname{div} u = 0, \qquad u \cdot n = 0 \quad \text{on } \partial\Omega$$   [9c] 

Since we have div u = 0, we may introduce the stream 
function $\psi$ of u, which is given by the Dirichlet 
problem: 

$$-\Delta\psi = \omega, \qquad \psi = 0 \quad \text{on } \partial\Omega$$ 

so that $u = \operatorname{curl} \psi$. 
The system [9] becomes: 

$$\nabla\psi \wedge \nabla\omega = 0, \qquad -\Delta\psi = \omega, \qquad \psi = 0 \quad \text{on } \partial\Omega$$ 

Let us focus on solutions which are characterized by a 
relationship $\omega = f(\psi)$, where f is a smooth function. 
Such solutions are given by the resolution of the 
following nonlinear elliptic problem: 

$$-\Delta\psi = f(\psi), \qquad \psi = 0 \quad \text{on } \partial\Omega$$   [10] 

This problem always has at least one solution, for 
example, if f is a bounded function of $\psi$. 

Let $\psi^*$ be a solution of [10], and $\omega^* = f(\psi^*)$ 
the corresponding vorticity function. We shall say that 
the stationary solution $\omega^*$ is stable in the $L^2$-norm if: 


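For all $\varepsilon > 0$, there is a $\delta > 0$, such that for all initial 
datum $\omega_0$ in $L^\infty(\Omega)$ satisfying 

$$\int_\Omega (\omega^* - \omega_0)^2 \, dx \le \delta,$$ 

we have 

$$\int_\Omega (\omega^* - \omega(t))^2 \, dx \le \varepsilon, \quad \text{for all } t$$ 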


where $\omega(t)$ denotes the solution of the Cauchy problem 
associated with the initial datum $\omega_0$ by Youdovitch's 
theorem. 


Now we can state the following result. 


Theorem (Arnol'd) Let $\omega^*$ be a stationary solution 
given by [10]. We assume that one of the following 
assumptions holds: 


(C1) There are positive constants $c_1$, $c_2$, such that 

$$c_1 \le -f' \le c_2$$ 

(C2) There are positive constants $c_1$, $c_2$, with $c_2 < \lambda_1$ 
(first eigenvalue of the Dirichlet problem on the 
domain $\Omega$) such that: 

$$c_1 \le f' \le c_2$$ 

Then $\omega^*$ is stable in the $L^2$-norm. 


Remarks 


(i) This result was the first nonlinear stability result 
for stationary flows. 

(ii) The proof makes use of the conservation of the 
functionals of the vorticity field. 


Another significant contribution of Arnol’d to 
hydrodynamics was to reveal the geometrical aspect 
of the instability of the perfect-fluid motion. We give 
a brief insight into this issue. 

Let us come back to the Lagrangian description 
of motion. We want to determine the function 
$\varphi(t, x)$. Each mapping $\varphi_t(x) = \varphi(t, x)$ is, for t fixed, 
a diffeomorphism of $\Omega$ preserving the Lebesgue 
measure and the orientation (equivalently stated, it 
is an element of $\mathrm{SDiff}(\Omega)$). 


In other words, a fluid motion is a curve $t \to \varphi_t$ 
drawn on the "manifold" $M = \mathrm{SDiff}(\Omega)$ (the configuration 
space of the system). 

At time t, the relationship 

$$\frac{\partial \varphi}{\partial t}(t, x) = u(t, \varphi(t, x))$$ 

states that the velocity field $u(t, \varphi_t(x))$ belongs to the 
space tangent to M at $\varphi_t$. The tangent space at $\varphi$ 
to M is the space of vector fields $v(\varphi(x))$, where v(x) 
is an incompressible vector field on $\Omega$ satisfying 
$v \cdot n = 0$ on $\partial\Omega$. This space is naturally endowed 
with a norm given by the kinetic energy 

$$\frac{1}{2} \int_\Omega v(x)^2 \, dx$$ 

and thus M is endowed with a Riemannian 
structure. 

It is easy to check that the perfect-fluid motions 
correspond to the curves $\varphi_t$ drawn on M which are 
the critical points of the action integral: 

$$\frac{1}{2} \int_{t_1}^{t_2} dt \int_\Omega \left( \frac{\partial \varphi}{\partial t}(t, x) \right)^2 dx, \quad \text{for all } t_1 < t_2$$ 

(with the constraints $\varphi(t_1, \cdot) = \varphi_1$, $\varphi(t_2, \cdot) = \varphi_2$). 

That is to say, the perfect-fluid motions are the 
geodesics of the Riemannian manifold M. 

The main interest of this geometric framework is 
to bring back, at least formally, the perfect-fluid 
motions to well-known objects. Indeed, we know 
that the Riemannian curvature of a manifold has a 
profound impact on the behavior of geodesics on it. 
If the Riemannian curvature is positive, then nearby 
geodesics oscillate about one another, and if the 
curvature is negative, geodesics rapidly diverge from 
one another. More precisely, the stability of geode- 
sics is expressed in terms of the curvature by means 
of Jacobi's equation [1]. If $\varphi_t$ is a geodesic curve 
starting from $\varphi_0$, with velocity field v(t) (whose 
norm is supposed equal to 1), and if the sectional 
curvature of the manifold in all the 2-planes 
containing v(t) is less than $-c$ (< 0), a perturbation 
of the initial datum will increase at least as exp(ct): 

$$d(\varphi_t, \tilde{\varphi}_t) \ge d(\varphi_0, \tilde{\varphi}_0) \exp(ct)$$ 

where $\tilde{\varphi}_0$ denotes the perturbed initial datum and d 
the geodesic distance on the manifold. Moreover, if 
the curvature at every point and for all the sections 
is less than $-c$, and if M is compact, then the geodesic 
flow, that is, the one-parameter group of transformations 
$(\varphi_0, v(0)) \to (\varphi_t, v(t))$, is mixing (in the usual 
meaning of ergodic theory). Arnol'd succeeded in 
calculating the sectional curvature for flows on the 
two-dimensional torus; he showed that the 
curvature is negative for "most" of the sections. This 
gives an enlightening geometrical picture of the 
instability of Lagrangian flows. 

It was tempting to connect the above considera- 
tions on the instability of two-dimensional flows 
with the problem of weather forecast. In 1963 
E N Lorenz stated that a two-week forecast would be 
a theoretical bound for predicting the atmospheric 
motion. Lorenz's assertion was based on numerical 
simulations. He took as model for the large-scale 
atmospheric motion the two-dimensional Euler 
equations on the torus, which he truncated to a 
small number of Fourier modes (about 20). This 
model is highly unstable and displays exponential 
sensitivity with respect to the initial datum. How- 
ever, the parallel between the behavior of this 
system and the instability of the Lagrangian flow is 
misleading. On the one hand, if we again do the 
Lorenz computations on Euler equations, taking 
into account a large number of Fourier modes, we 
note a striking phenomenon: the flow has a tendency 
to self-organize into large vortices, called coherent 
structures, and simultaneously the exponential 
sensitivity, as measured in terms of the energy 
norm of the velocity field, disappears. On the other 
hand, the problem of predicting the Lagrangian flow
is very different: the Lagrangian flow can be
exponentially unstable, while the corresponding 
velocity field quietly converges, in the energy norm, 
towards some equilibrium. We must keep in mind 
that the meteorologist aims to predict the values of 
the velocity field at some future time and not the 
trajectories of the fluid particles. In fact, it appears 
that Lorenz has ignored phenomena of a statistical 
nature which occur when a large number of degrees 
of freedom are considered; thus, his theoretical 
bound for the prediction of the atmospheric motion 
has no definite basis. More detailed reflections on 
this issue can be found in Robert and Rosier (2001). 


The Cauchy Problem for the Euler 
Equations for Compressible Inviscid 
Fluids 


As remarked in the introduction, compressible flows 
yield pressure waves. The equations of motion being 
nonlinear, these waves interact in an intricate 
manner giving rise to shocks. This is the main 
feature of compressible fluid flows. Compressible 
flows are situated in the more general domain of 
nonlinear hyperbolic systems, which were inten- 
sively studied during the last decades. We only give 
here an example of the kind of result which can be 
obtained.

The following theorem, which states that for a set
of regular initial data, shocks do not occur till some
finite time, is a consequence of a more general result
on hyperbolic systems due to Majda (1984).

We consider $\Omega = \mathbb{R}^3$ and the system [1], [2], [4].


Theorem Assume $\rho_0, u_0 \in H^s \cap L^\infty(\mathbb{R}^3)$, with
$s > 5/2$ and $\rho_0(x) > 0$. Then there is a finite time
$T > 0$, depending on the $H^s$ and $L^\infty$ norms of the
initial data, such that the Cauchy problem for [1],
[2], [4] has a unique bounded smooth solution $\rho$,
$u \in C^1([0, T] \times \mathbb{R}^3)$, with $\rho(t, x) > 0$ for all t, x.


Inviscid Flows and Turbulence 


Loosely speaking, turbulence is the intricate motion 
of a slightly viscous flow. Going back to the first 
half of the last century, there are two main 
approaches to turbulence. The first is due to Leray. 
The dissipation of energy is a characteristic feature 
of three-dimensional turbulence, and Leray thought 
that, even if very small, the viscosity of the fluid 
plays an important role, so that to understand 
turbulence the first step is to study the Navier- 
Stokes equations. A radically different approach is 
due to Onsager. Onsager (1949) started with the 
fundamental remark that the 4/5 law of turbulence, 
which relates the dissipation of energy to the 
increments of the velocity field, does not involve 
viscosity. Furthermore, he observed that the proof of 
the conservation of energy for the solutions of Euler 
equations uses an integration by parts which 
supposes some regularity of the velocity field. He 
then imagined that an inviscid dissipation mechan- 
ism, due to a lack of regularity of the solutions, was 
at work in Euler equations. In modern terminology, 
he suggested modeling turbulent flows by nonregular
(weak) solutions satisfying the Euler equations in the
sense of distributions. He also conjectured that if a
solution satisfies a Hölder regularity condition of
order > 1/3, then the energy would be conserved.
Onsager's views were revolutionary and forgotten 
for a long time. Recent works, such as the proof of 
Onsager's conjecture, the construction of weak 
solutions with energy dissipation, and the discovery 
of the explicit local form of the energy dissipation 
for weak solutions, show a renewed interest in these 
views (see, e.g., Constantin and Titi (1994), Eyink 
(1994), Robert (2003), and Shnirelman (2003)). 


See also: Compressible flows: Mathematical Theory; 
Dissipative Dynamical Systems of Infinite Dimension; 
Hyperbolic Dynamical Systems; Incompressible Euler 
Equations: Mathematical Theory; Non-Newtonian Fluids; 
Partial Differential Equations: Some Examples; Chaos 
and Attractors; Turbulence Theories.


Introduction


This paper reviews recent developments, following 
closely (sometimes verbatim) the review paper 
Calogero F (2004c) (see the Bibliography below); 
for more traditional investigations of isochronous 
systems see other entries of this Encyclopedia (and 
for the mathematical investigation of isochronous
centers in the plane, related to the 16th Hilbert
problem, see for instance the survey paper referred
to at the end of this entry).

The isochronous systems treated herein are characterized
by the property of possessing an open domain
having full dimensionality in their phase space such
that all the motions evolving from a set of initial
data in it are completely periodic with the same
fixed period. The natural measure of this open
domain might, or might not, be infinite when the
measure of the entire phase space is itself infinite:
for instance, if the entire phase space is the
two-dimensional Euclidean plane, such a domain might
be the exterior, or the interior, of a circle of finite
radius.

It is justified to call such systems superintegrable,
or perhaps partially superintegrable, inasmuch as the
property of isochronicity of all their motions holds
only in a subregion of the entire phase space. This
terminology is justified by the observation that,
roughly speaking, all confined motions of a superintegrable
system (one in which all but one of the
degrees of freedom are constrained by the existence
of the maximal possible number of constants of
motion) are completely periodic, although not
necessarily all with a fixed period. Hence
isochronicity entails superintegrability, while the
converse is not the case (see the entry Integrable
systems in this Encyclopedia).

A simple trick, amounting essentially to a
change of independent, and possibly also of
dependent, variables, allows one to deform a largely
arbitrary dynamical system so that the deformed
system obtained from it is isochronous. This
“trick”, which is now explained, entails therefore 
that isochronous systems are not rare. Below we 
provide several examples; others can be found in 
the further reading suggested at the end of this 
entry, and/or can be manufactured ad libitum using 
the trick. 


The Trick 


We now show that, given a largely arbitrary
dynamical system, it is possible to introduce a
deformed version of it featuring a real constant $\omega$,
that has the following properties: for $\omega = 0$, it
coincides with the original, undeformed system; for
$\omega > 0$, it possesses an open region having full
dimensionality in its phase space such that all
solutions evolving from an initial datum in it are
completely periodic with a period $\tilde T$ which is a finite
integer multiple, or perhaps a simple fraction, of the
basic period

$$T = \frac{2\pi}{\omega}$$ [1]

Let us indeed consider a quite general dynamical
system which we write as follows:


$$\zeta' = F(\zeta; \tau)$$ [2]


Here $\zeta \equiv \zeta(\tau)$ is the dependent variable, which
might be a scalar, a vector, a tensor, a matrix, you
name it. The independent variable is $\tau$, and the main
limitation on the dynamical system [2] is that it be
permissible to treat this variable as complex; this
requires that the derivative with respect to this
complex variable $\tau$ that appears in the left-hand side
of the evolution equation [2] make sense, namely
that this dynamical system be analytic, entailing that
the dependent variable $\zeta$ be an analytic function of
the complex variable $\tau$ (but this does not require
$\zeta(\tau)$ to be a holomorphic or a meromorphic
function of $\tau$; $\zeta(\tau)$ might feature all sorts of
singularities, including branch points, in the complex
$\tau$-plane; indeed this will generally happen since
we generally assume the evolution equation [2] to
be nonlinear). The quantity F in the right-hand side
of [2], which has of course the same scalar, vector,
matrix... character as $\zeta$, might depend (arbitrarily
but analytically) on $\zeta$ as well as on $\tau$. (Let us also
emphasize that this approach is also applicable to
more general dynamical systems that feature
other, “spacelike”, independent variables, for
instance systems of PDEs rather than ODEs;
the interested reader is referred to the literature cited
below.)

In spite of the generality of this dynamical system,
[2], there generally holds a result (“Theorem of
existence, uniqueness and analyticity”) that characterizes
the solution $\zeta(\tau)$ of its initial-value problem
determined by the assignment


$$\zeta(0) = \zeta_0$$


Here, for notational simplicity, we assign the initial
datum $\zeta_0$ at $\tau = 0$; and we assume of course that the
right-hand side of [2] is not singular for $\tau = 0$ and
$\zeta = \zeta_0$. The relevant result guarantees, not only for
the initial datum $\zeta_0$, but for a (sufficiently small but
open) set of initial data in its neighborhood, the
existence of a circular disk in the complex $\tau$-plane,
centered at $\tau = 0$ (where the initial data are assigned)
and having a nonvanishing radius $\rho$, such that the
solutions $\zeta(\tau)$ corresponding to these initial data are
holomorphic in it, namely for $|\tau| < \rho$ (and note that
if $\zeta(\tau)$ is a multicomponent object, the property of
being holomorphic is featured by each and every one of
its components).

Let us now introduce the following changes of 
dependent and independent variables: 


$$z(t) = \exp(i\lambda\omega t)\,\zeta(\tau)$$ [3a]
$$\tau \equiv \tau(t) = \frac{\exp(i\omega t) - 1}{i\omega}$$ [3b]


This transformation is called “the trick”. The
essential part of it is the change of independent
variable [3b]; and let us re-emphasize that, here and
hereafter, the new independent variable t is considered
as the real, “physical time” variable. Note
that [3b] entails


$$\tau(0) = 0, \quad \dot\tau(0) = 1$$


and, most importantly, that $\tau(t)$ is a periodic
function of t with period T, see [1]. More specifically,
as the time t increases from zero onwards, the
complex variable $\tau$ travels counterclockwise round
and round on the circle C the diameter of which, of
length $2/\omega$, lies on the imaginary axis in the complex
$\tau$-plane, with one extreme at the origin, $\tau = 0$, and
the other at the point $\tau = 2i/\omega$, making a full circle in
the time interval T. As for the prefactor $\exp(i\lambda\omega t)$
that multiplies $\zeta(\tau)$ in the right-hand side of [3a], its
purpose is to allow, via an appropriate choice of the
parameter $\lambda$, the deformed system, see below, to
have a neater look; however this choice is hereafter
restricted by the condition that $\lambda$ be real and
rational, say


$$\lambda = \frac{p}{q}$$


with p and q two coprime integers and q > 0. This
restriction is essential to guarantee, via [3], that if
$\zeta(\tau)$ is holomorphic in $\tau$ in the (closed) disk
encircled by the circle C, then z(t) is completely
periodic (namely, each and every one of its components
is periodic) with the period


$$\tilde T = qT$$ [4]
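
The geometric statements about the change of variable [3b] can be checked numerically. What follows is a minimal Python sketch, not part of the original treatment: the function name `tau` and the sample values (ω = 2, λ = p/q = 1/3) are illustrative assumptions. It samples τ(t) and measures how far the samples lie from the circle of diameter 2/ω, whether τ(t) is periodic with period T, and whether the prefactor exp(iλωt) returns to unity after the time qT.

```python
import cmath
import math

def tau(t, omega):
    """Change of independent variable [3b]: (exp(i*omega*t) - 1) / (i*omega)."""
    return (cmath.exp(1j * omega * t) - 1.0) / (1j * omega)

omega = 2.0                      # illustrative value, any omega > 0 would do
T = 2.0 * math.pi / omega        # basic period [1]
center = 1j / omega              # midpoint of the diameter from 0 to 2i/omega
radius = 1.0 / omega

# Sample one full period and measure the distance of each sample from the circle C.
samples = [tau(k * T / 8, omega) for k in range(9)]
max_radius_error = max(abs(abs(s - center) - radius) for s in samples)

# tau(t) is periodic with period T, and at half a period it sits at 2i/omega.
period_error = abs(tau(0.3 + T, omega) - tau(0.3, omega))
top_error = abs(tau(T / 2, omega) - 2j / omega)

# With lambda = p/q the prefactor exp(i*lambda*omega*t) returns to 1 after q*T.
p, q = 1, 3                      # illustrative coprime integers
prefactor_after_qT = cmath.exp(1j * (p / q) * omega * q * T)

print(max_radius_error, period_error, top_error, abs(prefactor_after_qT - 1.0))
```

All four printed quantities should be at the level of floating-point rounding. The last one shows why λ must be rational: only then does the prefactor close up after finitely many turns of τ round the circle C.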
The deformed dynamical system is the one that
obtains from [2] via the trick [3]. It clearly reads as
follows:


$$\dot z - i\lambda\omega z = \exp[i(\lambda + 1)\omega t]\; F\!\left(\exp(-i\lambda\omega t)\,z;\ \frac{\exp(i\omega t) - 1}{i\omega}\right)$$ [5]


And it is plain, on the basis of the arguments we just
gave, that this system is isochronous, a sufficient
condition for the complete periodicity with period
$\tilde T$, see [4], of its solutions being provided by the
inequality

$$\rho > \frac{2}{\omega}$$

which can clearly be satisfied by initial data situated
inside an open domain of such data, at least
provided $\omega$ is sufficiently large (actually, in all the
examples reported below no restriction on the value
of $\omega$ is required, namely such an open domain exists
for any arbitrary value of $\omega > 0$).
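
As a concrete illustration of the deformation, consider the simplest nonlinear choice F(ζ) = ζ², which is an assumption made here for the example and not a case taken from the text. With λ = 1 the explicit time dependence drops out of the deformed system, which becomes ż = iωz + z², and the undeformed solution ζ(τ) = ζ₀/(1 − ζ₀τ) is known in closed form. The Python sketch below (function names are hypothetical) integrates the deformed equation with a classical Runge-Kutta scheme and compares the result after one period T with the initial datum.

```python
import cmath
import math

def tau(t, omega):
    """Change of independent variable: (exp(i*omega*t) - 1) / (i*omega)."""
    return (cmath.exp(1j * omega * t) - 1.0) / (1j * omega)

def deformed_rhs(z, omega):
    """Right-hand side of z' = i*omega*z + z**2 (assumed example: F = zeta**2, lambda = 1)."""
    return 1j * omega * z + z * z

def integrate_rk4(z0, omega, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t = t_end."""
    h = t_end / steps
    z = z0
    for _ in range(steps):
        k1 = deformed_rhs(z, omega)
        k2 = deformed_rhs(z + 0.5 * h * k1, omega)
        k3 = deformed_rhs(z + 0.5 * h * k2, omega)
        k4 = deformed_rhs(z + h * k3, omega)
        z = z + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return z

def exact_solution(z0, omega, t):
    """z(t) = exp(i*omega*t) * zeta(tau(t)) with zeta(tau) = z0 / (1 - z0*tau)."""
    return cmath.exp(1j * omega * t) * z0 / (1.0 - z0 * tau(t, omega))

omega = 1.0
T = 2.0 * math.pi / omega
z0 = 0.2 + 0.1j   # |z0| < omega/2: the pole at tau = 1/z0 lies outside |tau| <= 2/omega

return_error = abs(integrate_rk4(z0, omega, T, 4000) - z0)
check_error = abs(integrate_rk4(z0, omega, 1.0, 1000) - exact_solution(z0, omega, 1.0))
print(return_error, check_error)
```

Both errors should be small compared with |z₀|. Because ζ(τ) is meromorphic in this example, the motion is periodic with period T for every initial datum whose pole does not fall on the circle C; the condition ρ > 2/ω is sufficient, not necessary.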


Examples 


In this subsection we report tersely several examples 
of isochronous dynamical systems; in each case we 
also provide the reference where more information 
can be found. Except when explicitly otherwise 
mentioned, these dynamical systems are to be 
considered in the complex context. 

The first example we report is a Hamiltonian
N-body problem which is a generalization of a well-known
integrable (indeed, superintegrable) system
(see Integrable Systems: Overview). It is characterized
by the (normal) Hamiltonian


$$H(z, p) = \frac{1}{2}\sum_{n=1}^{N}\left(p_n^2 + \omega^2 z_n^2\right) + \frac{1}{4}\sum_{m,n=1;\, m\neq n}^{N}\ \sum_{k=1}^{K}\frac{g_{nm}^{(k)}}{(z_n - z_m)^{2k}}$$ [6a]


and correspondingly by the Newtonian equations of
motion


$$\ddot z_n + \omega^2 z_n = \sum_{m=1,\, m\neq n}^{N}\ \sum_{k=1}^{K}\frac{k\, g_{nm}^{(k)}}{(z_n - z_m)^{1+2k}}$$ [6b]


Here the $\frac{1}{2}N(N-1)K$ “coupling constants” $g_{nm}^{(k)}$ are
arbitrary, except for the symmetry restriction
$g_{nm}^{(k)} = g_{mn}^{(k)}$ (see [6a]).
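The next example we report is a real N-body
problem in the horizontal plane, characterized by
the Newtonian equations of motion


$$\ddot{\vec r}_n = \omega\,\hat k \wedge \dot{\vec r}_n + 2\sum_{m=1,\, m\neq n}^{N}\left(\alpha_{nm} + \beta_{nm}\,\hat k\,\wedge\right)\frac{\dot{\vec r}_n\left(\dot{\vec r}_m\cdot\vec r_{nm}\right) + \dot{\vec r}_m\left(\dot{\vec r}_n\cdot\vec r_{nm}\right) - \vec r_{nm}\left(\dot{\vec r}_n\cdot\dot{\vec r}_m\right)}{r_{nm}^2}$$ [7]


Here $\vec r_n \equiv (x_n, y_n, 0)$ is a real two-vector in the
horizontal plane, $\hat k \equiv (0, 0, 1)$ is the unit vector
orthogonal to the horizontal plane, the symbol $\wedge$
denotes the (three-dimensional) vector product so
that $\hat k \wedge \vec r_n = (-y_n, x_n, 0)$, and we use the short-hand
notation $\vec r_{nm} \equiv \vec r_n - \vec r_m$ entailing $r_{nm}^2 = r_n^2 + r_m^2 - 2\,\vec r_n\cdot\vec r_m$.
Note that these equations are translation- and
rotation-invariant; and they are Hamiltonian,
although the corresponding Hamiltonian function
is not of normal type (kinetic plus potential
energy).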

The N(N − 1) “coupling constants” $\alpha_{nm}$ and $\beta_{nm}$
are of course real, but they are otherwise arbitrary
except for the symmetry restrictions $\alpha_{nm} = \alpha_{mn}$,
$\beta_{nm} = \beta_{mn}$, which are required in order that this
system be Hamiltonian. If all these coupling
constants vanish, this dynamical system has a
clear physical interpretation: it describes the
motion of N equal, electrically charged, point
particles, moving in the horizontal plane under
the effect of a magnetic field orthogonal to that
plane (in the approximation in which the electrostatic
interparticle interaction is neglected). In that
case each particle moves on a circle, the center and
radius of which depend on the initial data, while
the time taken to go round it is, in all cases, T, see
[1]. If the $\frac{1}{2}N(N-1)$ coupling constants $\beta_{nm}$
vanish, $\beta_{nm} = 0$, and the $\frac{1}{2}N(N-1)$ coupling constants
$\alpha_{nm}$ all equal unity, $\alpha_{nm} = 1$, the system is a
well-known integrable (indeed solvable) system;
and this is as well the case if the $\frac{1}{2}N(N-1)$
coupling constants $\beta_{nm}$ vanish, $\beta_{nm} = 0$, and the
$\frac{1}{2}N(N-1)$ coupling constants $\alpha_{nm}$ equal minus one
half, and only act among “nearest neighbors”,
$\alpha_{nm} = -\frac{1}{2}(\delta_{m,n+1} + \delta_{m,n-1})$ (see the entry Integrable
systems in this Encyclopedia).

Because of its many interesting features as well as
the neatness of its equations of motion (especially in
their complex version, see below) the honorary title
of “goldfish” has been attributed to this model,
characterized by the Newtonian equations of motion
in the plane [7]. A more detailed discussion of it, in
particular of its behavior for initial data outside of
the region yielding isochronous motions, is made in
the next section.

Several interesting classes of isochronous dynamical
systems are reported in Calogero F. (2004b).


We only mention here a remarkably general
example, characterized by the Newtonian equations
of motion


$$\ddot z + i\omega\,\dot z = \sum_{k=1}^{K} f^{(k)}(z, \dot z + i\omega z)$$


where $z \equiv (z_1, \ldots, z_N)$ is the N-vector whose complex
components $z_n \equiv z_n(t)$ are the dependent variables,
while the “forces” $f^{(k)}(z, \tilde z)$ are required to be
analytic in all their arguments and to satisfy the
scaling properties


$$f^{(k)}(a z, \tilde z) = a^{-k} f^{(k)}(z, \tilde z)$$

which however entail no restriction on the velocity-dependence
of these forces, namely on the dependence
of $f^{(k)}(z, \tilde z)$ on the (components of the)
second, $\tilde z$, of its two N-vector arguments.

The next example we report is characterized by
the Newtonian equations of motion


$$\ddot{\vec r}_n + i\omega\,\dot{\vec r}_n + 2\omega^2\,\vec r_n = \sum_{m=1,\, m\neq n}^{N} M_m\,\frac{\vec r_{nm}}{r_{nm}^3}$$


where we assume the N dependent variables $\vec r_n \equiv \vec r_n(t)$
to be three-vectors (although the property of
isochronicity of this deformed system would hold no
less if these were S-vectors, with S an arbitrary
positive integer) and we use the short-hand notation
$\vec r_{nm} \equiv \vec r_m - \vec r_n$. This system is (perhaps) remarkable
inasmuch as it represents a (complex) deformation
of the classical N-body gravitational problem, to
which it clearly reduces for $\omega = 0$.

The next example we report is characterized by
the following (first-order) equations of motion of
oscillator type:

$$\dot x_n - i p_n \omega\, x_n = f_n(x, y), \quad n = 1, \ldots, N$$

$$\dot y_m + i q_m \omega\, y_m = g_m(x, y), \quad m = 1, \ldots, M$$ [8]

Here the N-vector x, respectively the M-vector y,
have as components the N + M complex dependent
variables $x_n \equiv x_n(t)$, $y_m \equiv y_m(t)$; the N + M parameters
$p_n$, $q_m$ are all nonnegative integers (or they
could be nonnegative rational numbers); and the
N + M complex functions $f_n$, $g_m$ are restricted by
the following conditions (which are sufficient to
guarantee the isochronicity of this dynamical
system):

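(1) $f_n(x, y)$ and $g_m(x, y)$ are holomorphic at x = 0, y = 0;

(2) $\lim_{\varepsilon\to 0}[\varepsilon^{-1} f_n(\varepsilon x, \varepsilon y)] = 0$, $\lim_{\varepsilon\to 0}[\varepsilon^{-1} g_m(\varepsilon x, \varepsilon y)] = 0$;

(3) $f_n(x, y)$ and $g_m(x, y)$ are polynomial in the $y_m$;

(4a) $\lim_{\varepsilon\to 0}[\varepsilon^{-1-p_n} f_n(\varepsilon^{p} x, \varepsilon^{-q} y)]$ = nondivergent, n = 1, ..., N;

(4b) $\lim_{\varepsilon\to 0}[\varepsilon^{-1+q_m} g_m(\varepsilon^{p} x, \varepsilon^{-q} y)]$ = nondivergent, m = 1, ..., M.


In the conditions (4a) and (4b) the notation $\varepsilon^{p} x$ indicates
of course the N-vector of components $\varepsilon^{p_n} x_n$, and
likewise $\varepsilon^{-q} y$ is the M-vector of components $\varepsilon^{-q_m} y_m$.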
Note that this dynamical system, [8], includes the
Hamiltonian case characterized by the restrictions


$$N = M, \quad p_n = q_n, \quad f_n(x, y) = \frac{\partial V(x, y)}{\partial y_n}, \quad g_n(x, y) = -\frac{\partial V(x, y)}{\partial x_n}$$

which imply that the equations of motion [8] are
just the Hamiltonian equations entailed by the
Hamiltonian function


$$H(x, y) = i\omega\sum_{n=1}^{N} p_n x_n y_n + V(x, y)$$


isochronicity being now guaranteed by the following 
conditions on the function V(x, y): 


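(1) V(x, y) is holomorphic at x = 0, y = 0;

(2) $\lim_{\varepsilon\to 0}[\varepsilon^{-2} V(\varepsilon x, \varepsilon y)] = 0$;

(3) V(x, y) is polynomial in the $y_n$;

(4) $\lim_{\varepsilon\to 0}[\varepsilon^{-1} V(\varepsilon^{p} x, \varepsilon^{-p} y)]$ = nondivergent.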


The last two examples we report can be char- 
acterized as assemblies of non-linear harmonic 
oscillators, inasmuch as these two dynamical sys- 
tems (which are actually special cases of more 
general systems) have the remarkable property that 
their generic solutions (namely, all their solutions, 
except for a lower-dimensional set of singular 
solutions in which one or more of the “moving 
particles” shoot off to infinity at a finite time) are 
completely periodic with the fixed period T, see [1]. 
Their Newtonian equations of motion read 


T N M 
Sum — JiUZaa — LW £y. = € ) ) np (C i - 


pel ni 


| N M 
Zn d 3iuZ am -— 2u*z Sum — € ) » £ V (Zin: Zam) 


v=1 p=1 


These are two (different) systems of NM Newtonian
equations of motion satisfied by the NM complex
S-vectors $\vec z_{nm}$ (with S an arbitrary positive integer);
hence here the index n runs from 1 to N, and the
index m runs from 1 to M, with N and M two
arbitrary positive integers, while c is of course an
arbitrary complex constant (which might actually be
rescaled away). The dot sandwiched between two
S-vectors denotes the standard (Euclidean) scalar
product, entailing the rotation-invariant character,
in S-dimensional space, of these equations of
motion. Since these systems only feature linear and
cubic forces, these models are remarkably close to
physics; and they become even more applicable if
they are written in their real versions, which obtain in
an obvious manner by setting


T ii E Men T Passt = &a + ib 


In contrast to what we did for the previous examples, 
let us outline here the derivation of these results. 
Actually the two systems of Newtonian equations 
written above are merely special subcases, corres- 
ponding to appropriate parametrizations of a square 
matrix M (of appropriate rank) in terms of S-vectors, of 
the following nonlinear matrix evolution equation: 


$$\ddot M - 3i\omega\,\dot M - 2\omega^2 M = cM^3$$ [9]


Hence the findings reported above are merely special
cases of the more general result according to which
the generic solution of this nonlinear matrix evolution
equation, with $M \equiv M(t)$ a square matrix of
arbitrary rank, is periodic with period T, see [1]:


$$M(t + T) = M(t)$$


And this result is an immediate consequence, via the
following matrix version of the trick


$$M(t) = \exp(i\omega t)\,W(\tau), \quad \tau = \frac{\exp(i\omega t) - 1}{i\omega}$$ [10]

of a previous result due to V. I. Inozemtsev,
according to which the matrix evolution equation

$$W'' = cW^3$$

which clearly corresponds to [9] via [10], is
integrable and all its solutions $W(\tau)$ are meromorphic
functions of the independent variable $\tau$.
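
The correspondence between [9] and the equation $W'' = cW^3$ can be verified in a few lines. The following worked computation is a sketch added for the reader's convenience, using only the definitions in [10]; primes denote derivatives with respect to $\tau$ and dots derivatives with respect to t.

```latex
% From [10]: \dot\tau = \exp(i\omega t), and M = \exp(i\omega t) W(\tau).
\begin{align*}
\dot M  &= i\omega M + \exp(2i\omega t)\, W'(\tau), \\
\ddot M &= i\omega \dot M + 2i\omega \exp(2i\omega t)\, W'(\tau)
           + \exp(3i\omega t)\, W''(\tau) \\
        &= i\omega \dot M + 2i\omega\,(\dot M - i\omega M)
           + \exp(3i\omega t)\, W''(\tau) \\
        &= 3i\omega \dot M + 2\omega^2 M + \exp(3i\omega t)\, W''(\tau).
\end{align*}
% Hence, using W'' = c W^3 and M^3 = \exp(3i\omega t) W^3,
\[
\ddot M - 3i\omega \dot M - 2\omega^2 M
  = \exp(3i\omega t)\, c\, W^3 = c\, M^3 .
\]
```

Since $\tau(t)$ and $\exp(i\omega t)$ are both periodic in t with period T, and a meromorphic $W(\tau)$ is single-valued, M(t) returns to its initial value after one period unless a pole of $W(\tau)$ happens to lie exactly on the circle travelled by $\tau$.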


The Transition to Deterministic Chaos 


In this section we illustrate, using the real N-body 
problem in the plane characterized by the New- 
tonian equations of motion [7], the behavior of an 
isochronous system of the kind described above 
when the initial data fall outside of the region 
yielding isochronous motions. 

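To do this it is convenient to use the complex
version of the equations of motion [7], which obtains
from [7] by setting


$$z_n \equiv x_n + i y_n, \quad \vec r_n \equiv (x_n, y_n, 0), \quad \hat k \equiv (0, 0, 1), \quad a_{nm} \equiv \alpha_{nm} + i\beta_{nm}$$ [11]


and reads as follows:


$$\ddot z_n = i\omega\,\dot z_n + 2\sum_{m=1,\, m\neq n}^{N} a_{nm}\,\frac{\dot z_n\,\dot z_m}{z_n - z_m}$$ [12]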


The main tool of our analysis is the (particularly
simple) version of the trick appropriate to this
model,


$$z_n(t) = \zeta_n(\tau), \quad \tau = \frac{\exp(i\omega t) - 1}{i\omega}$$ [13a]

entailing

$$z_n(0) = \zeta_n(0), \quad \dot z_n(0) = \zeta_n'(0)$$ [13b]


that relates our equations of motion [12] to the
equations of motion


$$\zeta_n'' = 2\sum_{m=1,\, m\neq n}^{N} a_{nm}\,\frac{\zeta_n'\,\zeta_m'}{\zeta_n - \zeta_m}$$ [14]

These equations of motion, together with the initial
data $\zeta_n(0)$, $\zeta_n'(0)$ (see [13b]), define the solutions $\zeta_n \equiv
\zeta_n(\tau)$ in the complex $\tau$-plane. The “physical”
evolution of the points $z_n \equiv z_n(t)$ as functions of
the real time variable t is then given by the evolution
of the corresponding coordinates $\zeta_n(\tau)$, see [13a], as
the complex variable $\tau$ travels round and round on
the circle C in the complex $\tau$-plane, the diameter of
which, of length $2/\omega$, has one extreme at the origin
$\tau = 0$ and the other on the positive imaginary axis at
$\tau = 2i/\omega$. It is therefore clear that the behavior of
$z_n(t)$ as a function of the real, “physical time”
variable t depends on the analytic structure of $\zeta_n(\tau)$
as function of the complex variable $\tau$, in particular
on the singularities, if any, of this function $\zeta_n(\tau)$ that
fall in the disk D encircled by the circle C in the
complex $\tau$-plane.

Let us tersely review the relevant analysis. We
recall first of all that (it can be proven that) there
exists in phase space an open region of initial data
$z_n(0)$, $\dot z_n(0)$, characterized by large values of the
moduli $|z_n(0) - z_m(0)|$ of the initial interparticle
distances and by small values of the moduli of the
initial particle velocities $|\dot z_n(0)|$ (see [14] and [13b]),
that guarantees (all components $\zeta_n(\tau)$ of) the
corresponding solution $\zeta(\tau)$ of [14] to be holomorphic
in (a disk of radius $\rho$ centered at the origin
$\tau = 0$ of the complex $\tau$-plane that includes) the circle
C, hence the corresponding solution z(t) to be
completely periodic with period T, see [13a] and
[1]. This result guarantees the isochronous character
of this model, [12], for any arbitrarily given assignment
of the coupling constants $a_{nm}$.
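
The isochronous regime just described (large initial separations, small initial velocities) can be explored numerically. The sketch below is a minimal Python illustration under stated assumptions: it takes the complex equations of motion [12] in the form $\ddot z_n = i\omega \dot z_n + 2\sum_{m\neq n} a_{nm}\dot z_n \dot z_m/(z_n - z_m)$, and the number of particles, the coupling value 0.5, the initial data and all function names are choices made here for illustration only.

```python
import math

def accelerations(z, v, coupling, omega):
    """Right-hand side of the complex goldfish equations for positions z, velocities v."""
    n = len(z)
    acc = []
    for i in range(n):
        pair_sum = 0j
        for m in range(n):
            if m != i:
                pair_sum += coupling[i][m] * v[i] * v[m] / (z[i] - z[m])
        acc.append(1j * omega * v[i] + 2.0 * pair_sum)
    return acc

def shifted(base, slope, factor):
    """Return base + factor * slope, componentwise."""
    return [b + factor * s for b, s in zip(base, slope)]

def rk4_step(z, v, coupling, omega, h):
    """One classical Runge-Kutta step for the first-order system (z, v)."""
    k1z, k1v = v, accelerations(z, v, coupling, omega)
    z2, v2 = shifted(z, k1z, h / 2), shifted(v, k1v, h / 2)
    k2z, k2v = v2, accelerations(z2, v2, coupling, omega)
    z3, v3 = shifted(z, k2z, h / 2), shifted(v, k2v, h / 2)
    k3z, k3v = v3, accelerations(z3, v3, coupling, omega)
    z4, v4 = shifted(z, k3z, h), shifted(v, k3v, h)
    k4z, k4v = v4, accelerations(z4, v4, coupling, omega)
    z_new = [a + h * (b + 2 * c + 2 * d + e) / 6
             for a, b, c, d, e in zip(z, k1z, k2z, k3z, k4z)]
    v_new = [a + h * (b + 2 * c + 2 * d + e) / 6
             for a, b, c, d, e in zip(v, k1v, k2v, k3v, k4v)]
    return z_new, v_new

def evolve(z, v, coupling, omega, t_end, steps):
    """Integrate from t = 0 to t = t_end with a fixed number of steps."""
    h = t_end / steps
    for _ in range(steps):
        z, v = rk4_step(z, v, coupling, omega, h)
    return z, v

omega = 1.0
T = 2.0 * math.pi / omega
# Illustrative data: separations of order 3, velocities of order 0.05.
z_start = [2.0 + 0.0j, -1.0 + 1.7j, -1.0 - 1.7j]
v_start = [0.05j, -0.04 + 0.0j, 0.03 + 0.02j]
coupling = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]

z_half, _ = evolve(z_start, v_start, coupling, omega, T / 2, 2000)
z_end, v_end = evolve(z_start, v_start, coupling, omega, T, 4000)
half_period_displacement = max(abs(a - b) for a, b in zip(z_half, z_start))
return_error = max(abs(a - b) for a, b in zip(z_end, z_start))
print(half_period_displacement, return_error)
```

The particles move appreciably during the period (the displacement at half a period is of order 0.1 for these data) yet return to their starting points after the time T, up to integration error. Moving the particles closer together or increasing the velocities eventually brings a branch point inside the circle C, and the return after a single period is then no longer guaranteed.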


Next, let us restrict, for simplicity, our consideration
to models [12] in which the coupling constants
$a_{nm}$ are real and nonnegative,


$$a_{nm} \ge 0$$ [15]


Then the singularities of the generic solution $\zeta(\tau)$ of
[14], which occur at values $\tau_b$ of $\tau$ where two
coordinates $\zeta_n(\tau)$ coincide, say $\zeta_\nu(\tau_b) = \zeta_\mu(\tau_b) = b$
(see the right-hand side of [14]), are branch points
characterized by the exponent, say,

$$\gamma \equiv \gamma_{\nu\mu} = \frac{1}{1 + a_{\nu\mu}}$$ [16]


so that in their neighborhood, namely for $\tau \approx \tau_b$,


$$\zeta_s(\tau) = b \pm c\,(\tau - \tau_b)^{\gamma} + v\,(\tau - \tau_b) + \sum_{k=1}^{\infty}\ \sum_{\ell, m = 0;\ \ell + m \ge 1}^{\infty} \alpha^{(s)}_{k\ell m}\,(\tau - \tau_b)^{k + \ell\gamma + m(1-\gamma)}, \quad s = \nu, \mu$$ [17a]

$$\zeta_n(\tau) = b_n + v_n\,(\tau - \tau_b) + \sum_{k=1}^{\infty}\ \sum_{\ell = \delta_{k1},\ m = 0}^{\infty} \alpha^{(n)}_{k\ell m}\,(\tau - \tau_b)^{k + \ell\gamma + m(1-\gamma)}, \quad n \neq \nu, \mu$$ [17b]


The ± sign in front of c in the right-hand side of the
first, [17a], of these formulas indicates that one sign
must be chosen for s = μ, the opposite for s = ν.
Note that here the 4 + 2(N − 2) = 2N constants
$\tau_b, b, c, v, b_n, v_n$ are a priori arbitrary (except for
the obvious restrictions $b_n \neq b$, $b_n \neq b_m$), while the
coefficients $\alpha^{(s)}_{k\ell m}$, $\alpha^{(n)}_{k\ell m}$ can be computed from these
constants, recursively, by inserting this ansatz, [17],
in the equations of motion [14]. The fact that the
number, 2N, of a priori undetermined
constants equals the number of arbitrary initial data
for this system of ODEs, [14], indicates that this
kind of branch points, characterized by the exponents
$\gamma_{nm}$, see [16], is the typical singularity featured
by the generic solution $\zeta(\tau)$ of [14]. (Branch points
with different exponents may appear, but only in
nongeneric solutions $\zeta(\tau)$ which, at some value $\tau_b$ of
$\tau$, feature the coincidence of more than two
components, say $\zeta_\nu(\tau_b) = \zeta_\mu(\tau_b) = \zeta_n(\tau_b)$.)

We conclude therefore that the generic solution $\zeta(\tau)$
of [14] features a, generally infinite, number of branch
points, that generally affect each of its components
$\zeta_n(\tau)$, and which are characterized, for the class of
models to which we are restricting attention here (see
[15]), by (real) exponents $\gamma_{nm}$, see [16], which are then
clearly characterized by the inequalities


$$0 < \gamma_{nm} \le 1$$


What does this tell us about the generic solution z(t)
of the equations of motion of primary interest to
us, [12], in particular about its evolution as function
of the real “time” variable t?

To the solution $\zeta(\tau)$ is associated a Riemann
surface the structure of which is determined by the
character and distribution of the branch points of
$\zeta(\tau)$ in the complex $\tau$-plane (each of which is
generally featured by each component $\zeta_n(\tau)$ of $\zeta(\tau)$,
although generally not in the same way: see [17]),
and we know from [13a] that the values taken by
z(t) as t evolves from t = 0 towards t = ∞ coincide
with the values taken by $\zeta(\tau)$ as the independent
variable $\tau$ travels, on that Riemann surface associated
with $\zeta(\tau)$, counterclockwise round and round
on the circle C defined above (the diameter of which
lies on the imaginary axis in the complex $\tau$-plane,
with one end at $\tau = 0$ and the other at $\tau = 2i/\omega$),
taking a period T, see [1], to make each full
round. Hence the behavior of the solution z(t) of
[12] depends on the structure of the Riemann
surface associated with the corresponding solution
$\zeta(\tau)$ of [14], and specifically on the number of
different sheets of that surface that are visited as one
travels on it before returning, if ever, to the main
sheet from which the travel started at t = τ = 0.

If no other sheet is visited besides the main one, 
the corresponding solution z(t) is of course periodic 
with period T, see [1] and [13a], 


z(t + T) = z(t) [18] 


This happens provided no branch point is featured
by $\zeta(\tau)$ on its main sheet inside the circle C; and, as
already indicated above, it has been proven (even in
the more general case with arbitrary coupling
constants $a_{nm}$) that there is an open region having
full dimensionality in the phase space of initial data,
see [13b], that yields such an outcome, implying the
isochronicity of the model characterized by the
Newtonian equations of motion [12]. This region
R of initial data has a boundary (a lower-dimensional
domain in the phase space of initial
data) out of which emerge motions leading, at a
time $t_c$ smaller than T, to a “particle collision”, say
$z_\nu(t_c) = z_\mu(t_c)$.

The character of the solution z(t) yielded by initial
data outside of the region R depends on the
structure of the Riemann surface associated with
the corresponding solution $\zeta(\tau)$. This is mainly
determined by the values of the branch point
exponents $\gamma_{nm}$, which are themselves determined by
the values of the coupling constants $a_{nm}$, see [17]
and [16]. Let us focus on the (more interesting) case
in which these constants $a_{nm}$ are rational numbers,
entailing that the exponents $\gamma_{nm}$ determining the
character of the branch points are as well rational,
see [16], so that each of the cuts associated with
them opens the way, in the Riemann surface, to a
finite number of sheets. There are then two
possibilities, each generally characterized by open
regions of initial data having full dimensionality in
phase space, the boundaries of which always are
(lower-dimensional) domains out of which emerge
motions leading, in a time $t_c$ smaller than T, to a
“particle collision”.
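
The link between rational coupling constants and a finite number of sheets can be made concrete with exact rational arithmetic. The sketch below is illustrative only. It assumes that, for real coupling constants, the exponent in [16] reads $\gamma_{nm} = 1/(1 + a_{nm})$, and it uses the elementary fact that a branch point of the type $(\tau - \tau_b)^{p/q}$, with p and q coprime, connects q sheets. The function names are hypothetical.

```python
from fractions import Fraction

def branch_exponent(a):
    """Assumed form of [16] for a real coupling constant a >= 0: gamma = 1 / (1 + a)."""
    return 1 / (1 + Fraction(a))

def sheets_opened(gamma):
    """Number of sheets connected by a branch point of rational exponent gamma = p/q."""
    return Fraction(gamma).denominator

# Illustrative rational couplings, including a = 0 (no interaction, no branching).
examples = {}
for a in (Fraction(1), Fraction(1, 2), Fraction(2, 3), Fraction(0)):
    gamma = branch_exponent(a)
    examples[a] = (gamma, sheets_opened(gamma))
    print(f"a = {a}: gamma = {gamma}, sheets = {sheets_opened(gamma)}")
```

For a = 1 this gives a square-root branch point (two sheets), while a = 1/2 and a = 2/3 give three and five sheets respectively. An irrational coupling constant would give an irrational exponent and hence infinitely many sheets at a single branch point, which is why the rational case is singled out in the text.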

One possibility is that the number B of sheets
visited before returning to the main sheet be finite,
B < ∞; the corresponding solutions z(t) are then
completely periodic with period $\tilde T = (B + 1)T$,
$z(t + \tilde T) = z(t)$.

Another possibility is that the number of new
sheets visited be unlimited, namely that the structure
of the Riemann surface be such that, by traveling
round and round on it along the circle C, one never
returns to the main sheet. This can happen,
even if the exponents $\gamma_{nm}$ are all rational so that via
the cuts associated with each of them access is gained
to only a finite number of new sheets, because of the
possibility that an infinity of branch points be
located inside the circle C on the infinite sheets
associated with these branch points, via a never-ending
mechanism of branch-point nesting. Whenever this
happens the corresponding solution z(t) is aperiodic;
and it is moreover likely that it then be chaotic, in
the sense of displaying a sensitive dependence on its
initial data. Indeed this will happen whenever some
of this infinity of branch points fall
arbitrarily close to the contour C, because then a
minute change in the initial data, to which there will
correspond a minute change in the pattern of these
branch points of $\zeta(\tau)$ in the complex $\tau$-plane, will
cause some relevant branch point to cross over from
outside the circle C to inside it, or vice versa, and this
will eventually affect quite significantly the time
evolution of z(t), by causing a change in the
sequence of sheets that get visited by traveling
along the circle C on the Riemann surface associated
with the corresponding $\zeta(\tau)$.

This phenomenology has a clear “physical inter- 
pretation”, which can be qualitatively understood 
as follows. The N-body problem characterized by 
the Newtonian equations of motion [12] generally 
yields confined motions, the trajectory of each 
particle tending to wind round and round - it 
would indeed reduce to a circle were it not for the 
interaction with the other particles. A possibility, as 
we know, is that this N-body motion be completely 
periodic, with the same period T that characterizes 
the circular motion of each particle when the two- 
body interparticle interaction is altogether missing 


(455, — 0). Another possibility, in the case discussed 
above with rational coupling constants, is that there 
exist other motions which are as well completely 
periodic, but with periods which are integer multi- 
ples of T. A third possibility, which cannot a priori 
be excluded, is that there also exist motions which 
are aperiodic but in some way overall ordered, 
perhaps featuring trajectories that eventually wind 
up around limit cycles. And still another possibility 
is that the motions described by the solution z(t) be 
aperiodic and disordered. In this case the physical 
mechanism causing a sensitive dependence on the 
initial data can be understood as follows. Such 
disordered motions necessarily feature near misses, 
in which, typically, two particles pass quite close to 
each other (while the probability that an actual
collision occur among point particles moving in a
plane is of course a priori nil). Such a near miss in
the motion described by z(t) corresponds (see the
discussion above) to a branch point of the
corresponding solution $\zeta(\tau)$ occurring quite close
to the circle C in the complex $\tau$-plane (which is the
one-dimensional region of the two-dimensional
complex $\tau$-plane in which the values of $\zeta(\tau)$
correspond to the values z(t) describing the motion
of physical particles moving as functions of the
time t); and in the generic case of a two-body near
miss, there is a correspondence between the fact 
that such a branch point occur just inside, or just 
outside, the circle C, and the way the particles pass, 
on one or the other side, by each other. Likewise, 
the tiny change in the initial data that causes, in the 
context of the solutions $\zeta(\tau)$ (see the discussion
above) a branch point of $\zeta(\tau)$ to pass from inside
to outside the circle C, or vice versa, corresponds, in
the context of the “physical” solutions z(t), to a 
change occurring in the corresponding near miss, 
from the case in which the two particles involved in 
it slide by each other on one side to the case in 
which they instead slide by each other on the other 
side — entailing a significant change in the sub- 
sequent motion (indeed, the closer a near miss, the 
more it affects the motion, due to the singularity 
of the two-body interaction at zero separation, 
see [12]). 

The phenomenology outlined here does indeed 
occur in this goldfish model. It also occurs (rather
similarly if more simply, since in this case only
square-root branch points occur, irrespective of the
values of the coupling constants) in the model [6]
with K = 1. Indeed, it is clear that this phenomenology
provides a paradigm of rather general applicability
for the transition from isochronicity to
deterministic chaos, indeed perhaps for the generic 
onset of deterministic chaos. 


See also: Bifurcations of Periodic Orbits; 
Calogero-Moser-Sutherland Systems of Nonrelativistic 
and Relativistic Type; Integrable Systems: Overview; 
Quantum Calogero-Moser Systems; Synchronization of 
Chaos. 
Introduction


In this article we consider families of linear 
differential equations whose monodromy data do 
not depend on the parameters. Such families are 
called isomonodromic deformations of any of the 
equations of the family (for the definitions of a 
regular and Fuchsian linear system and of 
their monodromy groups, see Riemann—Hilbert 
Problem). 


Schlesinger’s Equation 


The best-studied example of an isomonodromic
deformation is the Fuchsian system on Riemann's
sphere $\mathbb{CP}^1 = \mathbb{C} \cup \{\infty\}$ considered by L Schlesinger:


$$\frac{dX}{dt} = \left( \sum_{j=1}^{p+1} \frac{A_j}{t - a_j} \right) X \qquad [1]$$


Here the poles $a_j \in \mathbb{C}$ are free parameters and the
matrices-residua $A_j$ depend analytically on
$a := (a_1, \ldots, a_{p+1})$; therefore, system [1] is in fact a
family of linear systems which is an analytic
deformation of the system obtained for $a_j = a_j^0$.
One can think of system [1] as defined by the
Pfaffian system


$$dX = \omega_s X, \qquad \omega_s = \sum_{j=1}^{p+1} \frac{A_j(a)}{t - a_j}\, d(t - a_j) \qquad [2]$$


Suppose first that the poles $a_j$ vary within
small nonintersecting disks centered at the points $a_j^0$, so
small that the standard system of generators of
the monodromy group can be defined by one
and the same contours for all values of the
parameters $a_j$ (see Figure 1 from Riemann-Hilbert
Problem). Suppose also that one chooses $\infty$ as
base point and that one has

$$X|_{t=\infty} = I \qquad [3]$$

(where I is the identity matrix) for all values of the
parameters $a_j$. Finally, suppose that all matrices $A_j$
are nonresonant, that is, without two eigenvalues
differing by a nonzero integer. Then the following
conditions are necessary and sufficient for system [1]
to be isomonodromic:


$$dA_i(a) = -\sum_{j=1,\ j \neq i}^{p+1} \frac{[A_i(a), A_j(a)]}{a_i - a_j}\, d(a_i - a_j) \qquad [4]$$

This system (called Schlesinger's equations) results
from the Frobenius integrability condition
$d\omega_s = \omega_s \wedge \omega_s$ of system [2].
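Two structural consequences of equations of this commutator type can be checked by hand or by machine: the sum of all residues stays constant along the flow, and the traces of $A_i$ and $A_i^2$ do not change. The following is a minimal sketch, assuming the sign convention $dA_i = -\sum_{j \neq i} [A_i, A_j]\, d(a_i - a_j)/(a_i - a_j)$; the helper names and the sample matrices are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[r][c] - YX[r][c] for c in range(n)] for r in range(n)]

def trace(X):
    return sum(X[r][r] for r in range(len(X)))

def schlesinger_partial(A, a, i, k):
    """Coefficient of da_k in dA_i, assuming the convention
    dA_i = -sum_{j != i} [A_i, A_j] / (a_i - a_j) d(a_i - a_j)."""
    n = len(A[i])
    if k != i:
        C = commutator(A[i], A[k])
        return [[C[r][c] / (a[i] - a[k]) for c in range(n)] for r in range(n)]
    out = [[Fraction(0)] * n for _ in range(n)]
    for j in range(len(A)):
        if j != i:
            C = commutator(A[i], A[j])
            for r in range(n):
                for c in range(n):
                    out[r][c] -= C[r][c] / (a[i] - a[j])
    return out

# Hypothetical sample data: three 2 x 2 residues and three distinct poles.
A = [[[Fraction(v) for v in row] for row in M] for M in (
    [[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [1, -1]])]
a = [Fraction(0), Fraction(1), Fraction(3)]

def sum_of_partials(k):
    """Derivative in a_k of A_1 + ... + A_{p+1} along the flow."""
    parts = [schlesinger_partial(A, a, i, k) for i in range(len(A))]
    n = len(A[0])
    return [[sum(P[r][c] for P in parts) for c in range(n)] for r in range(n)]
```

Exact rational arithmetic is used so that the vanishing of `sum_of_partials(k)` and of the traces can be tested with equality rather than with a numerical tolerance.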
Remarks 1


(i) To find the matrices-residua $A_j$ as functions of a,
given their values $A_j|_{a=a^0}$, is a Cauchy
problem. It is solvable for a close to $a^0$ and
the matrices $A_j$ are analytic in a.

(ii) The differential of $A_j$ being a commutator
$[A_j, \cdot\,]$, the matrix $A_j$ remains within its
conjugacy class throughout the deformation.

(iii) Schlesinger's equations are the necessary and
sufficient conditions for isomonodromy also in
the case when system [1] has a logarithmic
pole at $\infty$ whose matrix-residuum does not
change throughout the deformation. In this
case the solution to system [1] in its Levelt's
decomposition at $\infty$ (see Riemann-Hilbert
Problem) equals $U_\infty(1/t)\, t^{-D_\infty} t^{-E_\infty}$, where
$D_\infty$ is a diagonal matrix with integer entries,
$E_\infty$ is an upper-triangular constant matrix, and
$U_\infty$ is holomorphic at $\infty$ and such that


$$U_\infty|_{t=\infty} = I$$


Definition 2 The deformation satisfying condition 
[4] with initial condition [3] for the solution to 
system [1] is called the normalized Schlesinger 
deformation. 


Remark 3 When the matrices-residua $A_j$ are
nonresonant, then every isomonodromic deformation
of system [1] with $a_j = a_j^0$ is either the normalized
Schlesinger deformation or is a nonnormalized
Schlesinger deformation, that is, obtained from
the normalized one by a change of variables
$X \mapsto C(a)X$, $C(a) \in GL(n, \mathbb{C})$. In this way, one has
$X|_{t=\infty} = C(a)$ instead of [3] and the deformation is
described by a Pfaffian system with a form of the
kind $\omega = \omega_s + \sum_j \gamma_j(a)\, da_j$.


Example 4 The following one-parameter Fuchsian 
family is an isomonodromic Schlesinger deformation: 


ax — [EX A 
a> (Sta) 


1 


Here the matrices $A_j$ are constant and the parameter
b takes nonzero values. Indeed, one either checks
directly that there holds condition [4] or one makes
the change of time (which does not change
monodromy) $t \mapsto bt$ after which the parameter
b disappears.


A A Bolibrukh has shown that in the resonant
case every isomonodromic deformation of a Fuchsian
system is described by an integrable Pfaffian
system with 1-form $\omega = \omega_s + \omega_m$, where the
meromorphic 1-form $\omega_m$ vanishes at $\infty$ and has poles of
orders $\le r_j$ along the hyperplanes $\{t - a_j = 0\}$; here $r_j$
is the largest nonzero integer difference between two
eigenvalues of the matrix $A_j$.

Consider now Schlesinger's equation in the global
situation, that is, when the poles $a_j$ belong to the
universal covering Z of the space $\mathbb{C}^{p+1} \setminus \Delta$, where $\Delta$ is
the "diagonal," that is, the union of all sets
$\{a_i = a_j\}$, $i \neq j$. Suppose that the matrices $A_j$ are
nonresonant. There are values of a (their set is
denoted by $\Theta$) for which some entries of some of the
matrices-residua $A_j$ tend to $\infty$. Typically, at such
points the matrices $A_j$ have poles of second order;
this is a result due to Bolibrukh. Indeed, set
$A_j = Q_j^{-1} J_j Q_j$, where $J_j$ is the Jordan normal form
of $A_j$; hence, this is a constant matrix; we assume
that $Q_j \in SL(n, \mathbb{C})$. Typically, at points of $\Theta$ the
matrices $Q_j$ and $Q_j^{-1}$ have simple poles, which
makes a pole of second order for $A_j$.

B Malgrange and, independently, T Miwa have 
proved that system [4] is completely integrable and 
that it has the Painlevé property: "The only
movable singularities of its solutions are poles."
(The fixed singularities of the solutions are, by
definition, along the points of Z which are over $\Delta$.
The positions of the movable singularities depend
on the initial condition, that is, on the values of the
matrices $A_j$ for $a = a^0$.) In other words, the
solutions to Schlesinger's equation are matrices
meromorphic on Z.


Theorem 5 The set $\Theta$ of movable singular points
of the Schlesinger equation is the set of zeros of a
function $\tau$ (the Miwa $\tau$-function) holomorphic on Z
and such that


$$\frac{1}{2} \sum_{i \neq j} \frac{\operatorname{tr}(A_i(a) A_j(a))}{a_i - a_j}\, d(a_i - a_j) = d \log \tau(a)$$


Some improvements of this result are due to 
Malgrange and Bolibrukh. 


Isomonodromy and Confluence 


The idea to consider a linear system of ordinary 
differential equations with a pole of order higher than 
1 as embedded into a family of Fuchsian systems with 
confluence of the poles has been proposed by V I 
Arnol'd in 1984 and independently by J-P Ramis in 
1988. The idea has been used by A Duval, B Khesin, 
A A Glutsyuk, and other authors. In particular, it is 
interesting to relate the Stokes multipliers (defined in 
the next section) of the system obtained as a result of 
a confluence to the monodromy groups of the 


Fuchsian systems obtained for values of the para- 
meters before the confluence occurs. 


Example 6 Consider the one-parameter family of
linear systems:


$$(t^2 - \lambda)\, \frac{dX}{dt} = (A(\lambda) t + B(\lambda)) X \qquad [5]$$


Here the matrices A, B, and X are $n \times n$.
Suppose that $t \in \mathbb{C}$ (i.e., we do not consider the
singularity at $\infty$), $\lambda \in (\mathbb{C}, 0)$. Then for $\lambda \neq 0$ the
system is Fuchsian: it has two logarithmic poles at
$\pm\lambda^{1/2}$ whose confluence for $\lambda \to 0$ gives as a result a
pole at 0 which might be of order 2 if $B(0) \neq 0$ or 1
if $B(0) = 0$.


In this section we consider only the situation 
when the family producing the confluence is 
isomonodromic for values of the parameters before 
the confluence. 


Example 7 This is the case of family [5] with $B \equiv 0$
and A being a constant nonresonant $n \times n$ matrix.
Indeed, the change of time $t \mapsto \lambda^{1/2} t$ $(*)$ transforms the
family into the family $(t^2 - 1)\, dX/dt = tAX$
(independent of $\lambda$) which is a Fuchsian system (at $\infty$
as well).


Suppose now that $t \in \mathbb{CP}^1$ (i.e., we consider the
singularity at $\infty$ as well). Hence, the monodromy
operator $M_\infty$ around $\infty$ is independent of $\lambda$ up to
conjugacy (it is conjugate to $\exp(-2\pi i A)$). On the
other hand, consider the monodromy operator $M'$
defined by a contour circumventing counterclockwise
both poles at $\pm\lambda^{1/2}$ (one can choose as such
a contour a circumference centered at the origin
and of sufficiently large radius). It equals $M_\infty^{-1}$,
and it is well defined for $\lambda = 0$ as well. (This is
not the case of the monodromy operators defined
by contours circumventing only one of the poles
at $\pm\lambda^{1/2}$.) Hence, up to conjugacy $M'$ is independent
of $\lambda$. As $M'$ is in a sense the only
monodromy operator that can be defined by
a contour depending continuously on $\lambda$ for all
$\lambda \in (\mathbb{C}, 0)$ and not passing through a pole of the
system, one can say that the family is strongly
isomonodromic.


Example 8 Consider now family [5] with $n = 2$,


a=(o 2) "-(5 0) 


where $d \in \mathbb{C}$. For $\lambda \neq 0$ the family is isomonodromic:
the change of time $(*)$ followed by the change of


variables 
1/2 
Xe ir 1)xen 
brings the family to the form


eo (8 )e(8 2) 


which is independent of $\lambda$, hence, isomonodromic.
However, the change of variables $(**)$ is not defined
for $\lambda = 0$. The monodromy operator $M'$ (defined as
above) is scalar for $\lambda = 0$ and conjugate to a Jordan
block of size 2 for $\lambda \neq 0$. Hence, the family is not
strongly isomonodromic.


The following example is closely connected 
to singularity theory. It has been suggested by 
F Pham. 


Example 9 Consider the Abelian integrals

$$I_1 = \int_\gamma \frac{dx}{x^3 + sx + t} \quad \text{and} \quad I_2 = \int_\gamma \frac{x\, dx}{x^3 + sx + t}$$

taken over a closed contour $\gamma$ belonging to a
nonsingular fiber of the function $f(x) = x^3 + sx + t$.
Suppose that $x^3 + sx + t \neq 0$ on $\gamma$. Obviously, $I_1$
and $I_2$ depend only on $[\gamma]$, the class of homotopy
equivalence of $\gamma$. Set


$$x^3 + sx + t = (x - x_1)(x - x_2)(x - x_3), \qquad x_j = x_j(s, t)$$


Then one has

$$I_k = 2\pi i \sum_{j=1}^{3} \delta_j \frac{x_j^{k-1}}{3x_j^2 + s}, \qquad k = 1, 2$$


where the integers $\delta_j$ depend only on $[\gamma]$ (the
contour $\gamma$ is homotopy equivalent to a linear
combination with integer coefficients of small
loops around the roots of f; the integral along such
a loop is computed using residua). Note that


$$\dot{x}_j := dx_j/dt = -1/(3x_j^2 + s)$$


An easy computation shows that the integrals $I_1$, $I_2$
satisfy the following Picard-Fuchs system of differential
equations:


$$-t \dot{I}_1 - 2s \dot{I}_2/3 = 2 I_1/3$$
$$2s^2 \dot{I}_1/9 - t \dot{I}_2 = I_2/3$$


The system admits also a presentation of the form 


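$$\left( t^2 + \frac{4s^3}{27} \right) \frac{d}{dt} \begin{pmatrix} I_1 \\ I_2 \end{pmatrix} = \begin{pmatrix} -2t/3 & 2s/9 \\ -4s^2/27 & -t/3 \end{pmatrix} \begin{pmatrix} I_1 \\ I_2 \end{pmatrix}$$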


Here the unknown variables form a vector column
of length 2; to obtain a $2 \times 2$ matrix, one has to
choose another contour (linearly independent
of $\gamma$ as a linear combination of the loops around
the roots $x_j$) which gives the second column of the
matrix. The system is strongly isomonodromic: its
matrix-residuum at $\infty$ equals diag(2/3, 1/3); hence,
the monodromy operator $M'$ up to conjugacy equals
$\operatorname{diag}(\exp(-4\pi i/3), \exp(-2\pi i/3))$.
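The Picard-Fuchs system of Example 9 can be verified loop by loop. For a small loop around a single root $x_j$ the residue formula gives $I_1 = 2\pi i/(3x_j^2 + s)$ and $I_2 = 2\pi i\, x_j/(3x_j^2 + s)$, and their derivatives in t follow from $dx_j/dt = -1/(3x_j^2 + s)$. The sketch below drops the common factor $2\pi i$ and checks both equations exactly; the function names and the sample cubic with roots 1, 2, -3 (so that s = -7, t = 6) are assumptions made for illustration.

```python
from fractions import Fraction

def loop_integrals(x, s):
    """Return (I1, I2, dI1/dt, dI2/dt) for a small loop around the root x,
    with the common factor 2*pi*i dropped."""
    D = 3 * x * x + s                     # f'(x) at the root
    I1 = 1 / D
    I2 = x / D
    dI1 = 6 * x / D**3                    # chain rule with dx/dt = -1/D
    dI2 = -1 / D**2 + 6 * x * x / D**3
    return I1, I2, dI1, dI2

def picard_fuchs_residuals(x, s, t):
    """Left side minus right side of the two Picard-Fuchs equations."""
    I1, I2, dI1, dI2 = loop_integrals(x, s)
    first = -t * dI1 - Fraction(2, 3) * s * dI2 - Fraction(2, 3) * I1
    second = Fraction(2, 9) * s * s * dI1 - t * dI2 - Fraction(1, 3) * I2
    return first, second

# Sample cubic x^3 + s x + t with roots summing to zero (no x^2 term).
roots = [Fraction(1), Fraction(2), Fraction(-3)]
s = roots[0] * roots[1] + roots[0] * roots[2] + roots[1] * roots[2]
t = -roots[0] * roots[1] * roots[2]
```

Since the system is linear, vanishing of the residuals on each small loop implies the system for every integer combination of loops.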


A A Bolibrukh has considered the possibility of 
confluence of poles in Schlesinger's equation 
(i.e., the possibility to have equalities of the form
$a_i = a_j$ in system [1]). He has considered the so-called
normalized isomonodromic confluences, that
is, isomonodromic confluences defined by Pfaffian
systems with coefficient forms $\omega = \omega_s + \omega_m$ alone
(see the previous section). He has shown that
a normalized isomonodromic confluence of singular 
points of Fuchsian systems of linear differential 
equations on Riemann’s sphere can only lead to 
a system with regular singular points. This is a 
partial answer to a problem stated by V I Arnol'd: 
how to express a system with regular singular 
points as a limit of Fuchsian systems? 


Other Results 


In the case of a linear system with irregular singular 
point, isomonodromy means that the formal mono- 
dromy and the Stokes multipliers do not change 
throughout the deformation. The formal mono- 
dromy can be computed from the formal normal 
form (the latter can be found algorithmically; this is 
due to H Turrittin). Consider, for simplicity, the 
nonresonant case, that is, the case when the leading 
matrix in the Laurent series of the system at the 
singular point has distinct eigenvalues (this defini- 
tion differs from the one in the case of a Fuchsian 
singular point). The Stokes multipliers are linear
operators acting on the solution space. They are
defined as follows: there exist sectors of maximal
opening centered at the singular point on each of
which the solution is uniquely defined by its
asymptotic development. Two solutions $X_1$, $X_2$
having one and the same asymptotic development
in two overlapping sectors are related by $X_1 = X_2 C$,
where C is a Stokes multiplier. The monodromy
operator is expressed as a product of the operator
of formal monodromy and the Stokes multipliers.
Isomonodromic deformations of systems with irregular
singular points have been constructed by B
Malgrange. Isomonodromic deformations have been
used by Y Sibuya and C H Lin and by Y Sibuya and
T J Tabara to investigate Stokes multipliers.

At the beginning of the twentieth century,
P Painlevé and B Gambier classified the
differential equations of second order,


$$u_{xx} = R(x, u, u_x) \qquad [6]$$


(where R is analytic in x and rational in u and $u_x$)
whose solutions do not have branch-type movable
singularities. Of the 50 equations (up to local
transformation) discovered by them only six are not
reduced to linear ones. These are the so-called
Painlevé equations. They often appear as isomonodromy
conditions for families of linear differential
equations and this has given the idea to
develop the isomonodromic deformation method. It
consists in associating with eqn [6] a linear system


$$d\Psi/d\lambda = A(\lambda, x, u, u_x)\, \Psi \qquad [7]$$


with matrix-valued coefficients rational in $\lambda$.
The deformation of the coefficients in x is described
by eqn [6] in such a way that the monodromy data of
system [7] remain the same. Thus, the monodromy
data of system [7] are first integrals of eqn [6].


Example 10 The Painlevé II equation
$$u_{xx} - xu - 2u^3 = \nu$$
is associated with the system
41Au — 2 i 
iAu — 2u, — — 
= . |^ [wy 
Ü oo la 3 
—41Au — 21, + X 41A^ + ix + 2147 


dp [| —4i2 — ix — 2iv? 
dà 


The idea to present the Painlevé equations as
isomonodromy conditions originates from the works
of Fuchs (1907) and Garnier (1912). It has been
used, for example, in the papers of Flaschka and
Newell (1980), Jimbo and Miwa (1981), and Its and
Novokshenov (1986).


See also: Holonomic Quantum Fields; Integrable 
Systems: Overview; Painlevé Equations; 
Riemann-Hilbert Problem; WDVV Equations and 
Frobenius Manifolds. 
Introduction


A "link" is a finite family of disjoint, smooth,
oriented or unoriented, closed curves in $\mathbb{R}^3$ or
equivalently $S^3$. A "knot" is a link with one
component. The "Jones polynomial" $V_L(t)$ is a
Laurent polynomial in the variable $\sqrt{t}$ which is
defined for every oriented link L but depends on
that link only up to orientation-preserving
diffeomorphism, or equivalently isotopy, of $\mathbb{R}^3$. Links can
be represented by diagrams in the plane and the
Jones polynomials of the simplest links are given
below.


[Figure: diagrams of the simplest links with their Jones polynomials: unknot, $1$; Hopf link, $-\sqrt{t}(1 + t^2)$; trefoil, $t + t^3 - t^4$; figure-eight knot, $t^{-2} - t^{-1} + 1 - t + t^2$.]


The Jones polynomial of a knot (and generally a
link with an odd number of components) is a
Laurent polynomial in t.

The most elementary ways to calculate $V_L(t)$
use the "linear skein theory" ideas of Conway
(1970). Indeed, it is not hard to see by induction
that $V_L(t)$ is defined by its invariance under
isotopy, the normalization $V_O(t) = 1$ (O being the unknot) and the skein
formula


$$\frac{1}{t} V_{L_+} - t V_{L_-} = \left( \sqrt{t} - \frac{1}{\sqrt{t}} \right) V_{L_0}$$


which holds for any three oriented links having 
diagrams which are identical except near one crossing 
where they differ as below. 


[Figure: the three link diagrams $L_+$, $L_-$ and $L_0$, which differ only near one crossing.]

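The skein formula can be used as a recursion. The sketch below solves it for $V_{L_+}$ and evaluates the result exactly at a rational value of $\sqrt{t}$. It assumes the usual resolutions: a kinked unknot gives the two-component unlink, a Hopf link with two positive crossings resolves to the unlink and the unknot, and a trefoil with three positive crossings resolves to the unknot and that Hopf link. The function names are illustrative, not taken from the article.

```python
from fractions import Fraction

def skein_plus(v_minus, v_zero, s):
    """Solve (1/t) V+ - t V- = (sqrt(t) - 1/sqrt(t)) V0 for V+.

    s stands for sqrt(t) and is expected to be a Fraction.
    """
    t = s * s
    return t * (t * v_minus + (s - 1 / s) * v_zero)

def unlink2(s):
    """Two-component unlink: take L+ = L- = unknot and L0 = unlink."""
    t = s * s
    return (1 / t - t) / (s - 1 / s)

def hopf(s):
    """Hopf link with two positive crossings (assumed orientation)."""
    return skein_plus(unlink2(s), Fraction(1), s)

def trefoil(s):
    """Trefoil with three positive crossings (assumed handedness)."""
    return skein_plus(Fraction(1), hopf(s), s)
```

Evaluating at several rational points and comparing with a closed formula is a cheap substitute for symbolic algebra here; a computer algebra system would give the polynomials directly.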
As such the Jones polynomial resembles the
Alexander (1928) polynomial $\Delta_L(t)$ which can be
calculated in exactly the same manner as $V_L(t)$
except that the skein relation becomes


$$\Delta_{L_+} - \Delta_{L_-} = \left( \sqrt{t} - \frac{1}{\sqrt{t}} \right) \Delta_{L_0}$$


A two-variable generalization $P_L$ of both $\Delta_L$ and
$V_L$, sometimes called the HOMFLYPT polynomial,
was found in Freyd et al. (1985) and Przytycki and
Traczyk (1988). It satisfies the most general skein
relation


$$x P_{L_+} + y P_{L_-} + z P_{L_0} = 0$$


for homogeneous variables x, y, and z. 

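The other skein-like definition of $V_L$ was found in
Kauffman (1987). Begin with unoriented link diagrams
up to planar isotopy. The Kauffman bracket
$\langle L \rangle$ of such a diagram is calculated using


$$\langle \times \rangle = A \langle \asymp \rangle + A^{-1} \langle\, )( \,\rangle$$


where the $\langle \cdot \rangle$ notation means that the relation may
be applied to that part of the link diagrams inside
the bracket, the rest of the diagrams being identical.
If $\langle L \rangle$ were to be an invariant of three-dimensional
isotopy it is easy to see that


$$\langle \bigcirc \rangle = -A^2 - A^{-2}$$

which further implies

$$\langle \text{strand with a curl} \rangle = -A^3 \langle \text{straight strand} \rangle$$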


Thus, $\langle L \rangle$ cannot be a three-dimensional isotopy
invariant as such. However, if L is given an
orientation (then called $\vec{L}$), a simple renormalization
solves the problem and it is true that


(x) Vi.(A*) = A? writhe (L) (T Y 


where writhe($\vec{L}$) is the sum over the crossings of L
of +1 for a positive crossing and -1 for a
negative crossing.

The formula (*) is readily proved by induction but 
a more structural proof will be discussed later on, 
connected with physics. If the crossings in a link 
alternate between over and under as one follows the 
string around, the highest and lowest degree terms in 
the Kauffman bracket can readily be located. This 
led to the proof of some old conjectures about 
alternating knots in Murasugi (1987), Kauffman 
(1987), and Thistlethwaite (1987). 

The Kauffman two-variable polynomial $F_L(a, x)$ is
defined in Kauffman (1990) by considering the
linear skein relation involving all four possibilities
at a crossing:


[Figure: the four diagrams $L_+$, $L_-$, $L_0$ and $L_\infty$ at a crossing.]


This polynomial contains $V_L(t)$ as a specialization
but not the Alexander polynomial.

The above polynomials are quite powerful at
distinguishing links one from another, including
links from their mirror images, which corresponds
for the Jones polynomial to replacing t by $t^{-1}$. More
power can be added to the polynomials if simple
geometric operations are allowed. "Cabling" entails
replacing a single strand with several parallel copies
and the polynomials of cables of a link are also
isotopy invariants if attention is paid to the writhe
of a diagram.

The following problem, however, is open at the
time of writing this article: "Does there exist a knot
in $\mathbb{R}^3$, different from the unknot O, whose Jones
polynomial is equal to 1?"

For links with more than one component, it is 
known (Thistlethwaite 2001, Eliahou et al. 2003) 
that the answer to the corresponding question is yes, 
the simplest example being: 


One of the reasons that the question above has
not been answered is presumably that, unlike with
the Alexander polynomial, we have little intuitive
understanding of the meaning of the "t" in $V_L(t)$.
Perhaps the most promising theory in this context is
in Khovanov (2000) where a complex is constructed
whose Euler characteristic, in an appropriately
graded sense, is the Jones polynomial. The homology
of the complex is a finer invariant of links
known as "Khovanov homology."


Braids 


A braid (see Birman (1974)) on n strings is a
collection of curves in $\mathbb{R}^3$ joining n points in a
horizontal plane to the n points directly below
them on another horizontal plane. If the endpoints
of the braid are on a straight line, the
braid can be drawn as in the example below
(where n = 4).


[Figure: a braid on four strings.]


The crucial property of a braid is that the tangent
vector to the curves can never be horizontal. Braids
are considered up to isotopies which are supported
between the top and bottom planes.

Braids on n strings form a group, called $B_n$, under
concatenation (plus some isotopy) as below:


[Figure: concatenation of two braids $\alpha$ and $\beta$ to form $\alpha\beta$.]


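Artin's presentation (Birman 1974) of the braid
group is on the generators $\sigma_1, \sigma_2, \ldots, \sigma_{n-1}$ with the
relations

$$\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} \quad \text{for } 1 \le i \le n-2$$
$$\sigma_i \sigma_j = \sigma_j \sigma_i \quad \text{if } |i - j| \ge 2$$


Thus, to find linear representations of $B_n$, it suffices
to find matrices $\rho_1, \rho_2, \ldots, \rho_{n-1}$ satisfying the above
relations (with $\sigma$ replaced by $\rho$). One such representation
(of dimension n) called the (nonreduced) Burau
representation is given by the row-stochastic matrices


$$\rho_1 = \begin{pmatrix} 1-t & t & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}, \quad
\rho_2 = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1-t & t & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}, \quad \ldots, \quad
\rho_{n-1} = \begin{pmatrix} 1 & \cdots & 0 & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & \cdots & 1 & 0 & 0 \\ 0 & \cdots & 0 & 1-t & t \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix}$$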

This representation is known not to be faithful for
$n \ge 5$ but faithful for $n \le 3$. The case n = 4 remains
open. (See Moody (1991), Long and Paton (1993),
and Bigelow (1999)).
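That the Burau matrices satisfy Artin's relations is a finite computation for each n. The following sketch builds the nonreduced matrices $\rho_i$, with the block $\begin{pmatrix} 1-t & t \\ 1 & 0 \end{pmatrix}$ placed in rows and columns i, i+1 of the identity, and checks the relations at a rational value of t. The function names are hypothetical, and a check at a single value of t is a sanity test rather than a proof for all t.

```python
from fractions import Fraction

def burau(i, n, t):
    """Nonreduced Burau matrix rho_i (1 <= i <= n - 1) of size n x n."""
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    r = i - 1                       # zero-based position of the 2 x 2 block
    M[r][r], M[r][r + 1] = 1 - t, t
    M[r + 1][r], M[r + 1][r + 1] = Fraction(1), Fraction(0)
    return M

def matmul(X, Y):
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def satisfies_artin_relations(n, t):
    """Check both families of braid relations for rho_1, ..., rho_{n-1}."""
    rho = {i: burau(i, n, t) for i in range(1, n)}
    for i in range(1, n - 1):
        lhs = matmul(matmul(rho[i], rho[i + 1]), rho[i])
        rhs = matmul(matmul(rho[i + 1], rho[i]), rho[i + 1])
        if lhs != rhs:
            return False
    for i in range(1, n):
        for j in range(i + 2, n):
            if matmul(rho[i], rho[j]) != matmul(rho[j], rho[i]):
                return False
    return True

def is_row_stochastic(M):
    """Every row sums to 1 (entries are not required to lie in [0, 1])."""
    return all(sum(row) == 1 for row in M)
```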

Braids can be viewed in several ways, which lead to 
several generalizations. For instance, identifying the 
vertical axis for a braid with time and taking the 
intersection of horizontal planes with the braids shows 
that elements of $B_n$ can be thought of as motions of n
distinct points in the plane. Thus, it is natural that


$$B_n = \pi_1((\mathbb{C}^n \setminus \Delta)/S_n)$$


when $\Delta$ is the set $\{(z_1, \ldots, z_n) \mid z_i = z_j \text{ for some } i \neq j\}$
and the symmetric group $S_n$ acts freely on $\mathbb{C}^n \setminus \Delta$ by
permuting coordinates. But $\Delta$ is the zero-set of the
frequently encountered function


$$\prod_{i<j} (z_i - z_j)$$


so the braid group may naturally be generalized as
the fundamental group of $\mathbb{C}^n$ minus the singular
set of some algebraic function (Birman 1974). Or,
motions of points can be extended to motions of the
whole plane and a braid defines a diffeomorphism of
the plane minus n points. Thus, the braid group may
be generalized as the "mapping class group" of a
surface with marked points (Birman 1974).


The Temperley-Lieb Algebra 


If $\tau \in \mathbb{C}$ one may define the algebra $TL(n, \tau)$ with
identity 1 and generators $e_1, e_2, \ldots, e_{n-1}$ subject to
the following relations:


$$e_i^2 = e_i$$
$$e_i e_{i \pm 1} e_i = \tau e_i$$
$$e_i e_j = e_j e_i \quad \text{if } |i - j| \ge 2$$


Counting reduced words on the $e_i$'s shows that


$$\dim\{TL(n, \tau)\} \le \frac{1}{n+1} \binom{2n}{n}$$


and in Jones (1983) it is shown that these numbers,
the Catalan numbers, are indeed the dimensions of
the Temperley-Lieb algebras. In the obvious way,
$TL(n, \tau) \subset TL(n+1, \tau)$. If $\tau^{-1}$ is not in the set
$\{4\cos^2 q\pi : q \in \mathbb{Q}\}$, $TL(n, \tau)$ is semisimple and its
structure is given by the following Bratteli diagram:


[Figure: Bratteli diagram of the algebras $TL(n, \tau)$.]


where the integers on each row are the dimensions
of the irreducible representations of $TL(n, \tau)$ and the
diagonal lines give the restriction of representations
of $TL(n, \tau)$ to $TL(n-1, \tau)$. These representations
are naturally indexed by Young diagrams with n
boxes and at most two rows, with the
diagonal lines in the Bratteli diagram corresponding
to removal/addition of a box. The dimension of the
representation corresponding to the diagram whose
second row has r boxes (r < n) is


$$\binom{n}{r} - \binom{n}{r-1}$$
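For a semisimple algebra the dimension equals the sum of the squares of the dimensions of its irreducible representations, so the two dimension formulas can be compared directly. The sketch below does this under the assumption that r runs over $0 \le r \le \lfloor n/2 \rfloor$, as it must for a two-row Young diagram with n boxes; the function names are illustrative.

```python
from math import comb

def irrep_dim(n, r):
    """Dimension C(n, r) - C(n, r - 1) of the representation whose
    Young diagram has r boxes in the second row."""
    return comb(n, r) - (comb(n, r - 1) if r >= 1 else 0)

def catalan(n):
    """The Catalan number C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def dim_from_irreps(n):
    """Sum of squares of the irreducible dimensions, r = 0, ..., floor(n/2)."""
    return sum(irrep_dim(n, r) ** 2 for r in range(n // 2 + 1))
```

For instance, n = 4 gives irreducible dimensions 1, 3, 2 and 1 + 9 + 4 = 14, the fourth Catalan number.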
One may attempt to make $TL(n, \tau)$ into a
C*-algebra and look for Hilbert space representations
(with $e_i \neq 0$), by imposing $e_i^* = e_i$. From
(Wenzl 1987), this is only possible (for all n) when


1. $\tau \in \mathbb{R}$, $0 < \tau \le 1/4$, or
2. $\tau^{-1} \in \{4\cos^2 \pi/m,\ m = 3, 4, 5, \ldots\}$.


The proof uses the fact that $f_n$, inductively defined by


$$f_{n+1} = f_n - \frac{[2]_q [n+1]_q}{[n+2]_q} f_n e_{n+1} f_n$$


must be an orthogonal projection with $e_i f_n = f_n e_i = 0$
for $i \le n$. These $f_n$ are sometimes called Jones-Wenzl
idempotents. (Here $\tau^{-1} = 2 + q^2 + q^{-2}$ and for this
and later formulas we define the quantum integer
$[n]_q = (q^n - q^{-n})/(q - q^{-1})$.)

When $\tau^{-1} = 4\cos^2(\pi/m)$, the Hilbert space representations
decompose according to Bratteli diagrams
obtained by truncating: eliminating the 1 on the
mth row, and all representations below and to the
right of it, so that for m = 7 we would obtain


[Figure: truncated Bratteli diagram for m = 7.]

In terms of Young diagrams, this corresponds to 
only taking those diagrams whose row lengths differ 
by at most m — 2. The existence of these Hilbert 
space representations is from Jones (1983). 

The Temperley-Lieb algebras arose in Jones (1983)
as orthogonal projections onto subfactors of II$_1$ factors.
As such the Hilbert space structure was manifest. The
trace on a II$_1$ factor also yielded a trace on the $TL(n, \tau)$.

To be precise, there is for each n a unique linear
map tr: $TL(n, \tau) \to \mathbb{C}$ with:


1. $\operatorname{tr}(1) = 1$
2. $\operatorname{tr}(ab) = \operatorname{tr}(ba)$
3. $\operatorname{tr}(x e_{n+1}) = \tau \operatorname{tr}(x)$ for $x \in TL(n+1, \tau)$.


This trace may be calculated either from (1), (2),
and (3), or using the representations, as a weighted
sum of ordinary matrix traces. The weight for the
representation of $TL(n, \tau)$, the second row of whose
Young diagram has r boxes, is


$$\frac{[n - 2r + 1]_q}{([2]_q)^n}$$


Thus, if $x \in TL(n, \tau)$ and $\pi_r$ is the $\left( \binom{n}{r} - \binom{n}{r-1} \right)$-dimensional
irreducible representation, then


$$\operatorname{tr}(x) = \frac{1}{([2]_q)^n} \sum_{r=0}^{[n/2]} [n - 2r + 1]_q \operatorname{trace}(\pi_r(x))$$
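A quick consistency test of the weighted-trace formula is the normalization tr(1) = 1: applied to x = 1, each ordinary trace is just the dimension $\binom{n}{r} - \binom{n}{r-1}$, so the weights times the dimensions must sum to 1. The sketch below evaluates this exactly at a rational q, assuming the weight $[n - 2r + 1]_q / ([2]_q)^n$ and the quantum integer $[k]_q = (q^k - q^{-k})/(q - q^{-1})$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def qint(k, q):
    """Quantum integer [k]_q = (q^k - q^-k) / (q - q^-1), for q != 1, -1."""
    return (q**k - q**(-k)) / (q - 1 / q)

def irrep_dim(n, r):
    """Dimension of the representation with r boxes in the second row."""
    return comb(n, r) - (comb(n, r - 1) if r >= 1 else 0)

def trace_of_identity(n, q):
    """Weighted sum of ordinary traces of the identity of TL(n, tau)."""
    total = sum(qint(n - 2 * r + 1, q) * irrep_dim(n, r)
                for r in range(n // 2 + 1))
    return total / qint(2, q) ** n
```

With q given as a `Fraction`, negative powers stay exact, so the result can be compared with 1 by equality.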


One also has 


so that the disappearance of the “1” from the 
Bratteli diagram is mirrored by the vanishing of the 
trace of the corresponding projection. 

Positivity of tr, $\operatorname{tr}(a^* a) \ge 0$, is responsible for all the
Hilbert space structures. To explicitly construct the
Hilbert space representations, one may use the GNS
construction: take the quotient of the *-algebra by the
kernel of the form $\langle a, b \rangle = \operatorname{tr}(b^* a)$ which makes this
quotient a Hilbert space on which $TL(n, \tau)$ will act
with the $e_i$'s as orthogonal projections. Explicit bases
can be obtained easily if desired, using paths on the
Bratteli diagram, or Young tableaux.

A useful diagrammatic presentation of TL(n,7r) 
was discovered in Kauffman (1987). A (Kauffman) 
TL diagram (for non-negative integers m and n) is a 
rectangle with n marked points on the top and m on 
the bottom with nonintersecting smooth curves 
inside the rectangle connecting the boundary points 
as illustrated below. 




A (5, 7)-diagram 


Two Kauffman TL diagrams are considered the same 
if they connect the same pairs of boundary points. 

The vector space $TL(m, n, \delta)$ with basis the set of
(m, n) diagrams, and $\delta \in \mathbb{C}$, becomes a category under
concatenation of diagrams together with the rule that closed
curves may be removed, each one counting a
(multiplicative) factor of $\delta$. We illustrate their
product in $TL(m, n, \delta)$ below:


[Figure: product of two Kauffman TL diagrams.]

Of special interest is the algebra $TL(n, n, \delta)$. If we
define $E_i$ to be the diagram below:


[Figure: the diagram $E_i$, with the marked points labeled 1, 2, ..., i, i+1, ...]


then $E_i^2 = \delta E_i$, $E_i E_{i \pm 1} E_i = E_i$, and $E_i E_j = E_j E_i$ for
$|i - j| \ge 2$. Thus, provided $\delta \neq 0$, we have an
isomorphism between $TL(n, \delta^{-2})$ and $TL(n, n, \delta)$ by
mapping $e_i$ to $(1/\delta) E_i$.

One of the nicest features of the Kauffman
diagrams is that they yield simple explicit bases for
the irreducible representations. To see this, call a
curve in a diagram a "through-string" if it connects
the top of the rectangle to the bottom. Then all
(m, n) diagrams are filtered by the number of
through-strings and if we let $TL(m, n, k, \delta)$ be
the span of (m, n) diagrams with at most k
through-strings, we have $TL(k, n, \delta)\, TL(n, m, k, \delta) \subset
TL(k, m, k, \delta)$. Thus, $V_{n,m} = TL(n, m, m, \delta)/TL(n, m,
m - 1, \delta)$ is a $TL(n, \delta^{-2})$-module, a basis of which is
given by (m, n)-diagrams with m through-strings
($m \le n$). The number of such diagrams is $\binom{n}{(n-m)/2} -
\binom{n}{(n-m)/2 - 1}$ and it follows from Jones (1983) that all these
representations are irreducible for "generic" $\delta$ (i.e.,
$\delta \notin \{2\cos q\pi : q \in \mathbb{Q}\}$) and that they may be identified with
those indexed by Young diagrams as below:


[Figure: the two-row Young diagram corresponding to $V_{n,m}$.]


The invariant inner product on $V_{n,m}$ is defined by
$\langle v, w \rangle = w^* v$ for the natural identification of $V_{m,m}$
with $\mathbb{C}$ (* is the obvious involution from (m, n)
diagrams to (n, m) diagrams).


The Original Definition of $V_L(t)$


Given a braid $\beta \in B_n$ one may form an oriented link
$\hat{\beta}$ called the closure of $\beta$ by tying the top of the braid
to the bottom as illustrated below:


[Figure: the closure $\hat{\beta}$ of a braid $\beta$.]


All oriented links occur in this way (Birman 1974)
but if $\alpha \in B_n$, $\alpha\beta\alpha^{-1}$ and $\beta\sigma_n^{\pm 1}$ (in $B_{n+1}$) have the
same closure.


Theorem 1 (Markov) (Birman 1974). Let ~ be the
equivalence relation on $\coprod_{n=1}^{\infty} B_n$ (all braids on any
number of strings) generated by the two "moves"
$\beta \sim \beta\sigma_n^{\pm 1}$ and $\beta \sim \alpha\beta\alpha^{-1}$. Then $\beta_1 \sim \beta_2$ if and only
if the links $\hat{\beta}_1$ and $\hat{\beta}_2$ are the same.


It is easily checked that, if $1, e_1, e_2, e_3, \ldots$ satisfy
the TL relations of the section "The Temperley-Lieb
algebra," then sending $\sigma_i$ to $(t+1)e_i - 1$ (with $\tau^{-1} =
2 + t + t^{-1}$) defines a representation $\rho_n$ of $B_n$ inside
$TL(n, \tau)$ for each n. The representation is unitary for
the C*-algebra structure when $\tau^{-1} = 4\cos^2 \pi/n$,
n = 3, 4, 5, ... (and $t = e^{\pm 2\pi i/n}$). It is an open question
whether $\rho_n$ is faithful for all n. It contains the Burau
representation as a direct summand.

Combining the properties of the trace tr defined
on TL with Markov's theorem, one obtains immediately
that, for $\alpha \in B_n$, the following function of t
depends only on $\hat{\alpha}$:


$$\left( -\frac{t+1}{\sqrt{t}} \right)^{n-1} (\sqrt{t})^{e}\, \operatorname{tr}(\rho_n(\alpha))$$


(here $e \in \mathbb{Z}$ is the "exponent sum" of $\alpha$ as a word on
$\sigma_1, \sigma_2, \ldots, \sigma_{n-1}$).

A simple check using the (oriented) skein-theoretic
definition of the Jones polynomial shows that this
function of t is precisely $V_{\hat{\alpha}}(t)$. This is how $V_L(t)$
was first discovered in Jones (1985).
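The smallest case n = 2 already illustrates the construction. The algebra $TL(2, \tau)$ is spanned by 1 and $e_1$ with $e_1^2 = e_1$, the trace is $\operatorname{tr}(a + b e_1) = a + b\tau$, and $\sigma_1$ is sent to $(t+1)e_1 - 1$ with $\tau = t/(1+t)^2$. The sketch below evaluates $(-(t+1)/\sqrt{t})^{n-1} (\sqrt{t})^{e} \operatorname{tr}(\rho_n(\alpha))$ for $\alpha = \sigma_1^k$ at a rational value of $\sqrt{t}$. The pair representation of algebra elements and the function names are assumptions made for illustration, and which handedness of crossing $\sigma_1$ denotes is a convention.

```python
from fractions import Fraction

def tl2_mul(x, y):
    """Multiply a + b*e and c + d*e in TL(2, tau), using e*e = e.
    Elements are stored as pairs (a, b)."""
    a, b = x
    c, d = y
    return (a * c, a * d + b * c + b * d)

def jones_of_closed_2braid(k, s):
    """Value at sqrt(t) = s of the invariant of the closure of sigma_1^k
    in B_2 (k >= 0), computed from the trace formula with n = 2, e = k."""
    t = s * s
    tau = t / (1 + t) ** 2
    g = (Fraction(-1), t + 1)             # image of sigma_1: (t + 1) e - 1
    word = (Fraction(1), Fraction(0))     # identity element
    for _ in range(k):
        word = tl2_mul(word, g)
    tr = word[0] + word[1] * tau
    return (-(t + 1) / s) * s**k * tr
```

With these conventions k = 0, 1, 2, 3 give the two-component unlink, the unknot, a Hopf link and a trefoil, and the values agree with $-(\sqrt{t} + 1/\sqrt{t})$, $1$, $-\sqrt{t}(1 + t^2)$ and $t + t^3 - t^4$.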

Although less elementary, this approach to $V_L(t)$
does have some advantages. Let us mention a few.


1. One may use representation theory to do calculations.
For instance, using the weighted sum of
ordinary traces to calculate tr as in the section
"The Temperley-Lieb algebra," one obtains readily
the Jones polynomial of a torus knot (i.e., $\hat{\alpha}$
where $\alpha = (\sigma_1 \sigma_2 \cdots \sigma_{p-1})^q \in B_p$ if p and q are
relatively prime). It is


$$\frac{t^{(p-1)(q-1)/2}}{1 - t^2} \left( 1 - t^{p+1} - t^{q+1} + t^{p+q} \right)$$

2. If one restricts attention to links realizable as $\hat{\alpha}$ for
$\alpha \in B_n$ for fixed n, the computation of $V_L(t)$ can be
performed in polynomial time as a function of the
number of crossings in $\alpha$. Thus, one has computational
access to rather complicated families of links.
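3. Unitarity of the representation when $t = e^{\pm 2\pi i/n}$ can
be used to bound the size of $|V_L(t)|$. For instance,
if $\alpha \in B_n$ and $V_{\hat{\alpha}}(t) = (-\sqrt{t} - 1/\sqrt{t})^{n-1}$, then $\alpha$
is in the kernel of $\rho_n$, and $|V_{\hat{\beta}}(e^{\pm 2\pi i/n})| \le
(2\cos \pi/n)^{n-1}$ for any other $\beta \in B_n$.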
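The torus knot formula of item 1 is a quotient of polynomials, and for coprime p and q the division by $1 - t^2$ is exact. The sketch below carries out that division on integer coefficient lists and returns the result as a dictionary from exponents to coefficients; the function name and the output format are illustrative choices.

```python
from math import gcd

def torus_knot_jones(p, q):
    """Jones polynomial of the torus knot T(p, q) as {exponent: coefficient}.

    Evaluates t^((p-1)(q-1)/2) (1 - t^(p+1) - t^(q+1) + t^(p+q)) / (1 - t^2)
    by exact division of integer polynomials.
    """
    if p < 1 or q < 1 or gcd(p, q) != 1:
        raise ValueError("p and q must be coprime positive integers")
    num = [0] * (p + q + 1)          # num[k] is the coefficient of t^k
    num[0] += 1
    num[p + 1] -= 1
    num[q + 1] -= 1
    num[p + q] += 1
    # If num = quo * (1 - t^2), then quo[k] = num[k] + quo[k - 2].
    quo = []
    for k, c in enumerate(num):
        quo.append(c + (quo[k - 2] if k >= 2 else 0))
    if quo[-1] != 0 or quo[-2] != 0:
        raise ArithmeticError("numerator is not divisible by 1 - t^2")
    shift = (p - 1) * (q - 1) // 2   # an integer because p, q are coprime
    return {k + shift: c for k, c in enumerate(quo) if c != 0}
```

For example, `torus_knot_jones(2, 3)` encodes $t + t^3 - t^4$, the trefoil value, and the result is symmetric in p and q as the formula requires.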
The representation of the braid group inside the TL
algebra should be thought of as an extension of the
Jones polynomial to "special knots with boundary."
The coefficients of the words in the $e_i$'s (or equivalently
the Kauffman TL diagrams) are all invariants of the
braid. We can further remove the braid restriction and
consider arbitrary knots and links with boundary,
known as "tangles" (Conway 1970).


A 3-tangle 


Tangles may be oriented or not and their 
invariants may be evaluated either by reduction to 
a system of elementary tangles using skein relations 
or by organizing the tangle and representing it in an 
algebra. See Turaev (1994). 

A similar algebraic approach is available for the 
HOMFLYPT and Kauffman two-variable polyno- 
mials. The algebra playing the role of the TL algebra 
is the Hecke algebra for HOMFLYPT (Freyd et al. 
1985, Jones 1987) and the BMW algebra (Birman and 
Wenzl 1989, Murakami 1990) for the Kauffman 
polynomial. The BMW algebra was discovered after 
the Kauffman polynomial in order to provide an 
analog of the TL and Hecke algebras. For detailed 
analysis of the Hilbert space and other structures for 
both Hecke and BMW algebras, see Wenzl (1988) and 
Wenzl (1990). 


Connections with Statistical Mechanics 


One might say that turning a knot into a braid
organizes the knot by “putting it on a lattice,”
thereby creating a physical model with the crossings
of the knot as interactions. Taking the trace of the
braid is evaluating the partition function with
periodic (vertical) boundary conditions.

This is more than wishful thinking. The Temperley-Lieb
algebra arose from transfer matrices in both
the Potts and ice-type models in two dimensions
(Temperley and Lieb 1971) and each “$e_i$” implements
the addition of one more interaction to the system.
(The same $e_i$'s as in the ice-type models were
rediscovered in the subfactor context in Pimsner and
Popa (1986).) Thus, the Jones polynomial of a closed
braid is the partition function for a statistical
mechanical model on the braid. In Jones (1983), it is observed
that knowledge of the Jones polynomial for a family of
links called French sinnets would constitute a solution
of the Potts model in two dimensions.

In Temperley and Lieb (1971), the TL relations
are used to establish the mathematical equivalence
of the Potts and ice-type (six-vertex) models. In
Baxter (1982, chapter 12), this equivalence is shown
for Potts models on an arbitrary planar graph. In
view of this, it is not surprising that statistical
mechanical models can be defined directly on link
diagrams to give explicit formulas for $V_L(t)$ (and
other invariants) as partition functions. This works
most easily for the $Q$-state Potts model.

Given an unoriented link diagram $D$, shade the
regions of the plane black and white and form the
planar graph $\Gamma$ whose vertices are the black regions
and whose edges are the crossings as below:

[Figure: a crossing of $D$ and the corresponding edge of $\Gamma$]

Assign $+$ and $-$ to each edge according to the
following scheme:

[Figure: sign convention for the edges of $\Gamma$]


Fix $Q \in \mathbb{N}$ and two symmetric matrices $w_\pm(a,b)$
for $1 \le a, b \le Q$. The partition function of the
diagram is then

\[
Z_D = \sum_{\text{states}} \ \prod_{\text{edges of } \Gamma} w_\pm(\sigma, \sigma')
\]

where a “state” is a function from the vertices of $\Gamma$
to $\{1, 2, \ldots, Q\}$ and, given an edge of $\Gamma$ and a state,
$\sigma$ and $\sigma'$ denote the values of the state at the ends of
that edge ($w_+$ and $w_-$ are used according to the sign
of the edge).

The “Potts model” is defined by the property that
the “Boltzmann weights” $w_\pm(\sigma,\sigma')$ depend only on
whether $\sigma = \sigma'$ or not. It is a miracle that the choice
(with $Q = 2 + t + t^{-1}$)

\[
w_\pm(\sigma, \sigma') =
\begin{cases}
t^{\pm 1} & \text{if } \sigma = \sigma' \\
-1 & \text{otherwise}
\end{cases}
\]

gives the Jones polynomial of the link defined by $D$
as its partition function (up to a simple normalization).
See Jones (1989) for details.
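
The state sum $Z_D$ can be evaluated by brute force on small graphs. The sketch below is illustrative only: the function names, the edge encoding `(u, v, sign)` and the helper `potts_weights` are assumptions, the normalization mentioned above is omitted, and a direct enumeration requires $Q$ to be a positive integer (so $Q = 2 + t + t^{-1}$ restricts $t$ to special values such as $t = 1$, $Q = 4$).

```python
from itertools import product


def partition_function(num_vertices, signed_edges, Q, w_plus, w_minus):
    """Sum over all states of the product of edge weights.

    signed_edges: list of (u, v, sign) with vertex indices u, v and sign +1/-1.
    w_plus, w_minus: functions (a, b) -> weight for spins a, b in 1..Q.
    """
    total = 0
    for state in product(range(1, Q + 1), repeat=num_vertices):
        weight = 1
        for u, v, sign in signed_edges:
            w = w_plus if sign > 0 else w_minus
            weight *= w(state[u], state[v])
        total += weight
    return total


def potts_weights(x, y):
    """Potts-type weight: x if the two spins agree, y otherwise."""
    return lambda a, b: x if a == b else y


# t = 1 gives Q = 2 + t + 1/t = 4 and weights t^(+-1) = 1 on equal spins.
w = potts_weights(1, -1)
print(partition_function(2, [(0, 1, +1)], 4, w, w))
```

Because the weights depend only on whether the two spins agree, a single edge with $Q$ states contributes $Q\,x + Q(Q-1)\,y$, which gives a quick sanity check for the enumeration.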

It is natural to look for other choices of $w_\pm$ which
give knot invariants. The Fateev-Zamolodchikov
(1982) model gives a classical knot invariant but
besides that (and some variants on the Jones
polynomial) there is only one other known choice of
any interest, discovered in Jaeger (1992). In this case,
$Q = 100$ and the Boltzmann weights are symmetric
under the action of the Higman-Sims group on the
Higman-Sims graph with 100 vertices. The knot
invariant is a special value of the Kauffman
two-variable polynomial.

The other side of Temperley-Lieb equivalence is
the “ice-type” model, which is a “vertex model.”
That is to say, the “spins” reside on the edges of a
graph and the interactions occur at the vertices. To
use vertex models in knot theory, the knot projection
$D$ itself is the (4-valent) graph. The ice-type
model has two spin states per edge so that a state of
the system is a function from the edges of the graph
to the set $\{\pm\}$; the Boltzmann weights are given by
two $4 \times 4$ matrices $w_\pm(\sigma_1,\sigma_2,\sigma_3,\sigma_4)$ where the $\sigma$'s
are $\pm 1$ and $w_+$ and $w_-$ are the contributions of

[Figure: the two types of crossing]

to the partition function, respectively. Furthermore,
we may think of a state as a locally constant
function $\sigma$ on $D$ so for any $f: \{\pm 1\} \to \mathbb{R}$ we may
form the term $\int_D f(\sigma)\,d\theta$ corresponding to interaction
with an external field ($d\theta$ is the curvature or
change of angle form on $D$). Then the partition
function is

\[
Z_D = \sum_{\text{states}} \Bigl( e^{\int_D f(\sigma)\,d\theta} \prod_{\text{crossings of } D} w_\pm(\sigma_1,\sigma_2,\sigma_3,\sigma_4) \Bigr)
\]

A (nonphysical) specialization of the six-vertex
model yields values of $f$ and $w_\pm$ for which $Z_D$ is a
link invariant equal to $V_L(t)$. See Jones (1989).

As with the Potts model, one may try to generalize
to more general $w_\pm$ and $f$. This is much more
successful for these “vertex” models than it was for
models like the Potts model. The theory of quantum
groups (Jimbo 1986, Drinfeld 1987, Rosso 1988)
allows one to obtain link invariants (as partition
functions for vertex models) for each simple
finite-dimensional Lie algebra $\mathfrak{g}$ and each assignment of an
irreducible representation of $\mathfrak{g}$ to the components of
the link. The images of the braid generators $\sigma_i$ in the
corresponding braid group representations are called
“R-matrices.” It is the Yang-Baxter equation that
gives isotopy invariance of the partition function. In
this way, one obtains (by an infinite family of
one-variable specializations) the HOMFLYPT polynomial
($sl_n$) and the Kauffman polynomial (orthogonal and
symplectic algebras) and more polynomials. The
geometric operation of cabling corresponds to the
tensor product of representations.
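
The role of the Yang-Baxter equation can be made concrete on the smallest example. The sketch below is an illustration under stated assumptions, not the construction used in the references above: it takes the $4 \times 4$ matrix $R = A\cdot 1 + A^{-1} E$, where $E$ is a standard matrix realization of a Temperley-Lieb generator on $\mathbb{C}^2 \otimes \mathbb{C}^2$ with $E^2 = -(A^2 + A^{-2})E$, and checks the braid form of the equation, $(R \otimes 1)(1 \otimes R)(R \otimes 1) = (1 \otimes R)(R \otimes 1)(1 \otimes R)$, in exact rational arithmetic. All function names and the optional `e_coeff` parameter are hypothetical.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def r_matrix(A, e_coeff=None):
    """R = A*1 + b*E on C^2 (x) C^2; b defaults to 1/A (the braiding choice)."""
    A = Fraction(A)
    b = 1 / A if e_coeff is None else Fraction(e_coeff)
    q = -A * A  # chosen so that E*E = (q + 1/q) E = -(A^2 + A^-2) E
    E = [[0, 0, 0, 0],
         [0, 1 / q, -1, 0],
         [0, -1, q, 0],
         [0, 0, 0, 0]]
    return [[A * (1 if i == j else 0) + b * E[i][j] for j in range(4)]
            for i in range(4)]


def satisfies_braid_relation(R):
    """Check R1 R2 R1 == R2 R1 R2 on the threefold tensor product."""
    I2 = identity(2)
    R1, R2 = kron(R, I2), kron(I2, R)
    return matmul(matmul(R1, R2), R1) == matmul(matmul(R2, R1), R2)


print(satisfies_braid_relation(r_matrix(2)))             # braiding coefficient
print(satisfies_braid_relation(r_matrix(2, e_coeff=1)))  # perturbed coefficient
```

Changing the coefficient of $E$ away from $A^{-1}$ breaks the relation, which shows that the braid relation is a genuine constraint on the Boltzmann weights rather than an automatic property.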


Connections with Quantum Field Theory 
Conformal Field Theory 


If $\psi$ is a (multicomponent) field in one chiral half of
a two-dimensional conformal field theory (CFT), the
correlation functions

\[
\langle \psi(z_1)\psi(z_2) \cdots \psi(z_n) \rangle
\]

(where $z_i \in \mathbb{C}$) are expected to be singular if $z_i = z_j$
for some $i \ne j$, holomorphic otherwise and satisfy a
linear differential equation. Thus, analytic continuation
should determine a unitary monodromy representation
of $\pi_1(\mathbb{C}^n \setminus \{(z_1, z_2, \ldots, z_n) \mid z_i = z_j \text{ for some } i \ne j\})$
on the vector space of solutions to the
differential equation near a point. In Tsuchiya and
Kanie (1988), these representations were calculated
for the SU(2) WZW (Wess-Zumino-Witten) model,
where the differential equation is known as the
Knizhnik-Zamolodchikov equation. The corresponding
braid group representations were shown
to be those obtained in the section “The original
definition of $V_L(t)$” and cablings thereof.


Topological Quantum Field Theory 
In Witten (1989), the following formula appears:

\[
V_L(e^{2\pi i/k}) = \int_{\mathcal{A}} \exp\Bigl( \frac{i}{\hbar} \int_{S^3} \operatorname{tr}\bigl( A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A \bigr) \Bigr)
\prod_j \operatorname{tr}\Bigl( P \exp \oint_j A \Bigr) \, [\mathcal{D}A]
\]

where $A$ ranges over all functions from $S^3$ to the Lie
algebra su(2), modulo the action of the gauge group
SU(2). Also $\hbar = \pi/k$ and $j$ runs over the components
of the link $L$, to each of which is assigned an
irreducible representation of SU(2). Parallel transport
around a component $j$ using $A$ yields the linear
map $P \exp \oint_j A$ whose trace is constant modulo gauge
transformations. And $[\mathcal{D}A]$ is a fictitious
diffeomorphism-invariant measure on all $A$'s modulo
gauge transformation.
There are at least two ways to interpret this
formula.


1. As a solvable topological quantum field theory
(TQFT) in 2 + 1 dimensions, according to Witten
(1988) and Atiyah (1988, 1989). One is then
obliged to expand the context and conclude that
$V_L(e^{2\pi i/n})$ is defined for (possibly empty) links in
an arbitrary 3-manifold. The TQFT axioms then
provide an explicit formula for the invariant if the
3-manifold is obtained from surgery on a link. In
particular, the invariant of a 3-manifold without a
link is a statistical mechanics type sum over
assignments of irreducible representations of
SU(2) to the components of the surgery link. The
key condition making this sum finite is that only
representations up to a certain dimension
(determined by $n$) are allowed. This is the vanishing of
the Jones-Wenzl idempotent of the section “The
Temperley-Lieb algebra.” This explicit formula
was rigorously shown to be a manifold invariant
in Reshetikhin and Turaev (1991). For a simpler
treatment, see Lickorish (1997) and for the
whole TQFT treatment, see Blanchet et al. (1995).

2. As a perturbative QFT. The stationary-phase
Feynman diagram technique may be applied to
obtain the coefficients of the expansion of Witten's
formula in powers of $\hbar$ or equivalently $1/k$. These
coefficients are known to be “finite type” or
Vassiliev invariants and have expressions as
integrals over configurations of points on the link;
see Vassiliev (1990) and Bar-Natan (1995).


Algebraic Quantum Field Theory 


In the Haag-Kastler operator algebraic framework
of quantum field theory (Haag 1996), statistics of
quantum systems were interpreted in Doplicher
et al. (1971, 1974) (DHR) in terms of certain
representations of the symmetric group corresponding
to permuting regions of spacetime. To obtain the
symmetric group, the dimension of spacetime needs
to be sufficiently large. It was proposed in
Fredenhagen et al. (1989) that the DHR theory
should also work in low dimensions with the braid
group replacing the symmetric group, and that the
unitary braid group representations defined above
should be the ones occurring in quantum field
theory. The “statistical dimension” of the DHR
theory turns up as the square root of the index of a
subfactor (this connection was clearly established in
Longo (1989, 1990)). The mathematical issue of the
existence of quantum fields with braid statistics was
settled in Wassermann (1998) using the language
of loop group representations. Actual physical systems
with nonabelian braid statistics have not yet been
found but have been proposed in Freedman (2003)
as a mechanism for quantum computing.
Introduction


Kolmogorov-Arnol'd-Moser (KAM) theory deals
with the construction of quasiperiodic trajectories
in nearly integrable Hamiltonian systems and it was
motivated by classical problems in celestial
mechanics such as the n-body problem. Notwithstanding
the formidable bulk of results, ideas and
techniques produced by the founders of the modern
theory of dynamical systems, most notably by
H Poincaré and G D Birkhoff, the fundamental
question about the persistence under small perturbations
of invariant tori of an integrable Hamiltonian
system remained completely open until 1954. In that
year, A N Kolmogorov stated what is now usually
referred to as the KAM theorem (in the real-analytic
setting) and gave a precise outline of its proof,
presenting a strikingly new and powerful method to
overcome the so-called small-divisor problem
(resonances in Hamiltonian dynamics produce, in the
perturbation series, divisors which may become
arbitrarily small, making convergence arguments
extremely delicate). Subsequently, KAM theory has
been extended and applied to a large variety of
different problems, including infinite-dimensional
dynamical systems and partial differential equations
with Hamiltonian structure. However, establishing
the existence of quasiperiodic motions in the n-body
problem turned out to be a longer story, which only
very recently has reached a satisfactory level; the
point being that the n-body problems present strong
degeneracies, which violate the main hypotheses of
the KAM theorem.

This article gives an account of the ideas and
results concerning the construction of quasiperiodic
solutions in the planetary n-body problem. The
synopsis of the article is the following.

The next section gives the analytical description of
the planetary (1 + n)-body problem.

In the subsection “Kolmogorov’s theorem and the
RPC3BP (1954),” the original version of the KAM
theorem is recalled, giving an outline of its proof
and showing its implications for the simplest
many-body case, namely, the restricted, planar, and
circular three-body problem.

In the section “Arnol’d’s theorem,” the existence
of a positive measure set of initial data in phase
space giving rise to quasiperiodic motions near
coplanar and nearly circular unperturbed Keplerian
trajectories is presented. The rest of the section is
devoted to the proof of Arnol’d’s theorem following
the historical developments: Arnol’d’s proof (1963a)
for the planar three-body case is presented, the
extension to the spatial three-body case due to
Laskar and Robutel (1995) is discussed, and Herman’s
proof (in the form given by Féjoz in 2004)
of the general spatial (1 + n)-case is presented.

In the section “Lower dimensional tori,” a brief
discussion of the construction of lower-dimensional
elliptic tori bifurcating from the Keplerian
unperturbed motions is given (these results were
established in the early 2000s).

Finally, the problem of taking into account real
astronomical parameter values is considered and a
recent result on an application of (computer-assisted)
KAM techniques to the solar subsystem
formed by Sun, Jupiter, and the asteroid Victoria is
briefly mentioned.


The Planetary (1 + n)-Body Problem 


The evolution of (1 + n)-body systems (assimilated
to point masses) interacting only through gravitational
attraction is governed by Newton's equations.
If $u^{(i)} \in \mathbb{R}^3$ denotes the position of the $i$th body in a
given reference frame and if $m_i$ denotes its mass,
then Newton's equations read

\[
\frac{d^2 u^{(i)}}{dt^2} = - \sum_{0 \le j \le n,\ j \ne i} m_j \, \frac{u^{(i)} - u^{(j)}}{|u^{(i)} - u^{(j)}|^3}, \qquad i = 0, 1, \ldots, n \qquad [1]
\]

Here the gravitational constant is taken to be equal
to 1 (which amounts to rescaling the time $t$).
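Equations [1] are equivalent to the standard
Hamilton's equations corresponding to the
Hamiltonian function

\[
\mathcal{H}_{\rm New} := \sum_{i=0}^{n} \frac{|U^{(i)}|^2}{2 m_i} - \sum_{0 \le i < j \le n} \frac{m_i m_j}{|u^{(i)} - u^{(j)}|} \qquad [2]
\]

where $(U^{(i)}, u^{(i)})$ are standard symplectic variables
and the phase space is the “collisionless domain”
$\mathcal{M} := \{U^{(i)}, u^{(i)} \in \mathbb{R}^3 : u^{(i)} \ne u^{(j)},\ 0 \le i \ne j \le n\}$; the
symplectic form is the standard one: $\sum_i dU^{(i)} \wedge du^{(i)} := \sum_{i,k} dU^{(i)}_k \wedge du^{(i)}_k$;
$|\cdot|$ denotes the standard Euclidean norm. Introducing the
symplectic coordinate change $(U, u) = \phi_{\rm hel}(R, r)$,

\[
\phi_{\rm hel}: \quad
u^{(0)} = r^{(0)}, \quad u^{(i)} = r^{(0)} + r^{(i)} \quad (i = 1, \ldots, n), \qquad
U^{(0)} = R^{(0)} - \sum_{i=1}^{n} R^{(i)}, \quad U^{(i)} = R^{(i)} \quad (i = 1, \ldots, n) \qquad [3]
\]

one sees that the Hamiltonian $\mathcal{H}_{\rm hel} := \mathcal{H}_{\rm New} \circ \phi_{\rm hel}$
does not depend upon $r^{(0)}$ (recall that a local
diffeomorphism is called symplectic if it preserves
the symplectic form). This means that $R^{(0)}$ (= total
linear momentum) is a global integral of motion.
Without loss of generality, one can restrict attention
to the invariant manifold $\mathcal{M}_0 := \{R^{(0)} = 0\}$
(invariance of eqn [1] by changes of inertial reference
frames).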
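
Conservation of the total linear momentum can be checked directly on the right-hand side of eqn [1]: the mass-weighted accelerations sum to zero because the pairwise forces cancel. The following is a minimal numerical sketch; the function names and the sample masses and positions are illustrative assumptions.

```python
def accelerations(masses, positions):
    """Right-hand side of Newton's equations [1], gravitational constant 1."""
    n = len(masses)
    acc = []
    for i in range(n):
        a = [0.0, 0.0, 0.0]
        for j in range(n):
            if j == i:
                continue
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            dist3 = sum(c * c for c in d) ** 1.5  # |u_i - u_j|^3
            for k in range(3):
                a[k] -= masses[j] * d[k] / dist3
        acc.append(a)
    return acc


def total_force(masses, positions):
    """Sum over i of m_i * (acceleration of body i); vanishes identically."""
    acc = accelerations(masses, positions)
    return [sum(m * a[k] for m, a in zip(masses, acc)) for k in range(3)]


# Hypothetical "Sun plus two planets" configuration.
masses = [1.0, 0.001, 0.002]
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 1.0]]
print(total_force(masses, positions))  # zero up to rounding
```

Since the time derivative of the total momentum is this sum, its vanishing is the statement that $R^{(0)}$ is an integral of motion.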

In the “planetary” case, one assumes that one of 
the bodies, say i=0 (the Sun), has mass much larger 
than that of the other bodies (this accounts for the 
index “hel,” which stands for “heliocentric”). To 
make the perturbative character of the problem 
transparent, one may introduce the following rescal- 
ings. Let 


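\[
m_i = \varepsilon \bar m_i, \qquad
X^{(i)} := \frac{R^{(i)}}{\varepsilon\, m_0^{5/3}}, \qquad
x^{(i)} := \frac{r^{(i)}}{m_0^{2/3}} \qquad (i = 1, \ldots, n) \qquad [4]
\]

and rescale time by a factor $\varepsilon\, m_0^{7/3}$ (which amounts
to dividing the new Hamiltonian by such a
factor); then, the flow of the Hamiltonian $\mathcal{H}_{\rm hel}$ on
$\mathcal{M}_0$ is equivalent to the flow of the Hamiltonian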

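\[
\mathcal{H}_{\rm plt} := \sum_{i=1}^{n} \left( \frac{|X^{(i)}|^2}{2\mu_i} - \frac{\mu_i M_i}{|x^{(i)}|} \right)
+ \varepsilon \sum_{1 \le i < j \le n} \left( X^{(i)} \cdot X^{(j)} - \frac{\bar m_i \bar m_j}{m_0^2} \frac{1}{|x^{(i)} - x^{(j)}|} \right) \qquad [5]
\]

on the phase space $\mathcal{M} := \{X^{(i)}, x^{(i)} \in \mathbb{R}^3 : 1 \le i \le n \text{ and } 0 \ne x^{(i)} \ne x^{(j)}\}$
with respect to the standard
symplectic form $\sum_{i=1}^{n} dX^{(i)} \wedge dx^{(i)}$; the mass
parameters are defined as

\[
M_i := 1 + \varepsilon \frac{\bar m_i}{m_0}, \qquad \mu_i := \frac{\bar m_i}{m_0 + \varepsilon \bar m_i} \qquad [6]
\]
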

The following observations can be made: 


1. The Hamiltonian

\[
\mathcal{H}^{(0)}_{\rm plt} := \sum_{i=1}^{n} \left( \frac{|X^{(i)}|^2}{2\mu_i} - \frac{\mu_i M_i}{|x^{(i)}|} \right)
\]

is integrable and represents the sum of $n$ two-body
systems formed by the Sun and the $i$th
planet (disregarding the interaction with the
other planets).

2. The transformation $\phi_{\rm hel}$ in eqn [3] preserves
the total angular momentum $\hat C := \sum_{i=0}^{n} U^{(i)} \times u^{(i)}$,
which is a vector-valued integral for
$\mathcal{H}_{\rm New}$. Thus, the three components, $C_k$, of
$C := \sum_{i=1}^{n} X^{(i)} \times x^{(i)}$ (which is proportional to
$\hat C$ and is termed the “total angular momentum”),
are integrals for $\mathcal{H}_{\rm plt}$. The integrals $C_k$
do not commute: if $\{\cdot,\cdot\}$ denotes the standard
Poisson bracket, then $\{C_1, C_2\} = C_3$ (and, cyclically,
$\{C_2, C_3\} = C_1$, $\{C_3, C_1\} = C_2$). Nevertheless,
one can form two (independent) commuting
integrals, for example, $|C|^2$ and $C_3$. This shows
that the (spatial) (1 + n)-body problem has
$(3n - 2)$ degrees of freedom.

3. An important special case is the planar (1 + n)-body
problem. In such a case, one assumes that
all the “single” angular momenta $C^{(i)} := X^{(i)} \times x^{(i)}$
are parallel. In this case, the motion takes place
on a fixed plane orthogonal to $C$ and (up to a
rotation of the reference frame) one can take, as
symplectic variables, $X^{(i)}, x^{(i)} \in \mathbb{R}^2$. The Hamiltonian
$\mathcal{H}_{\rm plt}$ governing the dynamics of the planar
(1 + n)-body problem is, then, given by the right-hand
side of eqn [5] with $X^{(i)}, x^{(i)} \in \mathbb{R}^2$. Notice
that the planar (1 + n)-body problem has $2n$
degrees of freedom.

4. For a deeper understanding of the perturbation
theory of the planetary many-body problem, it is
necessary to find “good” sets of symplectic
coordinates, which the founders of celestial
mechanics (most notably, Jacobi, Delaunay, and
Poincaré) have done. In particular, Delaunay
introduced an analytic set of symplectic “action-angle”
variables. Recall the Delaunay variables
for the two-body “reduced Hamiltonian”

\[
\mathcal{H}_{\rm Kep} := \frac{|X|^2}{2\mu} - \frac{\mu M}{|x|}, \qquad X \in \mathbb{R}^3,\ 0 \ne x \in \mathbb{R}^3
\]

Let $\{k_1, k_2, k_3\}$ be a standard orthonormal basis
in the $x$-configuration space; let the angular
momentum $C := X \times x$ be nonparallel to $k_3$ and
let the energy $E = \mathcal{H}_{\rm Kep} < 0$. In such a case, $x(t)$
describes an ellipse lying in the plane orthogonal
to $C$, with focus in the origin and fixed symmetry
axes. Let $a$ be the semimajor axis of the ellipse
spanned by $x$; $i$ (the inclination) be the angle
between $k_3$ and $C$; $G := |C|$; $\Theta := G \cos i = C \cdot k_3$;
$L := \mu\sqrt{M a}$; $\ell$ be the mean anomaly of $x$ (:= $2\pi$
times the normalized area spanned by $x$ measured
from the perihelion $P$, which is the point
of the ellipse closest to the origin); $\theta$ be the
angle between $k_1$ and $N := k_3 \times C$ (:= oriented
“node”); and $g$ be the argument of the perihelion
(:= the angle between $N$ and $(O, P)$). Then
(letting $\mathbb{T} := \mathbb{R}/(2\pi\mathbb{Z})$)

\[
(L, G, \Theta) \in \{L > 0\} \times \{G > \Theta > 0\}, \qquad (\ell, g, \theta) \in \mathbb{T}^3
\]

are conjugate symplectic coordinates and if $\phi_{\rm Del}$
is the corresponding symplectic map, then
$\mathcal{H}_{\rm Kep} \circ \phi_{\rm Del} = -(\mu^3 M^2)/(2L^2)$.

Note that the Delaunay variables become
singular when $C$ is vertical (the node is no longer
defined) and in the circular limit (the perihelion
is not unique). In these cases different variables
have to be used.

5. Let $(X^{(i)}, x^{(i)}) = \phi_{\rm Del}((L_i, G_i, \Theta_i), (\ell_i, g_i, \theta_i))$. Then
$\mathcal{H}_{\rm plt}$ expressed in the Delaunay variables
$\{(L_i, G_i, \Theta_i), (\ell_i, g_i, \theta_i) : 1 \le i \le n\}$ becomes

\[
\mathcal{H}_{\rm Del} = \mathcal{H}^{(0)}_{\rm Del} + \varepsilon \mathcal{H}^{(1)}_{\rm Del}, \qquad
\mathcal{H}^{(0)}_{\rm Del} := - \sum_{i=1}^{n} \frac{\mu_i^3 M_i^2}{2 L_i^2} \qquad [8]
\]

Note that the number of action variables on
which the integrable Hamiltonian $\mathcal{H}^{(0)}_{\rm Del}$ depends
is strictly less than the number of degrees of
freedom. This “proper degeneracy,” as we shall
see in the next sections, brings in an essential
difficulty one has to face in the perturbative
approach to the many-body problem. In fact, this
feature of the many-body problem is common to
several other problems of celestial mechanics.
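
The relation between the Delaunay action $L = \mu\sqrt{Ma}$ and the Kepler energy can be checked numerically. The sketch below is illustrative: the function names and sample values are assumptions, and the formula $E = -\mu M/(2a)$ is the standard energy of a Kepler ellipse with semimajor axis $a$ for the reduced Hamiltonian $|X|^2/(2\mu) - \mu M/|x|$, which is not stated explicitly in the text above.

```python
from math import isclose, sqrt


def delaunay_L(mu, M, a):
    """Delaunay action L = mu * sqrt(M * a) for semimajor axis a."""
    return mu * sqrt(M * a)


def kepler_energy_from_L(mu, M, L):
    """Kepler Hamiltonian in Delaunay variables: -mu^3 M^2 / (2 L^2)."""
    return -(mu ** 3) * (M ** 2) / (2.0 * L ** 2)


def kepler_energy_from_axis(mu, M, a):
    """Standard energy of a Kepler ellipse: -mu M / (2 a)."""
    return -mu * M / (2.0 * a)


def h_del0(mus, Ms, Ls):
    """Integrable part of eqn [8]; it depends on the actions L_i only."""
    return sum(kepler_energy_from_L(mu, M, L) for mu, M, L in zip(mus, Ms, Ls))


mu, M, a = 0.3, 1.0003, 5.2  # hypothetical sample values
L = delaunay_L(mu, M, a)
print(isclose(kepler_energy_from_L(mu, M, L), kepler_energy_from_axis(mu, M, a)))
```

That `h_del0` takes no $G_i$ or $\Theta_i$ arguments is the proper degeneracy described above: the integrable part involves only $n$ of the actions.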
Maximal KAM Tori
Kolmogorov’s Theorem and the RPC3BP (1954) 


Kolmogorov’s invariant tori theorem deals with 
the persistence, in nearly integrable Hamiltonian 
systems, of Lagrangian (maximal) tori, which, in 
general, foliate the integrable limit. Kolmogorov 
(1954) stated his theorem and gave a precise 
outline of the proof. Let us briefly recall this 
milestone of the modern theory of dynamical 
systems. 

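Let $\mathcal{M} := B^d \times \mathbb{T}^d$ ($B^d$ being a $d$-dimensional ball
in $\mathbb{R}^d$ centered at the origin) be endowed with the
standard symplectic form $dy \wedge dx := \sum_i dy_i \wedge dx_i$
($y \in B^d$, $x \in \mathbb{T}^d$). A Hamiltonian function $N$ on $\mathcal{M}$
having a Lagrangian invariant $d$-torus of energy $E$
on which the $N$-flow is conjugated to the linear
dense translation $x \to x + \omega t$, $\omega \in \mathbb{R}^d \setminus \mathbb{Q}^d$, can be
put in the form

\[
N := E + \omega \cdot y + Q(y, x), \qquad
\partial_y^\alpha Q(0, x) = 0, \quad \forall \alpha \in \mathbb{N}^d,\ |\alpha| \le 1 \qquad [9]
\]

(as usual, $|\alpha| = \alpha_1 + \cdots + \alpha_d$, $\omega \cdot y := \sum_{i=1}^{d} \omega_i y_i$,
and $\partial_y^\alpha = \partial_{y_1}^{\alpha_1} \cdots \partial_{y_d}^{\alpha_d}$); in such a case, the Hamiltonian
$N$ is said to be in Kolmogorov normal form. The
vector $\omega$ is called the “frequency vector” of the
invariant torus $\{y = 0\} \times \mathbb{T}^d$. The Hamiltonian $N$ is
said to be nondegenerate if

\[
\det \langle \partial_y^2 Q(0, \cdot) \rangle \ne 0 \qquad [10]
\]

where the brackets denote average over $\mathbb{T}^d$ and $\partial_y^2$
the Hessian with respect to the $y$-variables.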

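We recall that a vector $\omega \in \mathbb{R}^d$ is said to be
“Diophantine” if there exist $\kappa > 0$ and $\tau \ge d - 1$
such that

\[
|\omega \cdot k| \ge \frac{\kappa}{|k|^\tau}, \qquad \forall k \in \mathbb{Z}^d \setminus \{0\} \qquad [11]
\]

The set $\mathcal{D}^d$ of all Diophantine vectors in $\mathbb{R}^d$ is a set of
full Lebesgue measure. We also recall that a Hamiltonian
trajectory is called quasiperiodic with (rationally
independent) frequency $\omega \in \mathbb{R}^d$ if it is conjugate to
the linear translation $\theta \in \mathbb{T}^d \to \theta + \omega t \in \mathbb{T}^d$.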
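
Condition [11] can be explored numerically on a finite range of integer vectors. The sketch below is only an illustration: the function name, the choice of the norm $|k| = \sum_i |k_i|$ and the cutoff `max_norm` are assumptions, and a finite search can exhibit a violation but can never prove that a vector is Diophantine.

```python
from itertools import product
from math import sqrt


def violates_diophantine(omega, kappa, tau, max_norm):
    """Return some k with |omega . k| < kappa / |k|^tau, or None if none is found.

    Only integer vectors k with 0 < |k|_1 <= max_norm are tested.
    """
    d = len(omega)
    for k in product(range(-max_norm, max_norm + 1), repeat=d):
        norm = sum(abs(c) for c in k)
        if norm == 0 or norm > max_norm:
            continue
        if abs(sum(w * c for w, c in zip(omega, k))) < kappa / norm ** tau:
            return k
    return None


golden = (1.0 + sqrt(5.0)) / 2.0
# The golden-mean vector passes with tau = d - 1 = 1 on this range.
print(violates_diophantine((1.0, golden), 0.5, 1, 40))
# A resonant vector fails: k = (1, -2) or a multiple gives omega . k = 0.
print(violates_diophantine((1.0, 0.5), 0.5, 1, 40))
```

The resonant example shows the mechanism behind small divisors: an integer relation $\omega \cdot k = 0$ makes the corresponding divisor vanish, while for the golden-mean vector the divisors stay bounded below at the rate $\kappa/|k|$ on the tested range.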


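Theorem (Kolmogorov 1954) Consider a one-parameter
family of real-analytic Hamiltonian functions
$H_\varepsilon := N + \varepsilon P$ where $N$ is in Kolmogorov normal
form (as in eqn [9]) and $\varepsilon \in \mathbb{R}$. Assume that $\omega$ is
Diophantine and that $N$ is nondegenerate. Then,
there exists $\varepsilon_0 > 0$ and, for any $|\varepsilon| \le \varepsilon_0$, a real-analytic
symplectic transformation $\phi_\varepsilon : \mathcal{M} \to \mathcal{M}$ putting $H_\varepsilon$ in
Kolmogorov normal form, $H_\varepsilon \circ \phi_\varepsilon = N_\varepsilon$, with
$N_\varepsilon := E_\varepsilon + \omega \cdot y' + Q_\varepsilon(y', x')$. Furthermore, $|E_\varepsilon - E|$,
$\|\phi_\varepsilon - {\rm id}\|_{C^2}$, and $\|Q_\varepsilon - Q\|_{C^2}$ are small with $\varepsilon$.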




In other words, the Lagrangian unperturbed torus
$\mathcal{T}_0 := \{y = 0\} \times \mathbb{T}^d$ persists under small perturbation
and is smoothly deformed into the $H_\varepsilon$-invariant
torus $\mathcal{T}_\varepsilon := \phi_\varepsilon(\{y' = 0\} \times \mathbb{T}^d)$; the dynamics on such
a torus, for all $|\varepsilon| \le \varepsilon_0$, consists of dense quasiperiodic
trajectories. Note that the $H_\varepsilon$-flow on $\mathcal{T}_\varepsilon$
is analytically conjugated by $\phi_\varepsilon$ to the translation
$x' \to x' + \omega t$ with the same frequency vector as $N$,
while the energy of $\mathcal{T}_\varepsilon$, namely $E_\varepsilon$, is in general
different from the energy $E$ of $\mathcal{T}_0$.

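Kolmogorov's proof is based on an iterative
(Newton) scheme. The map $\phi_\varepsilon$ is obtained
as $\lim_{k \to \infty} \phi_1 \circ \cdots \circ \phi_k$, where the $\phi_j$'s are
($\varepsilon$-dependent) symplectic transformations of $\mathcal{M}$
successively closer to the identity. It is enough
to describe the construction of $\phi_1$; $\phi_2$ is
then obtained by replacing $H_\varepsilon$ with $H_\varepsilon \circ \phi_1$,
and so on. The map $\phi_1$ is $\varepsilon$-close to the identity
and it is generated by $g(y', x) := y' \cdot x + \varepsilon (b \cdot x + s(x) + y' \cdot a(x))$,
where $s$ and $a$ are (resp.
scalar- and vector-valued) real-analytic functions
on $\mathbb{T}^d$ with zero average and $b \in \mathbb{R}^d$; this means
that the symplectic map $\phi_1 : (y', x') \to (y, x)$ is
implicitly given by the relations $y = \partial_x g$ and
$x' = \partial_{y'} g$. It is easy to see that there exists a unique
$g$ of the above form such that, for a suitable $\varepsilon_0 > 0$,

\[
H_\varepsilon \circ \phi_1 = E_1 + \omega \cdot y' + Q_1(y', x') + \varepsilon^2 P_1, \qquad \forall\, |\varepsilon| \le \varepsilon_0 \qquad [12]
\]

with $\partial_{y'}^\alpha Q_1(0, x') = 0$ for any $\alpha \in \mathbb{N}^d$ with $|\alpha| \le 1$; here,
$E_1$, $Q_1$, and $P_1$ depend on $\varepsilon$ and, for a suitable $c_1 > 0$
and for $|\varepsilon| \le \varepsilon_0$, $|E - E_1| \le c_1 |\varepsilon|$, $\|Q - Q_1\|_{C^2} \le c_1 |\varepsilon|$,
and $\|P_1\|_{C^2} \le c_1$.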

Notice that the symplectic transformation $\phi_1$ is
actually the composition of two “elementary”
transformations: $\phi_1 = \phi_1^{(1)} \circ \phi_1^{(2)}$, where $\phi_1^{(2)} : (y', x') \to (\eta, \xi)$
is the symplectic lift of the $\mathbb{T}^d$-diffeomorphism given
by $x' = \xi + \varepsilon a(\xi)$ (i.e., $\phi_1^{(2)}$ is the symplectic map
generated by $y' \cdot \xi + \varepsilon y' \cdot a(\xi)$), while $\phi_1^{(1)} : (\eta, \xi) \to (y, x)$
is the angle-dependent action translation generated
by $\eta \cdot x + \varepsilon (b \cdot x + s(x))$; $\phi_1^{(2)}$ acts in the “angle
direction” and straightens out the flow up to order
$O(\varepsilon^2)$, while $\phi_1^{(1)}$ acts in the “action direction” and is
needed to keep the frequency of the torus fixed.

Since $H_\varepsilon \circ \phi_1 =: N_1 + \varepsilon^2 P_1$ is again a perturbation
of a nondegenerate Kolmogorov normal form (with
the same frequency vector $\omega$), one can repeat the
construction, obtaining a new Hamiltonian of
the form $N_2 + \varepsilon^4 P_2$. Iterating, after $k$ steps, one gets
a Hamiltonian $N_k + \varepsilon^{2^k} P_k$. Carrying out the
(straightforward but lengthy) estimates, one can
check that $\|P_k\|_{C^2} \le c_k \le c^{2^k}$, for a suitable constant
$c > 1$ independent of $k$ (the fast growth of the
constant $c_k$ is due to the presence of the small
divisors appearing in the explicit construction of the
symplectic transformations $\phi_k$). Thus, it is clear that
taking $\varepsilon_0$ small enough the iterative procedure
converges (superexponentially fast) yielding the
thesis of the above theorem.
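
The superexponential convergence can be read off from the bound $\|\varepsilon^{2^k} P_k\|_{C^2} \le (c\,|\varepsilon|)^{2^k}$: each step squares the previous bound, so the remainder decays as soon as $c\,|\varepsilon| < 1$. The following sketch simply tabulates this bound; the function name and the numerical values of $c$ and $\varepsilon$ are hypothetical and are not taken from the actual estimates.

```python
def remainder_bounds(eps, c, steps):
    """Upper bounds (c*|eps|)^(2^k) on the size of eps^(2^k) * P_k, k = 1..steps."""
    return [(c * abs(eps)) ** (2 ** k) for k in range(1, steps + 1)]


# With c * eps = 0.1 the bounds are 1e-2, 1e-4, 1e-8, ... (squared at each step).
for k, bound in enumerate(remainder_bounds(0.01, 10.0, 5), start=1):
    print(k, bound)
```

If instead $c\,|\varepsilon| \ge 1$, the same bound grows and gives no information, which is why the theorem requires $\varepsilon_0$ to be small compared with $1/c$.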


6. While the statement of the invariant tori theorem
and the outline of the proof are very clearly
explained in Kolmogorov (1954), Kolmogorov
neither filled in the details nor gave any estimates.
Some years later, Arnol'd (1963a) published a
detailed proof, which, however, did not follow
Kolmogorov's idea. In the same year, J K Moser
published his invariant curve theorem (for
area-preserving twist diffeomorphisms of the annulus)
in the smooth setting. The bulk of techniques and
theorems stemming from these works is
normally referred to as KAM theory; for reviews,
see Arnol'd (1988) or Bost (1984-85). A very
complete version of the “KAM theorem” both in
the real-analytic and in the smooth case (with
optimal smoothness assumptions) is given in
Salamon (2004); the proof of the real-analytic
part is based on Kolmogorov's scheme. The
KAM theory of M Herman, used in his approach
to the planetary problem, is based on the abstract
functional theoretical approach of R Hamilton
(which, in turn, is a development of the Nash-Moser
implicit function theorem; see Bost (1984-85) for
references); it is interesting, however, to note that
the heart of Herman's KAM method is based on
the above-mentioned Kolmogorov transformation
$\phi_1$ (compare Féjoz (2002)).

7. In the nearly integrable case, one considers a
one-parameter family of Hamiltonians $H_0(I) + \varepsilon H_1(I, x)$
with $(I, x) \in \mathcal{M} := U \times \mathbb{T}^d$ standard symplectic
action-angle variables, $U$ being an open subset of
$\mathbb{R}^d$. When $\varepsilon = 0$, the phase space $\mathcal{M}$ is foliated
by $H_0$-invariant tori $\{I_0\} \times \mathbb{T}^d$, on which the flow
is given by $x \to x + \partial_I H_0(I_0)\, t$. If $I_0$ is
such that $\omega := \partial_I H_0(I_0)$ is Diophantine and if
$\det \partial_I^2 H_0(I_0) \ne 0$, then from Kolmogorov's theorem
it follows that the torus $\{I_0\} \times \mathbb{T}^d$ persists under
perturbation. In fact, introduce the symplectic
variables $(y, x)$ with $y = I - I_0$ and let $N(y) := H_0(I_0 + y)$,
which by Taylor's formula can be
written as $H_0(I_0) + \omega \cdot y + Q(y)$ with $Q(y)$ quadratic
in $y$ and $\partial_y^2 Q(0) = \partial_I^2 H_0(I_0)$ invertible. One can
then apply Kolmogorov's theorem with $P(y, x) := H_1(I_0 + y, x)$.

Notice that Kolmogorov's nondegeneracy condition
$\det \partial_I^2 H_0(I_0) \ne 0$ simply means that the
frequency map

\[
I \in B^d \subset U \to \omega(I) := \partial_I H_0(I) \qquad [13]
\]

is a local diffeomorphism ($B^d$ being a ball
around $I_0$).

8. The symplectic structure implies that if $n$ denotes
the number of degrees of freedom (i.e., half of the
dimension of the phase space) and $d$ is the
number of independent frequencies of a quasiperiodic
motion, then $d \le n$; if $d = n$, the quasiperiodic
motion is called maximal. Kolmogorov’s
theorem gives sufficient conditions in order to get
maximal quasiperiodic solutions. In fact, Kolmogorov's
nondegeneracy condition is an open
condition and the set of Diophantine vectors is
a set of full Lebesgue measure. Thus, in general,
Kolmogorov’s theorem yields an invariant set of
positive measure spanned by maximal quasiperiodic
trajectories.


As mentioned above, the planetary many-body
models are properly degenerate and violate
Kolmogorov's nondegeneracy conditions and,
hence, Kolmogorov's theorem (clearly motivated
by celestial mechanics) cannot be applied.

There is, however, an important case to which a
slight variation of Kolmogorov's theorem can be
applied (Kolmogorov did not mention this in 1954).
The case referred to here is the simplest nontrivial
three-body problem, namely, the restricted, planar,
and circular three-body problem (RPC3BP for short).
This model, largely investigated by Poincaré, deals
with an asteroid of “zero mass” moving on the plane
containing the trajectory of two unperturbed major
bodies (say, Sun and Jupiter) revolving on a Keplerian
circle. The mathematical model for the restricted
three-body problem is obtained by taking $n = 2$ and
setting $m_2 = 0$ in eqn [1]: the equations for the two
major bodies ($i = 0, 1$) decouple from the equation
for the asteroid ($i = 2$) and form an integrable
two-body system; the problem then consists in studying
the evolution of the asteroid $u^{(2)}(t)$ in the given
gravitational field of the primaries. In the circular
and planar cases, the motion of the two primaries is
assumed to be circular and the motion of the
asteroid is assumed to take place on the plane
containing the motion of the two primaries; in fact
(to avoid collisions), one considers either inner or
outer (with respect to the circle described by the
relative motion of the primaries) asteroid motions.
To describe the Hamiltonian $\mathcal{H}_{\rm rcp}$ governing the
motion of the RPC3BP, introduce planar
Delaunay variables $((L, G), (\ell, \hat g))$ for the asteroid
(better, for the reduced heliocentric Sun-asteroid
system). Such variables, which are closely related to
the above (spatial) Delaunay variables, have the
following physical interpretation: $G$ is proportional
to the absolute value of the angular momentum of
the asteroid, $L$ is proportional to the square root of
the semimajor axis of the instantaneous Sun-asteroid
ellipse, $\ell$ is the mean anomaly of the
asteroid, while $\hat g$ is the argument of the perihelion.
Then, in suitably normalized units, the Hamiltonian
governing the RPC3BP is given by

\[
\mathcal{H}_{\rm rcp}(L, G, \ell, g; \varepsilon) := -\frac{1}{2L^2} - G + \varepsilon \mathcal{H}_1(L, G, \ell, g; \varepsilon) \qquad [14]
\]

where $g := \hat g - \tau$, $\tau \in \mathbb{T}$ being the longitude of Jupiter;
the variables $((L, G), (\ell, g))$ are symplectic coordinates
(with respect to the standard symplectic form);
the normalizations have been chosen so that the
relative motion of the primary bodies is $2\pi$-periodic
and their distance is 1; the parameter $\varepsilon$ is (essentially)
the ratio between the masses of the primaries; the
perturbation $\mathcal{H}_1$ is the function $x^{(2)} \cdot x^{(1)} - 1/|x^{(2)} - x^{(1)}|$
expressed in the above variables, $x^{(2)}$ being the
heliocentric coordinate of the asteroid and $x^{(1)}$ that of
the planet (Jupiter): such a function is real-analytic on
$\{0 < G < L\} \times \mathbb{T}^2$ and for small $\varepsilon$ (for complete
details, see, e.g., Celletti and Chierchia (2003)).
The integrable limit

\[
\mathcal{H}^{(0)}_{\rm rcp} := \mathcal{H}_{\rm rcp}\big|_{\varepsilon = 0} = -\frac{1}{2L^2} - G
\]

has vanishing Hessian and, hence, violates
Kolmogorov's nondegeneracy condition (as
described in item (7) above). However, there is
another nondegeneracy condition which leads to a
simple variation of Kolmogorov's theorem, as
explained briefly below.

Kolmogorov's nondegeneracy condition
$\det \partial_I^2 H_0(I_0) \neq 0$ allows one to fix $d$
parameters, namely, the $d$ components of the (Diophantine)
frequency vector $\omega = \partial_I H_0(I_0)$. Instead of fixing
such parameters, one may fix the energy $E = H_0(I_0)$ together
with the direction $\{s\omega : s \in \mathbb{R}\}$ of the frequency
vector: for example, in a neighborhood where $\omega_d \neq 0$, one
can fix $E$ and $\omega_i/\omega_d$ for $1 \le i \le d-1$. Notice
also that if $\omega$ is Diophantine, then so is $s\omega$ for any
$s \neq 0$ (with the same exponent $\tau$ and a rescaled Diophantine
constant). Now, it is easy to check that the map
$I \in H_0^{-1}(E) \to (\omega_1/\omega_d, \ldots, \omega_{d-1}/\omega_d)$
is (at fixed energy $E$) a local diffeomorphism if and only if the
$(d+1)\times(d+1)$ matrix

$$\begin{pmatrix} \partial_I^2 H_0 & \partial_I H_0 \\ \partial_I H_0 & 0 \end{pmatrix}$$

evaluated at $I_0$ is invertible (here the vector $\partial_I H_0$
in the upper right corner has to be interpreted as a column while
the vector $\partial_I H_0$ in the lower left corner has to be
interpreted as a row). Such an "isoenergetic nondegeneracy"
condition, rephrased in terms of Kolmogorov's normal forms, becomes

$$\det\begin{pmatrix} \langle \partial_y^2 Q(0,\cdot)\rangle & \omega \\ \omega & 0 \end{pmatrix} \neq 0$$


Kolmogorov's theorem can be easily adapted to the fixed energy
case. Assuming that $\omega$ is Diophantine and that $N$ is
isoenergetically nondegenerate, the same conclusion as in
Kolmogorov's theorem holds with
$N_* := E + \omega_*\cdot y' + Q_*(y',x')$, where
$\omega_* := \alpha\omega$ and $|\alpha - 1|$ is small with
$\varepsilon$.

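In the RCP3BP case, the isoenergetic nondegeneracy is met, since

$$\det\begin{pmatrix} \partial^2_{(L,G)} H^{(0)}_{\rm rcp} & \partial_{(L,G)} H^{(0)}_{\rm rcp} \\ \partial_{(L,G)} H^{(0)}_{\rm rcp} & 0 \end{pmatrix} = \frac{3}{L^4} \neq 0$$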
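The two nondegeneracy conditions can be compared numerically for the integrable limit $-1/(2L^2) - G$. The following is a minimal sketch, not part of the original exposition: the function names are illustrative, and the derivatives are entered by hand rather than computed symbolically.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def hessian_det(L):
    """Determinant of the 2x2 Hessian of H(L, G) = -1/(2 L^2) - G.

    Only the (L, L) entry is nonzero, so the determinant vanishes.
    """
    h_ll, h_lg, h_gg = -3.0 / L**4, 0.0, 0.0
    return h_ll * h_gg - h_lg * h_lg


def bordered_matrix(L):
    """Hessian of H bordered by the gradient (isoenergetic condition)."""
    d_l, d_g = 1.0 / L**3, -1.0        # gradient of H with respect to (L, G)
    h_ll = -3.0 / L**4                 # the only nonzero second derivative
    return [[h_ll, 0.0, d_l],
            [0.0, 0.0, d_g],
            [d_l, d_g, 0.0]]


for L in (0.5, 1.0, 2.0):
    # Columns: L, plain Hessian determinant, bordered determinant, 3 / L^4
    print(L, hessian_det(L), det3(bordered_matrix(L)), 3.0 / L**4)
```

The plain Hessian determinant is zero for every $L$, while the bordered determinant agrees with $3/L^4$, which is why fixing the energy rescues the argument.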


Therefore, one can conclude that on each negative 
energy level, the RCP3BP admits a positive measure 
set of phase points, whose time evolution lies on two- 
dimensional invariant tori (on which the flow is 
analytically conjugate to linear translation by a 
Diophantine vector), provided the mass ratio of the 
primary bodies is small enough; such persistent tori 
are a slight deformation of the unperturbed “Kepler- 
ian" tori corresponding to the asteroid and the Sun 
revolving on a Keplerian ellipse on the plane where 
the Sun and the major planet describe a circular orbit. 

In fact, one can say more. The phase space for the RCP3BP is four
dimensional, the energy levels are three dimensional, and
Kolmogorov's invariant tori are two dimensional. Thus, a Kolmogorov
torus separates the energy level on which it lies into two
invariant components, and two Kolmogorov tori form the boundary of
a compact invariant region, so that any motion starting in such a
region will never leave it. Thus, the RCP3BP is "totally stable":
in a neighborhood of any phase point of negative energy, if the
mass ratio of the primary bodies is small enough, the asteroid
stays forever on a nearly Keplerian ellipse with nearly fixed
orbital elements $L$ and $G$.


Arnol’d’s Theorem 


Consider again the planetary (1 + n)-body problem governed by the
Hamiltonian $\mathcal{H}_{\rm plt}$ in eqn [5]. In the integrable
approximation, governed by the Hamiltonian
$\mathcal{H}^{(0)}_{\rm plt}$, the $n$ planets describe Keplerian
ellipses focused on the Sun. Arnol'd (1963b) has stated the
following theorem.


Theorem (Arnol'd 1963b) Let $\varepsilon > 0$ be small enough.
Then, there exists a bounded, $\mathcal{H}_{\rm plt}$-invariant set
$\mathcal{F}(\varepsilon) \subset \mathcal{M}$ of positive Lebesgue
measure corresponding to planetary motions with bounded relative
distances; $\mathcal{F}(0)$ corresponds to Keplerian ellipses with
small eccentricities and small relative inclinations.


This theorem represents a major achievement in celestial
mechanics, solving a mathematical problem that had been open for
more than three centuries. Arnol'd (1963b) gave a complete proof of
this result only in the planar three-body case and gave some
indications of how to extend his approach to the general
situation. However, giving a full proof of Arnol'd's theorem in the
general case turned out to be more than a technical problem, and
new ideas were needed: the complete proof (due, essentially, to
M Herman) was given only in 2004.

In the following subsections, we briefly review 
the history and the ideas related to the proof of 
Arnol'd's theorem. As for credits: the proof of Arnol'd's 
theorem in the planar 3BP case is due to Arnol'd himself 
(Arnol'd 1963b); the spatial 3BP case is due to Laskar 
and Robutel (1995) and Robutel (1995); the general 
case is due to Herman (1998) and Féjoz (2004). The 
exposition we have given does not always follow the 
original references. 


The planar three-body problem Recall the Hamiltonian
$\mathcal{H}_{\rm pln}$ of the planar (1 + n)-body problem given in
item (3) of the section "The planetary (1 + n)-body problem." A
convenient set of symplectic variables for nearly circular motions
is given by the "planar Poincaré variables." To describe such
variables, consider a single, planar two-body system with
Hamiltonian


2 
XT E XecR?, 
2L |x] 


(with respect to dX A dx) [16] 


04x ER?’ 


and introduce, as done before formula [14] for $H_{\rm rcp}$,
planar Delaunay variables $((L, G), (\ell, g))$ (here, $g$ is the
argument of the perihelion). To remove the singularity of the
Delaunay variables near zero eccentricities, Poincaré introduced
variables $((\Lambda, \eta), (\lambda, \xi))$ defined by the
following formulas:

$$\Lambda = L, \quad H = L - G, \quad \lambda = \ell + g, \quad h = -g, \qquad \sqrt{2H}\cos h = \eta, \quad \sqrt{2H}\sin h = \xi \qquad [17]$$

As Poincaré showed, such variables are symplectic and analytic in
a neighborhood of $(0,\infty)\times\mathbb{T}\times\{0,0\}$; notice
that the symplectic map
$((\Lambda, \eta), (\lambda, \xi)) \to (X, x)$ depends on the
parameters $\mu$, $M$, and $\varepsilon$. In Poincaré variables,
the two-body Hamiltonian in eqn [16]




becomes —4/(2A2), with &:— (u/mo) /M. Now, 
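re-insert the index $i$, let
$\phi_i : ((\Lambda_i, \eta_i), (\lambda_i, \xi_i)) \to (X^{(i)}, x^{(i)})$
and
$\phi(\Lambda, \eta, \lambda, \xi) := (\phi_1(\Lambda_1, \eta_1, \lambda_1, \xi_1), \ldots, \phi_n(\Lambda_n, \eta_n, \lambda_n, \xi_n))$.
Then, the Hamiltonian for the planar (1 + n)-body problem takes the
form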


pln o = Ho(A) T EHI (A, A, T]. E) 


2. qycomp! princ 


[13 33 - Í 
where the so-called “complementary part" Hi"? 
and the "principal part" *( ^ of the perturbation 


are, respectively, the functions 


(i). yi) pip; — 1 
P» X? .X'? and M à ei <x] [19] 
SI<ISn IXicjen 0 


expressed in Poincaré variables. 

The scheme of proof of Arnol'd's theorem in the planar, three-body
case (one star, $n = 2$ planets) is as follows. The Hamiltonian is
given by eqn [13] with $n = 2$; the phase space is eight
dimensional (four degrees of freedom). This system, as mentioned
several times, is properly degenerate and Kolmogorov's theorem
cannot be applied directly; furthermore, a full (four-dimensional)
set of action variables needs to be identified.

A first observation is that, in the planetary model, there are
"fast variables" (the $\lambda_i$'s, describing the revolutions of
the planets) and "secular variables" (the $\eta_i$'s and
$\xi_i$'s, describing the variations of position and shape of the
instantaneous Keplerian ellipses). By averaging theory (see, e.g.,
Arnol'd (1998)), one can "neglect," in nonresonant regions, the
fast-angle dependence up to high order in $\varepsilon$, obtaining
an effective Hamiltonian which, up to $O(\varepsilon^2)$, is given
by the "secular" Hamiltonian

$$H_{\rm sec} := H_0(\Lambda) + \varepsilon \bar H_1(\Lambda, \eta, \xi), \qquad \bar H_1(\Lambda, \eta, \xi) := \int_{\mathbb{T}^2} H_1(\Lambda, \lambda, \eta, \xi)\,\frac{d\lambda}{(2\pi)^2}$$

"Nonresonant region" means, here, an open $\Lambda$-set where
$\partial_\Lambda H_0 \cdot k \neq 0$ for $k \in \mathbb{Z}^2$ with
$0 < |k_1| + |k_2| \le K$, for a suitable $K > 1$.
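The averaging step can be illustrated on a toy perturbation. The sketch below is an assumption-laden example, not the planetary $H_1$: `toy_h1` is a hypothetical function with a secular part plus terms oscillating in the fast angles, and the average over $\mathbb{T}^2$ is approximated by a uniform grid.

```python
import math


def angle_average(func, n_grid=32):
    """Average func(lam1, lam2) over the torus T^2 on a uniform grid."""
    step = 2.0 * math.pi / n_grid
    total = 0.0
    for i in range(n_grid):
        for j in range(n_grid):
            total += func(i * step, j * step)
    return total / n_grid**2


def toy_h1(lam1, lam2, eta, xi):
    """Hypothetical perturbation: secular part plus fast oscillating terms."""
    secular = 0.5 * (eta**2 + xi**2)
    fast = eta * math.cos(lam1 - 2.0 * lam2) + xi * math.sin(3.0 * lam1)
    return secular + fast


eta, xi = 0.1, -0.2
averaged = angle_average(lambda a, b: toy_h1(a, b, eta, xi))
# The fast terms average out; only the secular part survives.
print(averaged, 0.5 * (eta**2 + xi**2))
```

For trigonometric polynomials of low degree the uniform grid integrates the oscillating terms to zero up to rounding, so the printed values agree closely.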

In order to analyze the secular Hamiltonian, we shall briefly
consider $\bar H_1$ as a function of the symplectic variables
$\eta$ and $\xi$, regarding the "slow actions" $\Lambda_i$ as
parameters.

For symmetry reasons, $\bar H_1$ is even in $(\eta, \xi)$ and the
point $(\eta, \xi) = (0,0)$ is an elliptic equilibrium for
$\bar H_1$: the eigenvalues of the matrix
$S\,\partial^2_{(\eta,\xi)} \bar H_1(\Lambda, 0, 0)$, $S$ being the
standard symplectic matrix, are purely imaginary numbers
$\{\pm i\Omega_1, \pm i\Omega_2\}$. The real numbers $\{\Omega_j\}$
are symplectic invariants of the secular Hamiltonian and are
usually called first (or linear) Birkhoff invariants. In a
neighborhood of an elliptic equilibrium, one can use Birkhoff's
normal form theory (see, e.g., Siegel (1971)): if the linear
invariants $(\Omega_1, \Omega_2)$ are nonresonant up to order $r$
(i.e., if $\Omega\cdot k := \Omega_1 k_1 + \Omega_2 k_2 \neq 0$ for
any $k \in \mathbb{Z}^2$ such that $0 < |k_1| + |k_2| \le r$), then
one can find a symplectic transformation $\phi_{\rm Bir}$ so that

$$\bar H_1 \circ \phi_{\rm Bir} = F(J_1, J_2; \Lambda) + o_r, \qquad J_i = \frac{\eta_i^2 + \xi_i^2}{2} \qquad [21]$$

where $F$ is a polynomial of degree $[r/2]$ of the form
$\Omega_1 J_1 + \Omega_2 J_2 + (1/2) M J\cdot J + \cdots$,
$M = M(\Lambda)$ being a $(2\times 2)$ matrix (and
$o_r/|J|^{r/2} \to 0$ as $|J| \to 0$). Arnol'd, using computations
performed by Le Verrier, checked the nonresonance condition up to
order $r = 6$ in the asymptotic regime $a_1/a_2 \to 0$ (where $a_j$
denote the semimajor axes of approximate Keplerian ellipses of the
two planets); these computations represent one of the most
delicate parts of the paper.
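Nonresonance up to a finite order is a finite check, which the following sketch spells out. It is an illustration with made-up frequency values, not Le Verrier's data; the function names and the floating-point tolerance `tol` are assumptions, and a tolerance test is only a numerical proxy for the exact condition $\Omega\cdot k \neq 0$.

```python
from itertools import product


def resonant_vectors(omega, order, tol=1e-9):
    """Integer vectors k != 0 with |k|_1 <= order and |omega . k| <= tol."""
    hits = []
    for k in product(range(-order, order + 1), repeat=len(omega)):
        norm1 = sum(abs(c) for c in k)
        if 0 < norm1 <= order:
            if abs(sum(w * c for w, c in zip(omega, k))) <= tol:
                hits.append(k)
    return hits


def is_nonresonant(omega, order, tol=1e-9):
    """True if no resonance of order <= `order` is detected."""
    return not resonant_vectors(omega, order, tol)


# Hypothetical frequency pairs, chosen only to exercise the check.
print(is_nonresonant((1.0, 2.0**0.5), 6))   # irrational ratio
print(is_nonresonant((1.0, -2.0), 2))       # no resonance of order <= 2
print(resonant_vectors((1.0, -2.0), 3))     # 2*1 + 1*(-2) = 0 has order 3
```

The last example shows why the order matters: the pair $(1, -2)$ passes the test at order 2 but fails at order 3, where $k = (2, 1)$ and its negative appear.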

Thus, combining averaging theory and Birkhoff normal form theory,
one can construct a symplectic change of variables defined on an
open subset of the phase space (avoiding some linear resonances)
$(\Lambda, \lambda, \eta, \xi) \to (\Lambda', \lambda', J, \varphi)$,
where $\eta_j + i\xi_j = \sqrt{2J_j}\exp(i\varphi_j)$, casting the
three-body Hamiltonian into the form


Ho(A’) + €(Q(A’) -J + 4M(A’)J - J) 
+F (A, J) +P F2(A',N,J, p) 
= HolA', J; 8) +P F3(N , XN, J, v) [22] 


for a suitable prefixed order $p > 3$; notice that the
nonresonance condition needed to apply averaging theory is not
particularly hard to check since it involves the unperturbed and
completely explicit Kepler Hamiltonian $H_0$. The idea is now to
consider $\varepsilon^p F_3$ as a perturbation of the completely
integrable Hamiltonian $\tilde H_0$ and to apply Kolmogorov's
theorem. Finally, one can check Kolmogorov's nondegeneracy
condition, which, since

$$\det \partial^2_{(\Lambda',J)} \tilde H_0(\Lambda', J; \varepsilon) = \varepsilon^2\big(\det H_0''(\Lambda')\,\det M + O(\varepsilon)\big)$$

amounts to checking the invertibility of the matrix $M$. Such a
condition is also checked in Arnol'd (1963b) with the aid of
Le Verrier's tables and in the asymptotic regime $a_1/a_2 \to 0$.


The spatial three-body problem In order to extend 
the previous argument to the spatial case, Arnol'd 
suggested connecting the planar and spatial case 
through a limiting procedure. Such strategy presents 
analytical problems (the symplectic variables for the
spatial case become singular in the planar limit), 
which have not been overcome. However, the 
particular structure of the three-body case allows 
one to derive a four-degree-of-freedom Hamiltonian, 
to which the proof of the planar case can be easily 
adapted. The procedure described below is based on 
the classical Jacobi's reduction of the nodes. 

First, we introduce a convenient set of symplectic variables. Let,
for $i = 1, 2$, $((L_i, G_i, \Theta_i), (\ell_i, g_i, \theta_i))$
denote the Delaunay variables introduced in items (5) and (6)
above: these are the Delaunay variables associated to the two-body
system formed by the Sun and the $i$th planet. Then, as Poincaré
showed, the variables
$((\Lambda_i^*, \lambda_i^*), (\eta_i^*, \xi_i^*), (\Theta_i, \theta_i))$,
where

$$\Lambda_i^* = L_i, \quad \lambda_i^* = \ell_i + g_i, \quad \eta_i^* = \sqrt{2(L_i - G_i)}\cos g_i, \quad \xi_i^* = -\sqrt{2(L_i - G_i)}\sin g_i \qquad [23]$$

are symplectic and analytic near circular, non-coplanar motions;
for a detailed discussion of these and other sets of interesting
classical variables, see, for example, Biasco et al. (2003) and
references therein; the asterisk is introduced to avoid confusion
with a closely related but different set of Poincaré variables
(see below). Let us denote by

$$H_{\rm 3bp} := H^{(0)}(\Lambda^*) + \varepsilon H^{(1)}(\Lambda^*, \lambda^*, \eta^*, \xi^*, \Theta, \theta)$$

the Hamiltonian of eqn [8] (with $n = 2$) expressed in terms of the
symplectic variables
$((\Lambda^*, \lambda^*), (\eta^*, \xi^*), (\Theta, \theta))$,
$\Lambda^* = (\Lambda_1^*, \Lambda_2^*)$, etc. Recalling the
physical meaning of the Delaunay variables, one realizes that
$\Theta_1 + \Theta_2$ is the vertical component,
$C_3 = C\cdot k_3$, of the total angular momentum
$C = C^{(1)} + C^{(2)}$, where $C^{(i)}$ denotes the angular
momentum of the $i$th planet with respect to the origin of an
inertial heliocentric frame $\{k_1, k_2, k_3\}$. This suggests
introducing the symplectic variables

$$(\Lambda^*, \lambda^*, \eta^*, \xi^*, \Psi, \psi) = \phi(\Lambda^*, \lambda^*, \eta^*, \xi^*, \Theta, \theta)$$

with
$(\Psi_1, \Psi_2, \psi_1, \psi_2) := (\Theta_1, \Theta_1 + \Theta_2, \theta_1 - \theta_2, \theta_2)$.
Let

$$\mathcal{H}_{\rm 3bp} := H_{\rm 3bp}\circ\phi^{-1}$$


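denote the Hamiltonian of the spatial three-body problem in these
symplectic variables. Since the Poisson bracket of
$\Psi_2 = \Theta_1 + \Theta_2$ and $\mathcal{H}_{\rm 3bp}$ vanishes
($C_3$ being an integral for the $\mathcal{H}_{\rm 3bp}$-flow), the
conjugate angle $\psi_2$ is cyclic for $\mathcal{H}_{\rm 3bp}$,
that is,

$$\mathcal{H}_{\rm 3bp} = \mathcal{H}_{\rm 3bp}(\Lambda^*, \lambda^*, \eta^*, \xi^*, \Psi_1, \Psi_2, \psi_1)$$

Now (because the total angular momentum $C$ is preserved), one may
restrict attention to the ten-dimensional invariant (and
symplectic) submanifold $\mathcal{M}_{\rm ver}$, defined by fixing
the total angular momentum to be vertical. Such a submanifold is
easily described in terms of Delaunay variables; in fact,
$C\cdot k_1 = 0 = C\cdot k_2$ is equivalent to

$$\theta_1 - \theta_2 = \pi \quad\text{and}\quad G_1^2 - \Theta_1^2 = G_2^2 - \Theta_2^2 \qquad [24]$$

Thus, $\mathcal{M}^*_{\rm ver} := \phi(\mathcal{M}_{\rm ver})$ is
given by

$$\mathcal{M}^*_{\rm ver} = \{\psi_1 = \pi,\ \Psi_1 = \Psi_1(\Lambda^*, \eta^*, \xi^*; \Psi_2)\}$$

with

$$\Psi_1(\Lambda^*, \eta^*, \xi^*; \Psi_2) := \frac{\Psi_2}{2} + \frac{(\Lambda_1^* - H_1^*)^2 - (\Lambda_2^* - H_2^*)^2}{2\Psi_2}, \qquad H_i^* := \frac{(\eta_i^*)^2 + (\xi_i^*)^2}{2}$$

Since $\mathcal{M}^*_{\rm ver}$ is invariant for the flow $\phi^t$
of $\mathcal{H}_{\rm 3bp}$, one has $\Psi_2(t) \equiv c$ and
$\dot\psi_1 \equiv 0$ for motions starting on
$\mathcal{M}^*_{\rm ver}$, which implies that
$(\partial_{\Psi_1}\mathcal{H}_{\rm 3bp})|_{\mathcal{M}^*_{\rm ver}} = 0$.
This fact allows one to introduce, for fixed values of the
vertical angular momentum $\Psi_2 = c \neq 0$, the following
reduced Hamiltonian

$$\mathcal{H}^c_{\rm red}(\Lambda^*, \lambda^*, \eta^*, \xi^*) := \mathcal{H}_{\rm 3bp}\big(\Lambda^*, \lambda^*, \eta^*, \xi^*, \Psi_1(\Lambda^*, \eta^*, \xi^*; c), c, \pi\big)$$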


on the eight-dimensional phase space
$\mathcal{M}_{\rm red} := \{\Lambda_i^* > 0,\ \lambda^* \in \mathbb{T}^2,\ (\eta^*, \xi^*) \in B^4\}$
endowed with the standard symplectic form
$d\Lambda^* \wedge d\lambda^* + d\eta^* \wedge d\xi^*$ ($B^4$ being
a ball around the origin in $\mathbb{R}^4$). In fact, the
(standard) Hamilton's equations for $\mathcal{H}^c_{\rm red}$ are
immediately recognized to be a subsystem of the full (standard)
Hamilton's equations for $\mathcal{H}_{\rm 3bp}$, when the initial
data are restricted to $\mathcal{M}^*_{\rm ver}$ and the constant
value of $\Psi_2$ is chosen to be $c$. More precisely, if the
Hamiltonian flow of $\mathcal{H}^c_{\rm red}$ on
$\mathcal{M}_{\rm red}$ is denoted by $\phi_c^t$, then

$$\phi^t\big(z^*, \Psi_1(\Lambda^*, \eta^*, \xi^*; c), c, \pi, \psi_2\big) = \big(\phi_c^t(z^*), \Psi_1(t), c, \pi, \psi_2(t)\big) \qquad [25]$$

where we have used the shorthand notations:
$z^* = (\Lambda^*, \lambda^*, \eta^*, \xi^*) \in \mathcal{M}_{\rm red}$;
$\Psi_1(t) := \Psi_1 \circ \phi_c^t(z^*)$;
$\psi_2(t) := \psi_2 + \int_0^t \partial_{\Psi_2}\mathcal{H}_{\rm 3bp}(\phi_c^s(z^*), \Psi_1(s), c, \pi)\,ds$.
At this point, the scheme used for the planar case may be easily
adapted to the present situation. The nondegeneracy conditions
have been checked in Robutel (1995), where indications, based on a
computer program, have been given for the validity of the theorem
in a wider set of initial data.

Notice that the dimension of the reduced phase space of the
spatial case is 8, which is also the dimension of the phase space
of the planar case. Therefore, the Lagrangian tori obtained with
this procedure also have the same dimension as the tori obtained
in the planar case (i.e., four).


The general case Consider the general case follow- 
ing the strategy of M Herman as presented by Féjoz 
(2004), to which the reader is referred for complete 
proofs and further references. 

The symplectic variables used in Féjoz (2004), to cope with the
spatial planetary (1 + n)-body problem (Sun and $n$ planets), are
closely related to the variables defined in eqn [23]. For
$1 \le i \le n$, let
$((L_i, G_i, \Theta_i), (\ell_i, g_i, \theta_i))$ denote the
Delaunay variables associated with the two-body system formed by
the Sun and the $i$th planet. Then (as shown by Poincaré) the
variables $((\Lambda_i, \lambda_i), (\eta_i, \xi_i), (p_i, q_i))$,
where $\Lambda_i = L_i$, $\lambda_i = \ell_i + g_i + \theta_i$, and

$$\eta_i = \sqrt{2(L_i - G_i)}\cos(g_i + \theta_i), \quad \xi_i = -\sqrt{2(L_i - G_i)}\sin(g_i + \theta_i), \quad p_i = \sqrt{2(G_i - \Theta_i)}\cos\theta_i, \quad q_i = -\sqrt{2(G_i - \Theta_i)}\sin\theta_i \qquad [26]$$

are symplectic and analytic near circular, non-coplanar motions
(see, e.g., Biasco et al. (2003)). Let

$$\mathcal{H}_{\rm nbp} := H^{(0)}(\Lambda) + \varepsilon H^{(1)}(\Lambda, \lambda, \eta, \xi, p, q) \qquad [27]$$

denote the Hamiltonian (eqn [8]) expressed in terms of the
Poincaré symplectic variables
$((\Lambda, \lambda), (\eta, \xi), (p, q))$,
$\Lambda = (\Lambda_1, \ldots, \Lambda_n)$, etc.

As the number of the planets increases, the degeneracies become
stronger and stronger. Furthermore, a clean reduction, such as the
reduction of the nodes, is no longer available if $n > 2$. To
overcome these problems Herman proposed a new approach, which is
described below.

Instead of Kolmogorov's nondegeneracy assumption, which says that
the frequency map [13] $I \to \omega(I)$ is a local
diffeomorphism, one may consider weaker nondegeneracy conditions.
In particular, in Féjoz (2004), one considers nonplanar frequency
maps. A smooth curve $u \in A \to \omega(u) \in \mathbb{R}^d$,
where $A$ is an open nonempty interval, is called "nonplanar" at
$u_0 \in A$ if all the $u$-derivatives up to order $(d-1)$ at
$u_0$, $\omega(u_0), \omega'(u_0), \ldots, \omega^{(d-1)}(u_0)$,
are linearly independent in $\mathbb{R}^d$; a smooth map
$u \in A \subset \mathbb{R}^p \to \omega(u) \in \mathbb{R}^d$,
$p \le d$, is called nonplanar at $u_0 \in A$ if there exists a
smooth curve $\varphi : \hat A \to A$ such that
$\omega\circ\varphi$ is nonplanar at $t_0 \in \hat A$ with
$\varphi(t_0) = u_0$. A S Pyartli has proved (see, e.g., Féjoz
(2004)) that if the map
$u \in A \subset \mathbb{R}^p \to \omega(u) \in \mathbb{R}^d$ is
nonplanar at $u_0$, then there exists a neighborhood
$B \subset A$ of $u_0$ and a subset $C \subset B$ of full Lebesgue
measure (i.e., meas$(C)$ = meas$(B)$) such that $\omega(u)$ is
Diophantine for any $u \in C$. The nonplanarity condition is
weaker than Kolmogorov's nondegeneracy conditions; for example,
the map

$$\omega(I) := \partial_I\Big(\frac{I_1^4}{4} + I_1^2 I_2 + I_1 I_3 + I_4\Big) = (I_1^3 + 2 I_1 I_2 + I_3,\ I_1^2,\ I_1,\ 1)$$

violates both Kolmogorov's nondegeneracy and the isoenergetic
nondegeneracy conditions but is nonplanar at any point of the form
$(I_1, 0, 0, 0)$, since
$\omega(I_1, 0, 0, 0) = (I_1^3, I_1^2, I_1, 1)$ is a nonplanar
curve (at any point).
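The three claims about this example can be verified with exact rational arithmetic. The sketch below is illustrative: it assumes the generating function $I_1^4/4 + I_1^2 I_2 + I_1 I_3 + I_4$, enters its derivatives by hand, and uses a small hand-written determinant routine; all function names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result


def omega(i1, i2, i3, i4):
    """Frequency map: gradient of I1^4/4 + I1^2 I2 + I1 I3 + I4."""
    return [i1**3 + 2 * i1 * i2 + i3, i1**2, i1, 1]


def hessian(i1, i2, i3, i4):
    """Jacobian of omega (Kolmogorov's condition asks for det != 0)."""
    return [[3 * i1**2 + 2 * i2, 2 * i1, 1, 0],
            [2 * i1, 0, 0, 0],
            [1, 0, 0, 0],
            [0, 0, 0, 0]]


def bordered(i1, i2, i3, i4):
    """Hessian bordered by omega (isoenergetic condition)."""
    w, h = omega(i1, i2, i3, i4), hessian(i1, i2, i3, i4)
    return [row + [wi] for row, wi in zip(h, w)] + [w + [0]]


def curve_derivatives(i1):
    """omega(I1, 0, 0, 0) and its first three derivatives in I1, as rows."""
    return [[i1**3, i1**2, i1, 1],
            [3 * i1**2, 2 * i1, 1, 0],
            [6 * i1, 2, 0, 0],
            [6, 0, 0, 0]]


point = (Fraction(3, 2), 0, 0, 0)
print(det(hessian(*point)), det(bordered(*point)),
      det(curve_derivatives(point[0])))
```

The first two determinants vanish, while the determinant built from the curve and its derivatives does not depend on $I_1$ and is nonzero, which is exactly the nonplanarity property.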

As in the three-body case, the frequency map is that associated
with the averaged secular Hamiltonian

$$H_{\rm sec} := H^{(0)}(\Lambda) + \varepsilon \bar H^{(1)}, \qquad \bar H^{(1)}(\Lambda, \eta, \xi, p, q) := \int_{\mathbb{T}^n} H^{(1)}\,\frac{d\lambda}{(2\pi)^n} \qquad [28]$$

which has an elliptic equilibrium at $\eta = \xi = p = q = 0$ (as
above, $\Lambda$ is regarded as a parameter). It is a remarkable
well-known fact that the quadratic part of $\bar H^{(1)}$ does not
contain "mixed terms," namely,


«tO -u t €(Qoin 1): LI Qin: E + Qspt P |p 
tO. q:q- O4) [29] 


where the term of order zero in $(\eta, \xi, p, q)$ and the
symmetric matrices $Q_{\rm pln}$ and $Q_{\rm spt}$ depend upon
$\Lambda$, while $O_4$ denotes terms of order 4 in
$(\eta, \xi, p, q)$. The eigenvalues of the matrices $Q_{\rm pln}$
and $Q_{\rm spt}$ are the first Birkhoff invariants of
$\bar H^{(1)}$ (with respect to the symplectic variables
$(\eta, \xi, p, q)$). Let $\sigma_1, \ldots, \sigma_n$ and
$\varsigma_1, \ldots, \varsigma_n$ denote, respectively, the
eigenvalues of $Q_{\rm pln}$ and $Q_{\rm spt}$; then the frequency
map for the (1 + n)-body problem will be defined as (recall
eqn [18])


A — (å, e9) [30] 


with 


" (s 3) 
QJ i= —. ee — 
MPUUN T 


IE (o,s) := CEOS PE: -i IE AGF, oV e 


Herman pointed out, however, that the frequencies $\sigma$ and
$\varsigma$ satisfy two independent linear relations, namely (up
to renumbering the indices),

$$\varsigma_n = 0, \qquad \sum_{i=1}^n (\sigma_i + \varsigma_i) = 0 \qquad [32]$$

which clearly prevents the frequency map from being nonplanar; the
second relation in eqn [32] is usually called "Herman resonance"
(while the first relation is a well-known consequence of rotation
invariance). The degeneracy due to rotation invariance may be
easily taken care of by considering (as in the three-body case)
the $(6n - 2)$-dimensional invariant symplectic manifold
$\mathcal{M}_{\rm ver}$, defined by taking the total angular
momentum $C$ to be vertical, that is,
$C\cdot k_1 = 0 = C\cdot k_2$. But, when $n > 2$, Jacobi's
reduction of the nodes is no longer available and, to get rid of
the second degeneracy (Herman's resonance), the authors bring in a
nice trick, originally due (once more!) to Poincaré. In place of
considering $\mathcal{H}_{\rm nbp}$ restricted on
$\mathcal{M}_{\rm ver}$, Féjoz considers the modified Hamiltonian
Hobs = Map toC; Cyr C-kg=|C| [33] 


n 


where $\delta \in \mathbb{R}$ is an extra artificial parameter. By
an analyticity argument, it is then possible to prove that the
(rescaled) frequency map


(A,6) > (W,o4,. 2494) € R^" 


is nonplanar on an open dense set of full measure, and this is
enough to find a positive measure set of Lagrangian maximal
$(3n - 1)$-dimensional invariant tori for
$\mathcal{H}^\delta_{\rm nbp}$; but, since
$\mathcal{H}^\delta_{\rm nbp}$ and $\mathcal{H}_{\rm nbp}$ commute,
a classical Lagrangian intersection argument allows one to
conclude that such tori are invariant also for
$\mathcal{H}_{\rm nbp}$, yielding the complete proof of Arnol'd's
theorem in the general case. Notice that this argument yields
$(3n - 1)$-dimensional tori, which in the three-body case means
five dimensional. Instead, the tori found in the section "The
spatial three-body problem" are four dimensional. The point is
that in the reduced phase space, the motion of the nodeline
(denoted as $\psi_2(t)$ in eqn [25]) does not appear.

We conclude this discussion by mentioning that the KAM theory used
in Féjoz (2004) is a modern and elegant function-theoretic
reformulation of the classical theory and is based on a
$C^\infty$ local inversion theorem (F Sergeraert and R Hamilton)
on "tame" Fréchet spaces (which, in turn, is related to the
Nash-Moser implicit function theorem; see Bost (1984-85)).


Lower Dimensional Tori 


The maximal tori for the many-body problems 
described above are found near the elliptic equilibria 
given by the decoupled Keplerian motions. It is 
natural to ask what happens to such elliptic
equilibria when the interaction among planets is 
taken into account. Even though no complete 
answer has yet been given to such a question, it 


appears that, in general, the Keplerian elliptic 
equilibria “bifurcate” into elliptic n-dimensional 
tori. This section presents a short and nontechnical 
account of the existing results on the matter (the 
general theory of lower-dimensional tori is, mainly, 
due to J K Moser and S M Graff for the hyperbolic 
case and V K Melnikov, H Eliasson, and S B Kuksin 
for the technically more difficult elliptic case; for 
references, see, e.g., Chierchia et al. (2004)). 

The normal form of a Hamiltonian admitting an $n$-dimensional
elliptic invariant torus $\mathcal{T}$ of energy $E$, proper
frequencies $\omega \in \mathbb{R}^n$, and "normal frequencies"
$\Omega \in \mathbb{R}^p$ in a $2d$-dimensional phase space with
$d = n + p$ is given by

$$N := E + \omega\cdot y + \frac{1}{2}\sum_{j=1}^p \Omega_j(\eta_j^2 + \xi_j^2) \qquad [34]$$

Here the symplectic form is given by
$dy \wedge dx + d\eta \wedge d\xi$, $y \in \mathbb{R}^n$,
$x \in \mathbb{T}^n$, $(\eta, \xi) \in \mathbb{R}^{2p}$;
$\mathcal{T}$ is then given by
$\mathcal{T} := \{y = 0\}\times\{\eta = \xi = 0\}$. Under suitable
assumptions, a set of such tori persists under the effect of a
small enough perturbation $P(y, x, \eta, \xi)$. Clearly, the union
of the persistent tori (if $n < d$) forms a set of zero measure in
phase space; however, in general, $n$-parameter families persist.

In the many-body case considered in this article, the proper
frequencies are the Keplerian frequencies given by the map
$\Lambda \to \omega(\Lambda)$ (eqn [31]), which is a local
diffeomorphism of $\mathbb{R}^n$. The normal frequencies $\Omega$,
instead, are proportional to $\varepsilon$ and are the first
Birkhoff invariants around the elliptic equilibria as discussed
above. Under these circumstances, the main nondegeneracy
hypothesis needed to establish the persistence of the Keplerian
$n$-dimensional elliptic tori boils down to the so-called Melnikov
condition:


GOEG- WiFi [35] 


Such a condition has been checked for the planar three-body case
in Féjoz (2002), for the spatial three-body case in Biasco et al.
(2003), and for the planar $n$-body case in Biasco et al. (2004).
The general spatial case is still open: in fact, while it is
possible to establish lower-dimensional elliptic tori for the
modified Hamiltonian $\mathcal{H}^\delta_{\rm nbp}$ in [33], it is
not clear how to conclude the existence of elliptic tori for the
actual Hamiltonian $\mathcal{H}_{\rm nbp}$, since the argument used
above works only for Lagrangian (maximal) tori; on the other hand,
the direct asymptotic techniques used in Biasco et al. (2003) do
not extend easily to the general spatial case.

Clearly, the lower-dimensional tori described in 
this section are not the only ones that arise in 
n-body dynamics. For more lower-dimensional tori 
in the planar three-body case, see Féjoz (2002). 


Physical Applications 


The above results show that, in principle, there may 
exist "stable planetary systems" exhibiting quasiper- 
iodic motions around coplanar, circular Keplerian 
trajectories — in the Newtonian many-body approx- 
imation — provided the masses of the planets are 
much smaller than the mass of the central star. 

A quite different question is: in the Newtonian many-body
approximation, is the solar system or, more generally, a solar
subsystem stable?

Clearly, even a precise mathematical reformula- 
tion of such a question might be difficult. However, 
it might be desirable to develop a mathematical 
theory for important physical models, taking into 
account observed parameter values. 

As a very preliminary step in this direction, consider 
one of the results of Celletti and Chierchia (see Celletti 
and Chierchia (2003), and references therein). 

In Celletti and Chierchia (2003), the (isolated) subsystem formed
by the Sun, Jupiter, and asteroid Victoria (one of the main
objects in the asteroid belt) is considered. Such a system is
modeled by an order-10 Fourier truncation of the RCP3BP, whose
Hamiltonian has been described in the section "Kolmogorov's
theorem and the RCP3BP (1954)." The Sun-Jupiter motion is
therefore approximated by a circular one, the asteroid Victoria is
considered massless, and the motions of the three bodies are
assumed to be coplanar; the remaining orbital parameters
(Jupiter/Sun mass ratio, which is approximately 1/1000;
eccentricity and semimajor axis of the osculating Sun-Victoria
ellipse; and "energy" of the system) are taken to be the actually
observed values. For such a system, it is proved that there exists
an invariant region, on the observed fixed energy level, bounded
by two maximal two-dimensional Kolmogorov tori, trapping the
observed orbital parameters of the osculating Sun-Victoria
ellipse.

As mentioned above, the proof of this result is 
computer assisted: a long series of algebraic compu- 
tations and estimates is performed on computers, 
keeping a rigorous track of the numerical errors 
introduced by the machines. 
Introduction


In most physical cases, the evolution of a system of $N$
indistinguishable interacting particles
$X_N = (x_1, x_2, \ldots, x_N)$ with velocities
$V_N = (v_1, v_2, \ldots, v_N)$ is described by a Hamiltonian
system

$$\frac{dX_N}{dt} = \frac{\partial H(X_N, V_N)}{\partial V_N}, \qquad \frac{dV_N}{dt} = -\frac{\partial H(X_N, V_N)}{\partial X_N}$$

in the phase space $\mathbb{R}^{dN}\times\mathbb{R}^{dN}$. When
$N$ becomes large, it is natural to consider replacing the above
discrete phase space by a continuous phase space
$\mathbb{R}^d\times\mathbb{R}^d$, with $1 \le d \le 3$, and to
introduce a measure $f(x, v, t)$ that describes the density of
particles which, at the point $x \in \mathbb{R}^d$ and at time
$t$, have velocity $v$. This measure may also be interpreted as a
generalization of the empirical measure

$$\mu_N(t) = \frac{1}{N}\sum_{1\le i\le N}\delta_{x_i(t)}\otimes\delta_{v_i(t)}$$


defined in the phase space $\mathbb{R}^d\times\mathbb{R}^d$ by the
above system of $N$ particles. In this way, one constructs a link
between the microscopic and the macroscopic descriptions. The
macroscopic physical quantities are, for instance, the first
moments of this density:

$$\rho(x, t) = \int f(x, v, t)\,dv \quad \text{(density)}$$

$$\rho(x, t)\,u(x, t) = \int v\,f(x, v, t)\,dv \quad \text{(momentum)}$$

$$\rho(x, t)\,E(x, t) = \int \frac{|v|^2}{2}\,f(x, v, t)\,dv \quad \text{(energy)}$$

Kinetic theory studies the intermediate stage shown 
in Figure 1. 

Its first successes were related to classical thermo- 
dynamics and in particular to the molecular hypoth- 
esis. The contributions of Maxwell (1860, 1872) 
and of Boltzmann (1867) led to the “Boltzmann” 


Hamiltonian systems → Kinetic equations → Macroscopic equations


Figure 1 Illustration of the role of kinetic equations in linking 
microscopic and macroscopic properties. 


equation, described in the companion article of 
Mario Pulvirenti (see Boltzmann Equation (Classical 
and Quantum)). In 1905, Lorentz used the same 
point of view to describe the motion of electrons in a 
metal. However, the different physical context leads 
to some basic differences between the Boltzmann 
equation and the Lorentz equation. The Boltzmann 
equation is derived under the assumption that the 
driving forces result from collisions between pairs of 
molecules. Therefore, the problem is nonlinear with 
a quadratic nonlinearity. In the Lorentz model the 
driving force is the interaction of the electrons with 
the atoms of the metal, which remain fixed. 
Collisions between electrons are ignored, so that 
the Lorentz equation is linear. 

The most general form of a kinetic equation is as follows:

$$\partial_t f(x, v, t) + \nabla_v H_f\cdot\nabla_x f(x, v, t) - \nabla_x H_f\cdot\nabla_v f(x, v, t) = C(f) \qquad [2]$$

The term $C(f)$ represents the effect of interactions either
between particles or with the background. Without this term,
eqn [2] reduces to the classical Liouville equation

$$\partial_t f(x, v, t) + \nabla_v H_f\cdot\nabla_x f(x, v, t) - \nabla_x H_f\cdot\nabla_v f(x, v, t) = 0 \qquad [3]$$

which says that the function $f$ is transported by the flow of the
Hamiltonian $H_f(x, v)$. This Hamiltonian depends on the model and
may involve the unknown function $f$ itself. In the simplest case
$H(x, v) = |v|^2/2$, eqn [3] and its solutions are given by

$$\partial_t f(x, v, t) + v\cdot\nabla_x f(x, v, t) = 0, \qquad f(x, v, t) = f(x - vt, v, 0) \qquad [4]$$

Nowadays kinetic equations appear in a variety of 
sciences and applications, such as astrophysics, 
aerospace engineering, nuclear engineering, particle- 
fluid interactions, semiconductor technology, social 
sciences, and biology, for example in chemotaxis 
and immunology. 

They are used first to model phenomena and then 
to obtain a qualitative and quantitative description 
of situations involving sufficiently many particles so 
as to prohibit any computation at the level of 
particles, and yet the medium is still too rarefied to 
allow the use of macroscopic equations. As detailed 
in the next section, a macroscopic description 
requires that the function f(x,v,t) be close to local 
thermodynamical equilibrium. For classical and 
quantum Boltzmann equations (see Boltzmann 


Equation (Classical and Quantum)) these equilibria 
are either Maxwellian, Bose-Einstein, or Fermi- 
Dirac distributions. 

Several effects, especially the influence of the 
boundary, may prevent the system from reaching 
local thermodynamical equilibrium and, therefore, 
even in macroscopic descriptions, kinetic equations 
may still be used to take into account the effect of 
the boundary. In this case, the term “Knudsen 
boundary layer" is currently used. 

Finally, one should keep in mind that there exist 
some macroscopic phenomena which cannot be 
deduced from the corresponding microscopic phys- 
ics by the mediation of a kinetic equation. Once 
again, returning to the companion article (see 
Boltzmann Equation (Classical and Quantum)) one 
observes that, since the only equilibria are Maxwel- 
lian, the macroscopic equations are those describing 
perfect gases. A real gas with a nontrivial van der 
Waals law is *too dense" to be explained by this 
theory. The alternative seems to go directly from the 
microscopic direction to the macroscopic descrip- 
tion. This is a subject which is still under investiga- 
tion and for which the reader may consult Olla et al. 
(1995). 


Kinetic Equations Entropy 
and Irreversibility 


At the level of particles, the basic laws of physics are 
reversible. Yet these same laws are not reversible 
when seen at the level of a macroscopic description. 
This lack of reversibility is measured by the decay of 
entropy (mathematicians prefer convex functions; 
therefore, the mathematical entropy considered in 
this contribution is the negative of the physical 
entropy, and with irreversibility it decays). The 
kinetic equations lie in between, as shown in
Figure 1; the decay of entropy should appear along 
one of the two arrows of this diagram. 

Since the appearance of irreversibility is related to 
loss of information and averaging, it should be 
driven by a “mixing” process. 

In general two mechanisms are responsible for 
such effects: 


1. an ergodic or a relaxation mechanism by which a
process averages itself; and 

2. the introduction of some external random param- 
eter. Observable quantities are then defined as 
averages over that parameter. 


It seems important to compare these two “pro- 
cesses.” This will be illustrated below with the most 
classical examples of the theory. 
The Diffusion Limit for the Neutron Transport
Equation 


Equations very similar to the one introduced by Lorentz are used
to describe the interaction of neutrons with atoms in a nuclear
reactor: this is the reason why these types of equations are often
called neutron transport equations. An important issue is the
derivation of a macroscopic diffusion equation. Assuming that
neutrons are not subject to acceleration effects, considering the
problem with constant modulus of velocity ($|v| = 1$), and
introducing a "small" parameter $\varepsilon$ which here
corresponds to the absorption of the medium, one can study the
following simplified model:

$$\varepsilon\,\partial_t f_\varepsilon + v\cdot\nabla_x f_\varepsilon + \frac{\sigma(x)}{\varepsilon}\Big(f_\varepsilon - \int_{|v'|=1} k(v, v')\,f_\varepsilon(v')\,dv'\Big) = 0 \qquad [5]$$

In [5] one assumes, for the kernel $k(v, v')$, the following
properties:

$$k(v, v') = k(v', v), \quad 0 < k(v, v') \quad \forall v, v', \qquad \int_{|v'|=1} k(v, v')\,dv' = 1 \qquad [6]$$

and denotes by $K$ the operator

$$f \mapsto Kf = \int_{|v'|=1} k(v, v')\,f(v')\,dv'$$

In the simplest case (say, without boundary) eqn [5] is well-posed
both for positive and negative time, but hypothesis [6] has the
following important consequences:

1. For positive time, it defines, for each $\varepsilon > 0$, a
contraction semigroup in any $L^p$ space and, therefore, the
sequence of solutions $f_\varepsilon$ or a subsequence thereof
converges, say weakly, to a limit $f(x, v, t)$.

2. One also observes that $v \mapsto 1$ is (up to a multiplicative
constant) the only solution of the equation

$$f - Kf = f(v) - \int_{|v'|=1} k(v, v')\,f(v')\,dv' = 0 \qquad [7]$$

Therefore, the $\varepsilon^{-1}$ in front of the collision term
forces the limit $f(x, v, t)$ to be independent of $v$. In this
simple problem, this is the thermodynamical equilibrium.


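Dividing by $\epsilon$ and integrating over $|v| = 1$ gives the
relation

\[ \partial_t \int_{|v|=1} f_\epsilon(x,v,t)\,dv + \frac{1}{\epsilon}\,\nabla_x \cdot \int_{|v|=1} v\, f_\epsilon(x,v,t)\,dv = 0 \qquad [8] \]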

Now the Fredholm alternative implies the
existence and uniqueness of a function $v \mapsto \beta(v)$ such
that

\[ \beta(v) - \int_{|v'|=1} k(v,v')\,\beta(v')\,dv' = v, \qquad \int_{|v'|=1} \beta(v')\,dv' = 0 \qquad [9] \]


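Multiply eqn [5] by $\beta(v)$ and integrate over $|v| = 1$ to
obtain

\[ \lim_{\epsilon\to 0} \frac{\sigma(x)}{\epsilon} \int_{|v|=1} v\, f_\epsilon(x,v,t)\,dv
 = \lim_{\epsilon\to 0} \frac{\sigma(x)}{\epsilon} \int_{|v|=1} \beta(v)\,(I - K) f_\epsilon(x,v,t)\,dv
 = -\lim_{\epsilon\to 0} \nabla_x \cdot \int_{|v|=1} \big(\beta(v)\otimes v\big)\, f_\epsilon(x,v,t)\,dv \qquad [10] \]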


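Since the operator $(I - K)$ is self-adjoint and non-negative,
with 0 as the leading eigenvalue, the
matrix

\[ D = \int_{|v|=1} \beta(v)\otimes v\,dv = \int_{|v|=1} \beta(v)\otimes (I - K)\beta(v)\,dv \]

is positive definite, and one finally obtains the
diffusion equation

\[ \partial_t f - \nabla_x \cdot \Big( \frac{1}{\sigma(x)}\, D\,\nabla_x f \Big) = 0 \qquad [11] \]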
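
The construction of the diffusion matrix can be checked numerically in a simple setting. The sketch below is only an illustration under stated assumptions: two space dimensions (so $|v| = 1$ is the unit circle), the measure $dv$ normalized to total mass 1, and a kernel of the form $k(v,v') = g(\theta - \theta')$ with $g$ even and positive. The function name `diffusion_matrix` and the two sample kernels are hypothetical choices, not taken from the text. It discretizes $I - K$, solves $(I - K)\beta = v$ with $\int \beta\,dv = 0$, and evaluates $D = \int \beta \otimes v\,dv$.

```python
import numpy as np


def diffusion_matrix(g, m=256):
    """Approximate D = ∫ beta(v) ⊗ v dv on the unit circle.

    g(phi) is an even, positive function of the angle between v and v';
    the kernel is g(theta - theta') rescaled so that its integral against
    the normalized measure dv' equals 1 (assumption of this sketch).
    """
    theta = 2.0 * np.pi * np.arange(m) / m                 # uniform angles
    w = 1.0 / m                                            # weight of the normalized measure
    v = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # shape (m, 2)

    k = g(theta[:, None] - theta[None, :])                 # symmetric, circulant matrix
    k = k / (w * k[0].sum())                               # enforce ∫ k(v, v') dv' = 1

    a = np.eye(m) - w * k                                  # discrete I - K (kernel: constants)
    p = w * np.ones((m, m))                                # projector onto constants
    # Solving with a + p selects the solution of (I - K) beta = v
    # that has zero mean, because v itself has zero mean.
    beta = np.linalg.solve(a + p, v)

    return w * beta.T @ v                                  # D_ij = ∫ beta_i v_j dv


d_iso = diffusion_matrix(lambda phi: np.ones_like(phi))
d_aniso = diffusion_matrix(lambda phi: 1.0 + 0.5 * np.cos(phi))
```

For the isotropic kernel, $K$ is the average over the circle, $\beta(v) = v$, and $D$ is half the identity. For the second kernel, $K$ acts on $v$ as multiplication by $1/4$, so $\beta = 4v/3$ and $D$ is two thirds of the identity. In both cases $D$ is symmetric and positive definite, as stated.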


The above derivation is an example of what is called
the “moments method.” It is implicit even in the
papers of Maxwell. It has been systematically used
in several domains:

• To understand the relation between the Boltzmann
equation and the Euler and Navier-Stokes
equations (Golse 2005);

• To compute the critical size of a nuclear assembly.
One shows that this size is well approximated by
the size of the domain for which the Laplacian,
with appropriate boundary conditions, has leading
eigenvalue 0. It is for the spectral analysis of
this problem that the averaging lemma (see the
section “Some specific mathematical tools”) was
derived.

• To analyze the macroscopic limit for the solution
of the radiative transfer equations, which describe
the propagation of the intensity of photons in a
large class of phenomena ranging from stellar
atmospheres to the cooling of glass, including
optical tomography in biomedical imaging. In a
simplified form, the so-called “grey model,” these
equations can be reduced to

\[ \epsilon\,\partial_t I_\epsilon(x,v,t) + v\cdot\nabla_x I_\epsilon(x,v,t) + \frac{1}{\epsilon}\,\sigma\Big( \frac{1}{4\pi}\int_{|v'|=1} I_\epsilon(x,v',t)\,dv' \Big) \Big( I_\epsilon(x,v,t) - \frac{1}{4\pi}\int_{|v'|=1} I_\epsilon(x,v',t)\,dv' \Big) = 0 \qquad [12] \]

In contrast to the previous example, the problem
is, in many cases, nonlinear. The opacity $\sigma$ is a
positive function that depends on the intensity $I_\epsilon$
through

\[ \bar I_\epsilon(x,t) = \frac{1}{4\pi}\int_{|v'|=1} I_\epsilon(x,v',t)\,dv' \]

and which goes to $\infty$ with $\bar I_\epsilon$ going to zero. The
moments method can be applied with the averaging
lemma, and one shows that the limit of $I_\epsilon$ is
a function $I$ that is independent of $v$ and satisfies
the following degenerate parabolic equation:

\[ \partial_t I - \nabla_x \cdot \Big( \frac{1}{3\,\sigma(I)}\,\nabla_x I \Big) = 0 \qquad [13] \]

This equation is similar to the one obtained in the 
description of porous media and contains the 
following information: for initial data I(x, 0) with 
compact support, in contrast to the behavior of 
solutions of the standard diffusion equation, the 
solution I(x,t) remains compactly supported in x. 
The boundary of this support is the thermal front 
and for a finite time, up to saturation (by water in 
porous media, by reacted deuterium in laser- 
confined fusion), this front remains fixed. 


What made the analysis of the above macroscopic
limit simple was the existence of an $\epsilon$-dependent
process ($\epsilon > 0$) which, for vanishing $\epsilon$, forces the solution to
converge to a “thermodynamical” equilibrium. The
irreversibility was already present in the first arrow 
of Figure 1. This is what made the analysis of the 
second arrow simple. The subtleties of the appear- 
ance of the irreversibility in the first arrow may be 
well explained by the next examples. 


The Linear Billiard Model 


In the absence of an external electric field, the model 
proposed by Lorentz could be viewed as a limit of a 
system of particles evolving freely between spherical 
obstacles and reflecting on these obstacles according 
to the law of geometric optics. Along these lines, 
two types of results have been proved in two space 
variables. 


In 1973, Gallavotti considered the case where the 
obstacles are randomly spaced under a Poisson 
configuration and proved the following theorem: 


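Theorem 1 Consider obstacles (balls) of radius $\epsilon$
and center $c_i$. Assume that the probability of finding
exactly $N$ such obstacles in a bounded measurable
set $\Lambda \subset \mathbb{R}^2$ is given by the “Poisson law”

\[ P(dc_N) = e^{-\mu_\epsilon |\Lambda|}\, \frac{\mu_\epsilon^N}{N!}\, dc_1\, dc_2 \cdots dc_N \qquad [14] \]

with

\[ c_N = c_1, c_2, \dots, c_N \quad\text{and}\quad \mu_\epsilon = \frac{\mu}{\epsilon} \qquad [15] \]

Denote by $E^\epsilon$ the expectation with respect to the
above Poisson distribution. For given $\epsilon$ and $c_N$
introduce

\[ \Omega_{\epsilon,c_N} = \mathbb{R}^2 \setminus \bigcup_{1\le i\le N} \{ |x - c_i| \le \epsilon \} \qquad [16] \]

and $f_{\epsilon,c_N}$, the solution of the problem

\[ \partial_t f_{\epsilon,c_N}(x,v,t) + v\cdot\nabla_x f_{\epsilon,c_N}(x,v,t) = 0 \quad\text{in } \Omega_{\epsilon,c_N}\times S^1 \qquad [17] \]

with specular reflection on the boundary and
$v$-independent initial data:

\[ f_{\epsilon,c_N}(x,v,0) = \phi(x) \quad\text{in } \Omega_{\epsilon,c_N}\times S^1 \qquad [18] \]

Then

\[ h_\epsilon(x,v,t) = E^\epsilon[f_{\epsilon,c_N}] \qquad [19] \]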


converges weakly for t > 0 to the solution of the 
transport equation 


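\[ \partial_t f(x,v,t) + v\cdot\nabla_x f(x,v,t) + \mu\Big( 2 f(x,v,t) - \frac12 \int_0^{2\pi} f(x, R_\theta v, t)\,\sin\frac{\theta}{2}\,d\theta \Big) = 0 \qquad [20] \]

\[ f(x,v,0) = \phi(x) \quad\text{in } \mathbb{R}^2\times S^1 \qquad [21] \]

Here $R_\theta$ denotes the rotation by the angle $\theta$.
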
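
The scaling $\mu_\epsilon = \mu/\epsilon$ in [15] can be motivated by a short side computation, given here as a heuristic sketch rather than as part of the theorem. It uses the standard fact that a Poisson field of intensity $\mu_\epsilon$ has no point in a set $A$ with probability $e^{-\mu_\epsilon |A|}$, and it neglects the end caps of the tube swept by the particle. The symbol $\lambda$ for the mean free path is introduced only for this illustration.

```latex
% A particle travelling a distance t in a straight line avoids all obstacles
% of radius \epsilon iff no centre lies in a tube of width 2\epsilon and
% length t (end caps of area O(\epsilon^2) neglected):
\[
  \mathbb{P}(\text{no collision up to distance } t)
  \approx e^{-\mu_\epsilon \cdot 2\epsilon t}
  = e^{-2\mu t},
  \qquad
  \lambda = \int_0^\infty e^{-2\mu t}\,dt = \frac{1}{2\mu}.
\]
```

The mean free path is therefore independent of $\epsilon$, which is the "finite mean free path" scaling needed for a collisional kinetic equation to appear in the limit.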
The situation is completely different when the 
obstacles are periodically spaced, a situation which 
seems closer to Lorentz's original idea. Golse (2003) 
(and previous contributions quoted in this article) 
obtained the following result: 


Theorem 2 Assume that the obstacles are periodically
spaced and conveniently scaled, defining the
domain

\[ \Omega_\epsilon = \mathbb{R}^2 \setminus \bigcup_{q} \{ |x - q| \le \epsilon \} \qquad [22] \]

(the union being taken over the lattice points $q$).
Then there exists a family of continuous uniformly
bounded initial data such that no subsequence
extracted from the family of solutions of

\[ \partial_t f_\epsilon + v\cdot\nabla_x f_\epsilon = 0 \quad\text{in } \Omega_\epsilon\times S^1 \qquad [23] \]

with specular reflections on the boundary, converges
to solutions of equations of the type [20].


This pathology is related to the existence of
particles that can travel freely for a very long time
before meeting the obstacles, and the proof, which involves
some arithmetic (Diophantine approximations and
continued fractions), relies on the analysis of such
trajectories.

A comparison between Theorems 1 and 2
shows that the ergodic property of the free flow on 
the periodic lattice is not strong enough to lead to a 
collisional kinetic equation unless some complemen- 
tary randomness is introduced. 

The examples of this section should be compared 
with the rigorous derivation of the Boltzmann 
equation by Lanford (see Boltzmann Equation 
(Classical and Quantum)). The reader should 
observe that this derivation corresponds to the 
same type of scaling (finite mean free path). 
However, no extra randomness is needed in this 
case. The proof uses the fact that configurations 
leading only to a finite number of binary collisions 
are of full measure. This corresponds to an 
ergodicity property which is enforced by the fact 
that the problem is genuinely nonlinear. 


Mean-Field Scaling and Vlasov Equations 


The neutron transport equation is devoted to the 
interaction with obstacles and the Boltzmann 
equation to binary collisions. A simpler situation 
from the mathematical point of view corresponds 
to the case where each particle is under the action 
of the average of all other particles. Then the name 
“mean field limit” is used. The simplest example is 
the derivation of a Vlasov-type equation from a 
system of N classical particles interacting with a C* 
potential V(|x|). The following Hamiltonian is 
used: 


\[ H_N(x_1,\dots,x_N,v_1,\dots,v_N) = \sum_{1\le k\le N} \frac{|v_k|^2}{2} + \frac{1}{2N} \sum_{1\le k\ne l\le N} V(|x_k - x_l|) \qquad [24] \]

and the name mean-field scaling is related to the
factor $N^{-1}$ before the potential. Assuming that the
particles are indistinguishable, one introduces
the joint probability density $F_N = F_N(x_1,\dots,x_N, v_1,\dots,v_N)$
in the $N$-particle phase space, which
satisfies the Liouville equation


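\[ \partial_t F_N + \{H_N, F_N\} = \partial_t F_N + \sum_{1\le k\le N} v_k\cdot\nabla_{x_k} F_N - \frac{1}{N} \sum_{1\le l\ne k\le N} \nabla_{x_k} V(|x_k - x_l|)\cdot\nabla_{v_k} F_N = 0 \qquad [25] \]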


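From [25], with the notations

\[ X_n = (x_1,\dots,x_n), \quad V_n = (v_1,\dots,v_n), \quad X^n = (x_{n+1},\dots,x_N), \quad V^n = (v_{n+1},\dots,v_N) \]

one deduces an infinite hierarchy of equations for
the marginals

\[ F_N^n(X_n,V_n,t) = \iint F_N(X_n, X^n, V_n, V^n, t)\,dX^n\,dV^n \quad\text{for } 1\le n\le N, \qquad F_N^n = 0 \quad\text{for } N < n: \]

\[ \partial_t F_N^n(X_n,V_n,t) + \sum_{1\le i\le n} v_i\cdot\nabla_{x_i} F_N^n(X_n,V_n,t)
 - \frac{1}{N} \sum_{1\le i\ne j\le n} \nabla_{v_i}\cdot\big( \nabla_{x_i} V(|x_i - x_j|)\, F_N^n(X_n,V_n,t) \big)
 - \frac{N-n}{N} \sum_{1\le i\le n} \nabla_{v_i}\cdot \iint \nabla_{x_i} V(|x_i - x_*|)\, F_N^{n+1}(X_n, x_*, V_n, v_*, t)\,dx_*\,dv_* = 0 \qquad [26] \]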

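Letting $N$ go to infinity, one obtains “formally,” for
the distribution functions

\[ F^n = \lim_{N\to\infty} F_N^n \]

the Vlasov hierarchy:

\[ \partial_t F^n(X_n,V_n,t) + \sum_{1\le i\le n} v_i\cdot\nabla_{x_i} F^n(X_n,V_n,t)
 - \sum_{1\le i\le n} \nabla_{v_i}\cdot \iint \nabla_{x_i} V(|x_i - x_*|)\, F^{n+1}(X_n, x_*, V_n, v_*, t)\,dx_*\,dv_* = 0 \qquad [27] \]

Observe that for any density $F(x,v,t)$ that satisfies

\[ \iint F(x,v,t)\,dx\,dv = 1, \qquad F(x,v,t) \ge 0 \qquad [28] \]

and is a solution of the $V$ potential Vlasov equation:

\[ \partial_t F(x,v,t) + v\cdot\nabla_x F(x,v,t) - \Big( \iint \nabla_x V(|x - x_*|)\, F(x_*,v_*,t)\,dx_*\,dv_* \Big)\cdot\nabla_v F(x,v,t) = 0 \qquad [29] \]

the factorization formula

\[ F^n(X_n,V_n,t) = \prod_{1\le i\le n} F(x_i,v_i,t) \qquad [30] \]

defines a solution of the above Vlasov hierarchy.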

A uniqueness argument implies that any solution 
of the Vlasov hierarchy which is factorized at time 
zero will remain factorized at any subsequent time. 
Such a property, also observed for the hierarchy 
leading to the Boltzmann equation, is called the 
propagation of chaos. To make the proof rigorous, 
one has to analyze the limiting process in the 
hierarchy and prove the uniqueness of the solution 
of the infinite hierarchy. For a smooth potential, this 
has been done by Braun and Hepp in 1977 and by 
Spohn in 1981. An interesting approach consists, 
following Dobrushin, in introducing the Wasserstein 
distance; see Golse (2003) for a detailed exposition. 
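
The mean-field scaling can be made concrete with a small particle simulation. The following is a minimal sketch under assumptions that are not in the text: one space dimension, the smooth potential $V(r) = e^{-r^2}$, and a velocity-Verlet time integrator. The function names `forces`, `energy` and `leapfrog` are hypothetical. Each particle feels the average of the pair forces, with the factor $1/N$ of the mean-field Hamiltonian.

```python
import numpy as np


def forces(x):
    """Mean-field force on each particle for V(r) = exp(-r**2) in one dimension.

    force_k = -(1/N) * sum over l != k of d/dx_k V(|x_k - x_l|).
    """
    n = x.size
    diff = x[:, None] - x[None, :]            # x_k - x_l
    pair = 2.0 * diff * np.exp(-diff**2)      # minus the x_k-derivative of V; zero on the diagonal
    return pair.sum(axis=1) / n


def energy(x, v):
    """H_N = sum |v_k|^2 / 2 + (1 / 2N) * sum over k != l of V(|x_k - x_l|)."""
    n = x.size
    diff = x[:, None] - x[None, :]
    pot = np.exp(-diff**2)
    np.fill_diagonal(pot, 0.0)                # exclude the terms k == l
    return 0.5 * np.sum(v**2) + pot.sum() / (2.0 * n)


def leapfrog(x, v, dt, steps):
    """Velocity-Verlet integration of dx/dt = v, dv/dt = force."""
    x, v = x.copy(), v.copy()
    f = forces(x)
    for _ in range(steps):
        v += 0.5 * dt * f
        x += dt * v
        f = forces(x)
        v += 0.5 * dt * f
    return x, v


rng = np.random.default_rng(0)
x0 = rng.normal(size=50)
v0 = rng.normal(size=50)
x1, v1 = leapfrog(x0, v0, dt=0.01, steps=100)
```

Because the pair forces are antisymmetric, the total momentum is conserved up to rounding, and the energy is conserved up to the time-discretization error. This reflects the reversible Hamiltonian structure of the $N$-particle system.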

In the case of the Vlasov-Poisson equation [29]
with $V(|x|) = 1/(4\pi|x|)$ the potential turns out to be
too singular for the above derivation. In particular,
the corresponding solution of the $N$-particle problem
is not uniformly defined. However, for the
corresponding equation (and for variants thereof, 
including the effect of the magnetic field, the 
Vlasov-Maxwell system) a series of mathematical 
results concerning existence and stability of solu- 
tions have been obtained. An excellent recent 
exposition of these results can be found in the 
book of Glassey (1996). 

Equation [29] as well as the original system turns 
out to be fully reversible. Neither irreversibility nor 
averaging has appeared in the limit process which 
corresponds to the first arrow of Figure 1; this is due 
to the “weak coupling.” Therefore, irreversibility 
should now appear on the second arrow. Integrating 
eqn [29] with respect to v gives the relation (often 
called Fick’s law): 


\[ \partial_t \rho(x,t) + \nabla_x \cdot \int v\,F(x,v,t)\,dv = 0 \qquad [31] \]


But now expressing the current $j = \int v\,F(x,v,t)\,dv$ in
terms of macroscopic variables turns out to be a 
difficult issue in the absence of a “relaxation” effect. 
Up to now there has been no derivation of such 
macroscopic equations from first principles. 

The same type of problem exists for the two-dimensional
Euler equation, which is in some sense
very similar to the Vlasov equation. It has been
observed that these equations develop for “turbulent
initial data” a kind of “mixing process” leading to
coherent structures that would play the role of
thermodynamical equilibrium (in the absence of
relaxation). The Jupiter red spot is the most
well-known example of such a structure. These
coherent structures are obtained by maximizing an
entropy which does not come directly from the
dynamics but which is inspired by similar problems
in statistical mechanics. Finally, one has to take into
account in this construction the existence of an
infinite set of conserved quantities: for each regular
function $G$, vanishing at infinity, one has

\[ \frac{d}{dt} \iint G(F(x,v,t))\,dx\,dv = 0 \]


This approach was already started by Onsager in 1945 
and pursued by many scientists. A recent reference is 
the article of Chavanis and Sommeria (1998). 


Derivation of Kinetic Equations from the
Schrödinger Equation


Oscillatory solutions of the Schrödinger equation,
with wavelength of the order of the Planck constant,
tend to behave like particles. This is described in
detail by different tools of high-frequency approximation.
In particular, the limit of the Wigner
transform of the density $\psi(x,t)\otimes\bar\psi(y,t)$:

\[ W_\hbar(x,\xi,t) = \frac{1}{(2\pi)^d} \int e^{-i y\cdot\xi}\, \psi\Big(x + \frac{\hbar y}{2}, t\Big)\, \bar\psi\Big(x - \frac{\hbar y}{2}, t\Big)\,dy \qquad [32] \]


is a solution of a Liouville equation. Therefore, one 
should expect that in the presence of “many” 
obstacles (“many potentials”) the limit should be 
given by a kinetic equation. As shown by the 
previous section the introduction of randomness 
seems compulsory in reaching this goal. 

Consider a big cube $\Lambda = \Lambda_L$ of size $L$ in $\mathbb{R}^3$. Let
$\omega = (x_\alpha)$, $\alpha = 1, 2, \dots, N$, denote the configuration of
random obstacles distributed uniformly in $\Lambda$. The
density of obstacles is $\rho = N/L^3$ and the expectation
with respect to this uniform measure is denoted by

\[ \mathbb{E} = \prod_{1\le\alpha\le N} \frac{1}{|\Lambda|} \int_\Lambda dx_\alpha \]

With $V(|x|)$ a smooth, short-range potential, the
random potential created by the obstacles is

\[ V_\omega(x) = \sum_{1\le\alpha\le N} V(|x - x_\alpha|) \]

One of the typical results obtained (the low-density limit,
which corresponds to the quantum version of
Gallavotti's classical result) reads as follows:


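Theorem 3 (Erdős and Yau 1998) Assume that the
density of obstacles is $\rho = \rho_0\,\epsilon$ with a fixed $\rho_0$.
Denote by $\psi^\epsilon_\omega(t)$ the solution of the Schrödinger
equation

\[ i\,\partial_t \psi^\epsilon_\omega = -\frac12 \Delta \psi^\epsilon_\omega + V_\omega\, \psi^\epsilon_\omega \qquad [33] \]

with initial condition localized and oscillating at the
scale $\epsilon$, that is, with $b$ and $S$ smooth

\[ \psi^\epsilon_\omega(0) = \epsilon^{3/2}\, b(\epsilon x) \exp\Big( \frac{i\,S(\epsilon x)}{\epsilon} \Big) \qquad [34] \]

Consider the density matrix $\rho^\epsilon_\omega(t,x,y) = \psi^\epsilon_\omega(t,x)\otimes\bar\psi^\epsilon_\omega(t,y)$
and its Wigner transform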


W^ (x, €, t) 


city ge T g [35] 
Joe dns um +) dy 


(2r) 


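Then for any $t > 0$, $\mathbb{E}W^\epsilon_\omega(t)$ converges weakly with $\epsilon$
going to zero to a solution $F(t)$ of the kinetic equation

\[ \partial_t F(t,x,\xi) + \xi\cdot\nabla_x F(t,x,\xi) = \int |T(\xi,\xi')|^2\, \delta(\xi^2 - \xi'^2)\, \big( F(t,x,\xi') - F(t,x,\xi) \big)\,d\xi' \qquad [36] \]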


where T is the amplitude of the scattering operator 
associated to the Schrödinger equation with the 
short range potential V. 


The proof uses several ingredients including
scattering theory with expansion in terms of Dyson
series; see Erdős and Yau (1998).


Semiconductor Modeling 


In modern computers, the electronic devices are so 
small that the electric current may have no space/time 
to reach a thermodynamical equilibrium. Therefore, 
this turns out to be a field where the kinetic equations 
are the most naturally used. Details of what can be 
deduced from a mathematical analysis can be found 
in Poupaud (1994). The equations involve the
distributions of electrons $f_e(x,k,t)$ and holes
$f_h(x, k, t)$ and have the following form:


cQ fe (t, x, k) + velk)Vxfelt, x, k) 


= 7VxU(t,x) z Vf (t, x, k) 


= HQ.) + Re fi) tx. E) (37 


EOnf, (t, x, k) + vy (R)V xf (t, x, k) 


- 5 VxU(t, x) : Vf (t, x, k) 


- LO. xs) + Ralfa fe) (E) [38] 

The variable $k$ ranges over a torus $B$ of $\mathbb{R}^3$ which, in
physics books, carries the name of Brillouin zone.
The velocities of propagation of electrons and holes
are determined in terms of the energy band by the
formula


] 
Veh =F V pce h(R) [39] 


The potential $U$ is determined in terms of the doping
profile $C(x)$, the conductivity $\epsilon$, and the density of
electrons and holes according to the formula


—A Ut, 2) — X. (ct) -m / f(t, x, k)dk 


b 
HE / ftx. dk) 40) 


Finally $Q_{e,h}$ and $R_{e,h}$ are binary integral operators in
the variable $k \in B$ which model collisions and
generation-recombination processes. Concerning
the “mathematical approach” the situation is as
follows.

The relations [39] can be deduced from the high-frequency
analysis of the solution of the Schrödinger
equation

\[ i\hbar\,\partial_t \psi = -\frac{\hbar^2}{2m}\,\Delta\psi + V(x)\,\psi \qquad [41] \]


with V a periodic potential constructed on the dual 
lattice of B. The method uses the Bloch decom- 
position of the solution and the Wigner series 
(Poupaud 1994). No mathematical derivation of 
the collision operator is currently available. The
situation should be compared to what is said in the
section “Derivation of kinetic equations from the
Schrödinger equation,” but in a much more
complicated setting.

On the other hand, the collision operators $Q_{e,h}$
and $R_{e,h}$, as given by phenomenological arguments,
have good enough relaxation properties to allow a
rigorous limit of the system [37]-[38] for $\epsilon$ going to
zero (Poupaud 1994). This leads to the justification
of the so-called drift-diffusion models and to the
possibility of constructing correctors (with respect to
$\epsilon$) and to treating the effect of heterojunctions by
boundary layer analysis.


Some Specific Mathematical Tools 


Few proofs were given in the above exposition and 
details would not be suitable for a review article. 
However, the mathematical approach to kinetic 
equations has generated some new tools, and it 
may be useful to give the most prominent ones. 


The Averaging Lemma 


Compactness results appear in spectral theory and in 
the construction of solutions of nonlinear equations 
(whenever strong convergence is needed for the 
limit). Being hyperbolic, the transport operator
$v\cdot\nabla_x$ propagates singularities along characteristics.
Therefore, at first sight it seems hopeless that one
might obtain any regularizing effect from the free
streaming part of a kinetic model. The key to
obtaining regularizing effects from the transport
operator $v\cdot\nabla_x$ is to seek those effects not on the
number density itself, but on velocity averages
thereof; in other words, on the macroscopic densities.

Here is the prototype of all velocity averaging 
results. 


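Theorem 4 Let $F_\epsilon$ be a bounded family in $L^2(\mathbb{R}^d \times \mathbb{R}^d)$.
Assume that the family $v\cdot\nabla_x F_\epsilon$ is also bounded
in $L^2(\mathbb{R}^d \times \mathbb{R}^d)$. Then, for each $\phi \in L^2(\mathbb{R}^d)$, the
family of moments $\rho_\epsilon(x)$ defined by

\[ \rho_\epsilon(x) = \int_{\mathbb{R}^d} F_\epsilon(x,v)\,\phi(v)\,dv \]

is relatively compact in $L^2(\mathbb{R}^d)$.


For the proof one starts with the expression
$G_\epsilon = F_\epsilon + v\cdot\nabla_x F_\epsilon$, takes the Fourier transform with
respect to $x$ of this relation and writes for $\hat\rho_\epsilon(\xi)$ the
expression

\[ \hat\rho_\epsilon(\xi) = \int_{\mathbb{R}^d} \frac{\hat G_\epsilon(\xi,v)\,\phi(v)}{1 + i\,v\cdot\xi}\,dv \]

Then use the Cauchy-Schwarz inequality to obtain

\[ |\hat\rho_\epsilon(\xi)|^2 \le \Big( \int_{\mathbb{R}^d} |\hat G_\epsilon(\xi,v)|^2\,dv \Big) \Big( \int_{\mathbb{R}^d} \frac{|\phi(v)|^2}{1 + |v\cdot\xi|^2}\,dv \Big) \qquad [43] \]

and complete the proof by standard arguments.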
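
One way to make the "standard arguments" concrete is sketched below, under a simplifying assumption that is not in the text: $\phi$ is bounded by $M$ and supported in the ball $|v| \le R$ (the general case then follows by density). The constants $M$, $R$, $C_\phi$ and the unit vector $\omega = \xi/|\xi|$ are introduced only for this illustration. The second factor in the Cauchy-Schwarz bound decays like $|\xi|^{-1}$, which controls the high frequencies of $\rho_\epsilon$ uniformly in $\epsilon$.

```latex
% Split v = s\,\omega + v^\perp with s = v\cdot\omega and v^\perp \perp \omega.
% |B^{d-1}_R| denotes the volume of the (d-1)-dimensional ball of radius R.
\[
  \int_{\mathbb{R}^d} \frac{|\phi(v)|^2}{1 + |v\cdot\xi|^2}\,dv
  \le M^2\, |B^{d-1}_R| \int_{\mathbb{R}} \frac{ds}{1 + |\xi|^2 s^2}
  = \frac{\pi M^2\, |B^{d-1}_R|}{|\xi|} .
\]
% Inserting this into the Cauchy-Schwarz bound and integrating in \xi:
\[
  \int_{\mathbb{R}^d} |\xi|\, |\hat\rho_\epsilon(\xi)|^2\,d\xi
  \le C_\phi \iint |\hat G_\epsilon(\xi,v)|^2\,dv\,d\xi
  = (2\pi)^d\, C_\phi\, \|G_\epsilon\|_{L^2(\mathbb{R}^d\times\mathbb{R}^d)}^2 ,
  \qquad C_\phi = \pi M^2 |B^{d-1}_R| .
\]
```

Under these assumptions the moments $\rho_\epsilon$ are bounded in $H^{1/2}(\mathbb{R}^d)$. This gain of half a derivative is what yields strong compactness of the velocity averages on bounded sets.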

The averaging lemma was first observed by 
Agoshkov (1984) for abstract results concerning 
the regularity of solutions of kinetic equations in 
domains with boundary. Independently, it was 
rediscovered in the improved form given above by 
Golse, Perthame, and Sentis (1985) and used for the 
spectral theory in the diffusion approximation. The
extensions to $L^p$, $p > 1$, spaces and to $L^1$ (with use of
an entropy estimate) were instrumental in proving the
validity of the Rosseland approximation for the
radiative transfer equations and for the proof of
existence by DiPerna and Lions of renormalized
solutions of the Boltzmann equation. A more refined
result needs to be used to establish the incompres- 
sible limit of the solutions of the Boltzmann 
equations; see Golse (2005) for details and a 
complete list of references. 


The Dispersive Property 


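Consider, for the solutions in $\mathbb{R}^d_x \times \mathbb{R}^d_v$ of the
elementary kinetic equation

\[ \partial_t f + v\cdot\nabla_x f = 0, \qquad f(x,v,0) = f^0(x,v) \qquad [44] \]

the local density

\[ \rho(x,t) = \int_{\mathbb{R}^d_v} f(x,v,t)\,dv \qquad [45] \]

From the relation

\[ \rho(x,t) = \int_{\mathbb{R}^d_v} f(x,v,t)\,dv = \int_{\mathbb{R}^d_v} f^0(x - vt, v)\,dv \le \int_{\mathbb{R}^d_v} \sup_{w\in\mathbb{R}^d} |f^0(x - vt, w)|\,dv \qquad [46] \]

one deduces with an elementary change of variable the
following estimate, which carries the name of
dispersion lemma,

\[ \|\rho(t)\|_{L^\infty(\mathbb{R}^d_x)} \le \frac{1}{|t|^d}\, \|f^0\|_{L^1(\mathbb{R}^d_x;\, L^\infty(\mathbb{R}^d_v))} \qquad [47] \]
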
From interpolation and duality arguments follows: 


Proposition 1 The macroscopic density $\rho$ defined
by [45] satisfies the inequality

\[ \|\rho\|_{L^q(\mathbb{R}_t;\, L^p(\mathbb{R}^d_x))} \le C(d)\, \|f^0\|_{L^a(\mathbb{R}^d_x\times\mathbb{R}^d_v)} \qquad [48] \]

for any choice of real numbers $a$, $p$, and $q$ such that

\[ 1 \le p < \frac{d}{d-1}, \qquad \frac{2}{q} = d\Big(1 - \frac1p\Big), \qquad 1 \le a = \frac{2p}{p+1} < \frac{2d}{2d-1} \qquad [49] \]

The values $a = 1$, $p = 1$, and $q = \infty$ are obvious.
The other limiting values are the interesting ones.
They are given by $p = d/(d-1)$, that is, $p = d'$; then
$q = 2$ and $a = 2d/(2d-1)$.
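
The dispersion lemma can be illustrated numerically. The sketch below is only an example under stated assumptions: dimension $d = 1$, a Gaussian initial datum $f^0(x,v) = e^{-x^2 - v^2}$ chosen for convenience, and a plain Riemann sum for the velocity integral. The names `density` and `f0` are hypothetical. For this datum $\sup_v f^0(y,v) = e^{-y^2}$, so the right-hand side of the estimate is $\sqrt{\pi}/t$.

```python
import numpy as np


def density(f0, x, t, v):
    """rho(x, t) = ∫ f0(x - v t, v) dv, by a Riemann sum on the uniform grid v."""
    dv = v[1] - v[0]
    return f0(x[:, None] - t * v[None, :], v[None, :]).sum(axis=1) * dv


def f0(x, v):
    """Hypothetical initial datum: a Gaussian in both x and v (d = 1)."""
    return np.exp(-x**2 - v**2)


x = np.linspace(-30.0, 30.0, 601)
v = np.linspace(-8.0, 8.0, 4001)

# Norm appearing on the right-hand side of the dispersion estimate:
# the integral over y of sup_v f0(y, v) = exp(-y**2) is sqrt(pi).
bound_norm = np.sqrt(np.pi)

for t in (1.0, 2.0, 5.0):
    rho = density(f0, x, t, v)
    # Left column: sup of the density; right column: the bound t**(-d) * norm.
    print(t, rho.max(), bound_norm / t)
```

For this particular datum the density is explicit, $\rho(x,t) = \sqrt{\pi/(1+t^2)}\, e^{-x^2/(1+t^2)}$, so its supremum $\sqrt{\pi/(1+t^2)}$ stays below $\sqrt{\pi}/t$ and approaches it for large $t$, in agreement with the $t^{-d}$ decay.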

These inequalities carry the name of Strichartz
inequalities because they are very similar to classical
inequalities obtained by Strichartz for the solution of
the free Schrödinger equation. This should not be
surprising since the Wigner transform of the densities

\[ f(x,v,t) = \frac{1}{(2\pi)^d} \int e^{-i y\cdot v}\, \psi\Big(x + \frac12 y, t\Big) \otimes \bar\psi\Big(x - \frac12 y, t\Big)\,dy \qquad [50] \]

then turns out to be a solution of the transport
equation

\[ \partial_t f + v\cdot\nabla_x f = 0 \qquad [51] \]


However, the estimates for kinetic equations are not
easily translated into estimates for the Schrödinger
equation because the properties of the initial data in
terms of norms cannot be simply estimated in terms of
the inverse Wigner transform. Spaces with Fourier
transform in $L^p$, $p \ne 2$, are not easy to characterize and
not natural for the Schrödinger equation. The above
estimates have been very useful in analyzing the large- 
time behavior of solutions and also in proving the 
regularity of the three-dimensional Vlasov equation. 


The Entropy and Entropy Dissipation 


For solutions of the Boltzmann equation the
Boltzmann $H$ function

\[ H(f) = \iint f(x,v)\,\log f(x,v)\,dx\,dv \]

decreases in time and the same is true for the
relative entropy to an absolute Maxwellian $M(v) = (2\pi)^{-3/2} e^{-|v|^2/2}$:

\[ H(F|M) = \iint \Big( F \ln\frac{F}{M} - F + M \Big)\,dx\,dv \]


This leads to the systematic introduction in the theory 
of the notion of relative entropy. It turned out to be 
instrumental in proving relaxation toward equilib- 
rium of solutions of kinetic (or similar) equations 
and for the analysis of hydrodynamical limits. 

A striking example considered by Desvillettes and
Villani is the linearized Fokker-Planck equation in
any space dimension:

\[ \partial_t F + v\cdot\nabla_x F - \nabla_x V(x)\cdot\nabla_v F = \nabla_v\cdot(\nabla_v F + F v) \qquad [52] \]


When $x \mapsto V(x)$ is a smooth potential strictly convex
at infinity, this system has a unique steady state 
given by the relation 
Fs (x, ese T Mp) =e m sd [53] 
xo (x,v) =e ""M(v)—-e '"*'—— à 
For any solution of [52] one has 
F 
O,H(F|M) + J F|V,log—| dxdv-—0 [54] 
RH x RH M 


which says that the entropy dissipation is the 
relative Fisher information (with respect to v) of F. 
Now, to study the relaxation to equilibrium, one 
uses the logarithmic Sobolev inequality: 
H(F|M) < Ji F|V, lo : 'dxdv [55] 
E. 2 RIRI v 5M 
Details, references, and extensions can be found in 


Arnold et al. (2004). 

Conclusions


Kinetic equations have been studied since the end of 
the nineteenth century, both from the physical and 
mathematical points of view, but it seems that since 
the middle of the last century the interest in this 
approach has considerably increased. 

The fact that these equations are well adapted to the 
description of media which have not “thermalized” 
(because they are too rarefied or because the domain 
where they evolve is too small) has been a basic reason 
for their use in many applied fields; to the ones already 
quoted one may add the analysis of the air between the 
reading head and a compact disk, the computations of 
the characteristics of an ionic motor, and many others. 

As a consequence, mathematical progress has 
been very important. Without going into the details, 
this contribution is focused on this, and in particular 
on what can be obtained by the deterministic 
approach and where the introduction of randomness 
seems compulsory. 

The kinetic formulation turned out to be well 
adapted to large-scale computers, in particular with 
Monte Carlo simulations. One should observe that 
the point of view of modern functional analysis 
contributes stability estimates to the understanding 
and improvement of numerical methods. For an 
introduction to such numerical methods, the reader 
should first concentrate on the Boltzmann equation 
itself, which has been one of the basic motivations; 
consult the book of Sone (2002), the references
therein, and in particular the book of Bird (1994).


See also: Boltzmann Equation (Classical and Quantum); 
Breaking Water Waves; Einstein's Equations with Matter; 
Fourier Law; Interacting Stochastic Particle Systems; 

Introduction


A knot homology is a theory which assigns to a knot
$K$ (or link $L$) in $S^3$ a graded homology group whose
graded Euler characteristic is a knot polynomial
associated to $K$. In all known examples, the knot
polynomials in question are specializations of the
HOMFLY polynomial $P_K(a, q)$, which we take to be
determined by the skein relation

\[ a\,P(X_1) - a^{-1} P(X_2) = (q - q^{-1})\,P(X_0) \qquad [1] \]

(here $X_1$ and $X_2$ are link diagrams that differ by a single
crossing change, and $X_0$ is the oriented resolution of that crossing)
and normalized so that $P$ of the unknot is equal to 1.
Let $P_N(K)$ be the specialization of $P_K$ given by

\[ P_N(K) = P_K(q^N, q) \qquad [2] \]

Then for each $N > 0$, there is a bigraded knot
homology $H_N^{i,j}(K)$, which satisfies

\[ P_N(K) = \sum_{i,j} (-1)^i q^j \dim H_N^{i,j}(K) \qquad [3] \]

We refer to the first grading $i$ as the homological
grading, and the second grading $j$ as the polynomial
or $q$-grading.
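
The passage from a bigraded homology to its knot polynomial is a simple bookkeeping step, sketched below. The function name `graded_euler_characteristic` and the dictionary layout are hypothetical. The sample table lists ranks of the reduced Khovanov homology of a trefoil in one common normalization, which is an assumption here: conventions in the literature differ by mirroring and grading shifts, and may not match the normalization of this article.

```python
from collections import defaultdict


def graded_euler_characteristic(dims):
    """Return {j: sum over i of (-1)**i * dim H^{i,j}} for a table dims[(i, j)] = dimension."""
    coeffs = defaultdict(int)
    for (i, j), dim in dims.items():
        sign = -1 if i % 2 else 1          # (-1)**i, valid for negative i as well
        coeffs[j] += sign * dim
    # Drop q-gradings whose contributions cancel.
    return {j: c for j, c in sorted(coeffs.items()) if c != 0}


# Ranks of the reduced Khovanov homology of a trefoil in one common
# normalization (an assumption; conventions differ by mirroring and shifts).
trefoil = {(0, 2): 1, (2, 6): 1, (3, 8): 1}
print(graded_euler_characteristic(trefoil))   # {2: 1, 6: 1, 8: -1}
```

The result encodes the polynomial $q^2 + q^6 - q^8$, a form of the Jones polynomial of the trefoil. The homology itself keeps strictly more information, since it also records the homological degree of each generator.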

The idea of a knot homology was introduced by
Khovanov (2000) in a seminal paper, in which he
defined the homology theory corresponding to the
Jones polynomial ($N = 2$). In subsequent work, he
defined such a theory for $N = 3$, and then, in
collaboration with Rozansky, for any $N > 0$.
Recently, the two authors have introduced a triply
graded homology theory $\mathcal{H}^{i,j,k}(K)$ whose graded
Euler characteristic gives the entire HOMFLY
polynomial:

\[ P_K(a,q) = \sum_{i,j,k} (-1)^i q^j a^k \dim \mathcal{H}^{i,j,k}(K) \qquad [4] \]

All of these theories are combinatorial in nature.

In contrast, the knot homology for $N = 0$ arises
from a very different source: the Heegaard Floer
homology of Ozsváth and Szabó. This theory traces
its roots back to invariants of 3- and 4-manifolds
defined using Seiberg-Witten and Donaldson theory.
The definition of $H_0(K)$ is not combinatorial, but
because of its connections with these invariants, the
theory is known to carry a good deal of geometric
information about the knot $K$. The interplay
between the two apparently different sorts of knot
homologies ($N > 0$ and $N = 0$) has enhanced our
understanding of both sides.

This article will mostly focus on the cases $N = 0$
and $N = 2$, which are the oldest and best-studied
examples of knot homologies and are related to the
two best-known specializations of the HOMFLY
polynomial: the Alexander and Jones polynomials.
We have chosen to use a uniform notation to
emphasize the similarities between theories, but the
reader should be aware that other notation is more
common in the literature. $H_0$ is often referred to as
the knot Floer homology (written $\widehat{HFK}$), and is
usually normalized with a polynomial grading of
$j' = j/2$, corresponding to the substitution $t = q^2$,
which gives the standard normalization of the
Alexander polynomial. $H_2$ is generally called the
reduced Khovanov homology, and often denoted by
$\widetilde{Kh}$ or $Kh_{\mathrm{red}}$.


Construction 


Seen from a distance, all knot homologies are
defined in much the same way. Given a knot $K$, we
must first choose some additional data $D$ which
give a concrete geometric presentation of the knot.
Using this data, we write down a bigraded chain
complex $(C_N^{i,j}(D), d_N)$. This complex depends on
our initial choice of $D$, but when we take
homology, we are left with groups $H_N^{i,j}(K)$ which
are invariants of the knot $K$ (cf. the simplicial
homology of a topological space $X$, where the
chain groups depend on the choice of some initial
geometric data, a triangulation of $X$, but the
homology groups are invariants of $X$).

In all cases, the generators of $C_N^{i,j}(D)$ correspond
naturally to terms which appear in a classical model
for computing $P_N(K)$. In other words, we can write

\[ P_N(K) = \sum_{\sigma\in S} (-1)^{i(\sigma)} q^{j(\sigma)} \qquad [5] \]

where the sum runs over a set of states $S$ determined
by $D$, and the functions $i$ and $j$ are also determined
by $D$. $C_N^{i,j}(D)$ is the free abelian group generated by
$\{\sigma \in S \mid i(\sigma) = i,\ j(\sigma) = j\}$ and the differential $d_N$ is
chosen to preserve the $j$-grading: $j(d_N x) = j(x)$. It
follows that $C_N(D)$ decomposes into an infinite
direct sum of complexes, one for each value of $j$, and
[3] is a consequence of [5].

Beyond these global similarities, the definition of
$C_N(D)$ varies with the value of $N$. In the second half
of the article, we give explicit details of the
constructions for $N = 0$ and $N = 2$.


Filtered Complexes and Deformations 


An important characteristic shared by all the $C_N$’s is
the existence of deformations with homology $\mathbb{Z}$.
Recall that $(C_N(D), d_N)$ is a graded chain complex:
$j(d_N x) = j(x)$. By a deformation of such a complex,
we mean a new chain complex $(C_N(D), d_N + d'_N)$ in
which the underlying group remains the same, but
the differential has been perturbed by the addition of
a new term $d'_N$ which strictly raises the $j$-grading:
$j(d'_N(x)) > j(x)$.

Any deformation of a graded complex is naturally
a filtered complex, and as such, gives rise to a
spectral sequence. The $E_0$ term of this spectral
sequence is the original unperturbed complex
$(C_N(D), d_N)$, so the underlying group of the $E_1$
term is just $H_N(K)$. Thus, it is independent of the
choice of initial data $D$. In fact, it can often be
shown that all terms in the spectral sequence beyond
the first one are invariants of $K$. This is known to
be the case for $N = 0$ and $N = 2$, and is most likely
true for all other $N$ as well (cf. the Leray-Serre
sequence associated to a fibration, where the first
two terms depend on a choice of geometric data but
the $E_2$ and higher terms are all invariants of the
fibration).

For each value of $N$, $C_N(D)$ admits a natural
deformation whose homology is $\mathbb{Z}$ in homological
grading 0, and zero in every other grading. When
$N = 0, 2$, the filtration grading of this generator is
known to be an invariant of $K$. (This is probably the
case for $N > 2$ as well.) Equivalently, this is the
$j$-grading of the surviving copy of $\mathbb{Z}$ in the spectral
sequence. When $N = 0$, this invariant is conventionally
normalized to be half the $j$-grading of the
generator, and is called $\tau(K)$. When $N = 2$, it is
called $s(K)$.


Geometric Properties 


Some elementary properties of the $H_N$’s generalize
those of the HOMFLY polynomial. If $K_1 \# K_2$
denotes the connected sum of $K_1$ and $K_2$, then over $\mathbb{Q}$

\[ H_N(K_1 \# K_2) \cong H_N(K_1) \otimes H_N(K_2) \qquad [6] \]

and if $\bar K$ is the mirror image of $K$,

\[ H_N^{i,j}(\bar K) \cong H_N^{-i,-j}(K) \qquad [7] \]

Moreover, $H_0$ satisfies an additional symmetry

\[ H_0^{i,j}(K) \cong H_0^{i-j,-j}(K) \qquad [8] \]

generalizing the symmetry of the Alexander polynomial:
$P_0(q) = P_0(q^{-1})$. (With integer coefficients,
these equalities all hold at the chain level. The
correct statements about the homology can be
obtained from the Künneth formula and universal
coefficient theorem.)

$H_N(K)$ also contains deeper information related to
the genus of surfaces bounding $K$. If $K$ is a knot in
$S^3$, recall that $g(K)$, the Seifert genus of $K$, is the
minimal genus of an orientable surface smoothly
embedded in $S^3$ and bounding $K$. If we view $S^3$ as
the boundary of the 4-ball $B^4$, we can define a
second quantity $g_*(K)$, the slice genus, by relaxing
the requirement that the surface be embedded in $S^3$
and instead requiring it to be embedded in $B^4$.

Both $s(K)$ and $\tau(K)$ give lower bounds on the slice
genus of $K$:

\[ |\tau(K)| \le g_*(K) \qquad [9] \]

\[ |s(K)| \le 2 g_*(K) \qquad [10] \]

These bounds are far from independent. In fact, in
all known examples, $s(K) = 2\tau(K)$. It is an open
problem to determine whether this is true for all
knots.

From [6], it follows that $s$ and $\tau$ are additive
under connected sum. Thus, both invariants define
homomorphisms from the concordance group of
knots in $S^3$ to $\mathbb{Z}$. The inequalities in eqns [9] and [10]
are not always sharp, but there is one case where
equality is known to hold. This is when $K$ is
represented by a diagram with all positive crossings
(or, more generally, $K$ is quasipositive). In this case,
the slice genus is also equal to the Seifert genus, and
all three are easily computed using Seifert's
algorithm.

The proof of [10] depends on the fact that
for $N > 0$, $H_N$ is functorial in the following sense.
If $S \subset S^3 \times [0,1]$ is a smoothly embedded, orientable
cobordism between links $L_1$ and $L_2$, then for each
$N > 0$, there is an induced map $\phi_S \colon H_N(L_1) \to H_N(L_2)$.
$\phi_S$ is a graded map: it preserves the
homological grading, and lowers the $j$-grading by
$(N - 1)\chi(S)$. Under deformation, it becomes a
filtered map which induces a rational isomorphism
on the deformed homologies.


$H_0$ and Heegaard Floer Homology


The proof of [9] depends on the close connection
between the knot Floer homology and the Heegaard
Floer homology. Roughly speaking, the Heegaard
Floer groups of 3-manifolds obtained by surgery on
$K$ are determined by the groups $H_0^{i,j}(K)$ together
with additional differentials obtained by relaxing the
requirement that $n_z(\phi) = n_w(\phi) = 0$. The relation
with the slice genus again arises by studying maps
induced by cobordisms, but in this case, the relevant
cobordism is the surgery cobordism between $S^3$ and
the 0-surgery on $K$.

This connection also leads to another important
property of $H_0$: it detects the Seifert genus. If we let
$M(K)$ be the largest value of $j$ for which the group
$H_0^{i,j}(K)$ is nontrivial, then

$M(K) = 2g(K)$ [11]

This fact generalizes a well-known inequality involving
the degree of the Alexander polynomial: if
$m(K)$ is the largest power of $q$ appearing in $P_0(K)$,
then $m(K) \le 2g(K)$.


Computations 


The difficulty of computing $H_N(K)$ varies with the
value of $N$. When $N = 1$, the theory is essentially
trivial: $H_1^{0,0}(K) \cong \mathbb{Z}$ for any knot $K$, and all other
groups vanish. Of the remaining knot homologies,
$H_2(K)$ is the easiest to compute. The theory for
alternating knots was worked out by E S Lee, and
extensive calculations have also been made for
nonalternating knots using computer programs
written by Bar-Natan and Shumakovitch.

Computing $H_0$ is more difficult, on account of the
noncombinatorial nature of $d_0$. Three families of
knots for which $H_0$ is well understood are alternating
knots, (1,1) knots (described in the next section),
and knots which admit lens space surgeries. Beyond
this, there is an array of techniques which may or
may not work in any given case. The best of these is
probably a setup introduced by Ozsváth and Szabó,
in which the generators of $C_0(D)$ correspond to
states in the Kauffman state model of the Alexander
polynomial. Combining this method with the known
results for alternating knots and (1,1) knots gives a
fairly good understanding of $H_0(K)$ for knots with
10 or fewer crossings; for larger knots, relatively
little is known.

Few computations of $H_N$ for $N > 2$ have been
made, although the definition in this case is purely
combinatorial.


Thin and Thick Knots 


For simple knots, both $H_0$ and $H_2$ are thin. This
means that there exists a constant $c_N(K)$ ($N = 0, 2$)
such that $H_N^{i,j}(K)$ is trivial unless $j - 2i = c_N(K)$. In
such cases, we necessarily have $c_0(K) = 2\tau(K)$ (resp.
$c_2(K) = s(K)$), and $H_N(K)$ is completely determined
by $c_N(K)$ and $P_N(K)$. The relationship is best
expressed in terms of the Poincaré polynomial of
$H_N(K)$:


$\mathcal{P}_N(K) = \sum_{i,j} t^i q^j \dim H_N^{i,j}(K) = (-t)^{-c_N(K)/2}\, P_N(K)\big((-t)^{1/2} q\big)$ [12]
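
As a small illustration of how thinness lets $H_N(K)$ be read off from $P_N(K)$ and $c_N(K)$, the following Python sketch places the coefficient of $q^j$ in homological degree $i = (j - c_N(K))/2$. The function name `thin_homology`, the dictionary encoding of the polynomial, and the value $c = 2$ used for the trefoil are assumptions made for illustration only; the sign of $c$ depends on orientation conventions.

```python
def thin_homology(poly, c):
    """Ranks of a thin homology, assuming it is supported on j - 2i = c.

    poly : dict {j: coefficient of q**j in the graded Euler characteristic}
    c    : the constant c_N(K) (an input here, not computed)
    Returns {(i, j): rank}.
    """
    ranks = {}
    for j, a in poly.items():
        if a == 0:
            continue
        if (j - c) % 2 != 0:
            raise ValueError("j - c must be even on the supporting diagonal")
        i = (j - c) // 2
        sign = 1 if i % 2 == 0 else -1  # (-1)**i, valid for negative i too
        # Only one group contributes to q**j, so its rank is (-1)**i * a.
        if sign * a < 0:
            raise ValueError("coefficients are not compatible with thinness")
        ranks[(i, j)] = abs(a)
    return ranks


# Alexander polynomial of the trefoil, q^2 - 1 + q^{-2}, with illustrative c = 2.
trefoil = thin_homology({2: 1, 0: -1, -2: 1}, 2)
print(trefoil)
```

The sign check reflects the fact that, for a thin knot, the coefficients of $P_N(K)$ must alternate in sign along the supporting diagonal.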


If $K$ is an alternating knot, both $H_0(K)$ and $H_2(K)$
are thin, and $c_0(K) = c_2(K) = \sigma(K)$. (Note that in this
case the bound on $g_*(K)$ coming from $\tau$ and $s$
coincides with the classical bound coming from the
signature.) Many nonalternating knots are thin as
well; in all examples in which both groups have
been computed, either both $H_0(K)$ and $H_2(K)$ are
thin, or neither is. In addition, all such knots appear
to have $c_0(K) = c_2(K) = \sigma(K)$.

Those knots whose homologies are not thin are
called thick. There are a dozen such knots with ten
or fewer crossings: using the standard numbering in
the knot tables (see, e.g., Rolfsen (1976)) these are
$8_{19}$, $9_{42}$, $10_{124}$, $10_{128}$, $10_{132}$, $10_{136}$, $10_{139}$, $10_{145}$, $10_{152}$,
$10_{153}$, $10_{154}$, and $10_{161}$. It is a curious and as yet
unexplained coincidence that, for all of these knots,
the ranks of $H_0(K)$ and $H_2(K)$ are equal.

There is an analogous notion of thinness when
$N > 2$, but there exist alternating knots for which
$H_N$ cannot be thin for $N > 0$ (this can be seen from
the HOMFLY polynomials).


Construction of $H_0$


We now turn to a more detailed description of the
definition of $H_0(K)$. The geometric data $D$ used to
define $C_0$ is a Heegaard diagram for the complement
of $K$. One convenient way to specify such a diagram
is by a doubly pointed Heegaard diagram of $S^3$. The
data for such a diagram consist of a surface $\Sigma$ of
genus $g$, two $g$-tuples of attaching circles $\{\alpha_1, \ldots, \alpha_g\}$
and $\{\beta_1, \ldots, \beta_g\}$ on $\Sigma$, and two points $z, w \in \Sigma$
which are disjoint from all the $\alpha$'s and $\beta$'s. Each set
of attaching circles is composed of $g$ disjoint simple
closed curves, arranged so that when $\Sigma$ is cut along
them the result is a sphere with $2g$ holes. Any such
set of attaching circles determines a unique genus-$g$
handlebody $H$ with boundary $\Sigma$ and the property
that each attaching circle bounds a disk in $H$.

Figure 1 Heegaard splitting of $S^3$ corresponding to the
standard decomposition of $S^3$ into two solid tori.

The choice of $\alpha$ and $\beta$ curves determines the
underlying 3-manifold in which the knot is
embedded. Starting with $\Sigma \times [0,1]$, we fill in
one component of the boundary with the handlebody
determined by the $\alpha$-curves, and the other
component with the handlebody determined by the
$\beta$-curves to obtain a closed 3-manifold. By hypothesis,
this manifold is required to be $S^3$. A simple
Heegaard diagram of $S^3$ with $g = 1$ is shown in
Figure 1.

To go from a doubly pointed Heegaard diagram
to a diagram of the knot complement, we remove
neighborhoods of $z$ and $w$ and replace them with a
tube to get a surface $\Sigma'$ of genus $g + 1$. We also add
an additional $\alpha$-handle $\alpha_{g+1}$, which runs from $z$ to $w$
in $\Sigma$ in such a way that it does not intersect the
other $\alpha$'s, and then comes back over the tube. This
process is illustrated in Figure 2.

Figure 2 Going from a doubly pointed diagram to a Heegaard
diagram of the knot complement.

Figure 3 (a), (b) Doubly pointed Heegaard diagrams for the unknot
and the trefoil. Opposite sides of the square are identified to form
a torus. The dotted line represents $\alpha_2$.

A Heegaard diagram of $S^3 - K$ determines a
presentation of $\pi_1(S^3 - K)$ with one generator $x_i$
for each $\alpha$-circle and one relator $w_j$ for each $\beta$-circle.
To find the relator $w_j$, one travels along $\beta_j$,
recording each intersection with some $\alpha_i$ by appending
$x_i^{\pm 1}$ to the relator. The sign is determined by the
sign of the intersection. As an example, consider the
two doubly pointed diagrams of Figure 3, both of
which correspond to the same Heegaard diagram of
$S^3$. (It is isotopic to the one shown in Figure 1.) The
fundamental groups of the associated knot complements
can be read off from the corresponding genus-2
Heegaard splittings. Starting from the point where
$\beta_1$ intersects the left-hand side of the square and
moving to the right, we get


$\pi_1(S^3 - K_1) = \langle x_1, x_2 \mid x_1 x_1^{-1} x_1 \rangle$
$\pi_1(S^3 - K_2) = \langle x_1, x_2 \mid x_2 x_1 x_2^{-1} x_1^{-1} x_2^{-1} x_1 \rangle$

The first group is isomorphic to $\mathbb{Z}$, and the knot in
Figure 3a is the unknot. The second is isomorphic to
$\pi_1$ of the complement of the trefoil knot, and in fact
the knot in Figure 3b is the left-handed trefoil.

The definition of $C_0(D)$ is based on a classical
method for computing the Alexander polynomial
known as the Fox calculus, which takes as its input
a presentation of $\pi_1(S^3 - K)$. According to Fox
calculus,

$P_0(K) = \pm q^n \det(d_{x_i} w_j)_{1 \le i,j \le g}$ [13]

Here $d_{x_i} w_j$ is an element of the group ring
$\mathbb{Z}[H_1(S^3 - K)] \cong \mathbb{Z}[q^{\pm 2}]$.


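It is determined by the following rules:

$d_{x_i} x_j = \delta_{ij}$ [14]
$d_{x_i}(ab) = d_{x_i} a + |a|\, d_{x_i} b$ [15]
$d_{x_i} x_i^{-1} = -|x_i|^{-1}$ [16]

where

$|\cdot| : \pi_1(S^3 - K) \to H_1(S^3 - K) \cong \mathbb{Z} = \langle q^2 \rangle$ [17]

is the abelianization map. The factor of $\pm q^n$ is chosen
so that $P_0(K)(1) = 1$ and $P_0(K)(q) = P_0(K)(q^{-1})$.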

As an example, consider the two presentations
above. In the first presentation, $|\cdot|$ sends $x_1$ to $1$ and
$x_2$ to $q^2$, so

$d_{x_1}(x_1 x_1^{-1} x_1) = 1 - |x_1 x_1^{-1}| + |x_1 x_1^{-1}| = 1 - 1 + 1 = 1$ [18]

which is the Alexander polynomial of the unknot. If
we abelianize the relator in the second presentation,
we see that $|x_1| = |x_2| = q^2$, so

$d_{x_1}(x_2 x_1 x_2^{-1} x_1^{-1} x_2^{-1} x_1) = |x_2| - |x_2 x_1 x_2^{-1} x_1^{-1}| + |x_2 x_1 x_2^{-1} x_1^{-1} x_2^{-1}|$ [19]

$= q^2 - 1 + q^{-2}$ [20]


which is the Alexander polynomial of the trefoil. 
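
The two Fox calculus computations above can be checked mechanically. The following Python sketch applies rules [14]-[16] letter by letter to a relator and returns the abelianized derivative as a Laurent polynomial in $q$. The function name `fox_derivative`, the encoding of words as lists of (generator, sign) pairs, and the dictionary representation of polynomials are illustrative assumptions, not notation from the text.

```python
from collections import defaultdict


def fox_derivative(word, gen, weight):
    """Abelianized Fox derivative d_gen(word) as {exponent of q: coefficient}.

    word   : list of (generator, +1 or -1) letters, read left to right
    gen    : the generator to differentiate with respect to
    weight : dict generator -> e, meaning |generator| = q**e
    """
    result = defaultdict(int)
    prefix = 0  # exponent of q in the abelianization of the letters read so far
    for g, sign in word:
        if g == gen:
            if sign == 1:
                result[prefix] += 1              # rules [14], [15]: + |prefix|
            else:
                result[prefix - weight[g]] -= 1  # rule [16]: - |prefix * g^{-1}|
        prefix += sign * weight[g]
    return {e: c for e, c in result.items() if c != 0}


# First presentation: relator x1 x1^{-1} x1, with |x1| = 1 and |x2| = q^2.
unknot = fox_derivative([("x1", 1), ("x1", -1), ("x1", 1)], "x1",
                        {"x1": 0, "x2": 2})

# Second presentation: relator x2 x1 x2^{-1} x1^{-1} x2^{-1} x1, |x1| = |x2| = q^2.
trefoil_word = [("x2", 1), ("x1", 1), ("x2", -1),
                ("x1", -1), ("x2", -1), ("x1", 1)]
trefoil = fox_derivative(trefoil_word, "x1", {"x1": 2, "x2": 2})

print(unknot)   # {0: 1}, i.e. the polynomial 1
print(trefoil)  # {2: 1, 0: -1, -2: 1}, i.e. q^2 - 1 + q^{-2}
```

Each nonzero term produced by the loop corresponds to one appearance of $x_1$ in the relator, which is the bookkeeping used below to identify monomials with intersection points.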

When $g = 1$, the complex $C_0(D)$ is generated by
the points of $\alpha_1 \cap \beta_1$. These intersection points may
be naturally identified with the appearances of the
generator $x_1$ in $w_1$, and thus with the monomials
appearing in $d_{x_1} w_1$. For example, the three monomials
which appear on the right-hand sides of eqns
[18] and [19] correspond, respectively, to the points
labeled $p_1$, $p_2$, and $p_3$ in Figure 3. The $j$-grading of
each generator is given by the exponent of $q$ which
the corresponding monomial contributes to the
Alexander polynomial. Thus, all three generators in
Figure 3a have $j$-grading 0, while in Figure 3b, the
generators $p_1$, $p_2$, and $p_3$ have $j$-gradings 2, 0, and $-2$
respectively.

For general $g$, the monomials appearing in the
determinant of eqn [13] correspond to intersection
points of the two totally real tori $T_\alpha = \alpha_1 \times \cdots \times \alpha_g$
and $T_\beta = \beta_1 \times \cdots \times \beta_g$ inside the symmetric product
$\mathrm{Sym}^g \Sigma$. The knot Floer homology is the Lagrangian
Floer homology of $T_\alpha$ and $T_\beta$ inside the
symplectic manifold $\mathrm{Sym}^g(\Sigma - z - w)$. The generators
of $C_0(D)$ are the points of $T_\alpha \cap T_\beta$; the
differential is defined by counting holomorphic
disks with boundary on $T_\alpha$ and $T_\beta$. To be precise, for
$x \in T_\alpha \cap T_\beta$,

$d_0 x = \sum_{y} \; \sum_{\phi \in \pi_2(x,y),\ \mu(\phi) = 1,\ n_z(\phi) = n_w(\phi) = 0} \#\mathcal{M}(\phi)\, y$ [21]


Here $\pi_2(x, y)$ denotes the set of homotopy classes of
maps of the strip $D = \{a + ib \mid a \in [0, 1]\}$ into $\mathrm{Sym}^g \Sigma$
which take the right-hand boundary to $T_\alpha$ and the
left-hand boundary to $T_\beta$, and which limit to $x$ as
$b \to -\infty$ and to $y$ as $b \to \infty$. $\mu(\phi)$ denotes the formal
dimension of the space of pseudoholomorphic disks
in this homotopy class. There is a natural action by
translation on the space of such maps, so when
$\mu(\phi) = 1$ we can divide out by this action and obtain
an oriented zero-dimensional moduli space $\mathcal{M}(\phi)$.
Finally, by $n_z(\phi)$ and $n_w(\phi)$ we denote the intersection
number of such a strip with the divisors
determined by $z$ and $w$ inside of $\mathrm{Sym}^g \Sigma$. The
requirement that they vanish forces the strip to lie
in $\mathrm{Sym}^g(\Sigma - z - w)$. It can be shown that, for
$\phi \in \pi_2(x, y)$,

$j(x) - j(y) = n_z(\phi) - n_w(\phi)$ [22]

so $j(d_0 x) = j(x)$.

When $g = 1$, computing the differential amounts
to counting maps of the strip into the Heegaard
torus. This can be done algorithmically using the
Riemann mapping theorem, so computation of $H_0$ is
purely combinatorial. Knots of this form are called
(1,1) knots. They are one of our few windows into
the behavior of $H_0$ for large knots.

As an example, consider the diagram of Figure 3a.
The two shaded regions represent the domains
of classes $\phi_1 \in \pi_2(p_1, p_2)$ and $\phi_2 \in \pi_2(p_3, p_2)$.
The Riemann mapping theorem implies that up
to reparametrization, there is a unique holomorphic
map of the strip into each region, so
$\#\mathcal{M}(\phi_1) = \pm 1 = \#\mathcal{M}(\phi_2)$. The differential in
$C_0(D_1)$ is given by

$d_0(p_1) = \pm p_2 = d_0(p_3)$
$d_0(p_2) = 0$

and $H_0(U) \cong \mathbb{Z}$. This reflects the fact that we could
have chosen the more efficient diagram of $S^3 - U$
shown in Figure 1, simply by moving $\beta_1$ to remove
two of the intersection points.

For comparison, consider the diagram for the
trefoil shown in Figure 3b. All three generators of
$C_0(D_2)$ have different $j$-gradings, so we must have
$d_0 = 0$. Thus, $H_0(T) \cong \mathbb{Z}^3$. The two disks $\phi_1$ and $\phi_2$
are still present, but now $n_z(\phi_1) = n_w(\phi_2) = 1$, so
neither disk contributes to the differential. This is
reflected in the fact that $\beta_1$ cannot be moved to
reduce the number of intersection points without
passing through either $z$ or $w$.
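
The ranks in these two examples can be confirmed with a few lines of linear algebra over $\mathbb{Q}$. In the Python sketch below, the differential is stored as a square matrix whose columns are the images of $p_1, p_2, p_3$; the total rank of the homology of a complex with $d^2 = 0$ is then $\dim C - 2\,\mathrm{rank}(d)$. The helper names `rank` and `total_homology_rank` are hypothetical, and both signs in $d_0(p_1) = \pm p_2 = d_0(p_3)$ are taken to be $+$ purely for illustration (the rank does not depend on this choice).

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix over Q by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((k for k in range(r, len(rows)) if rows[k][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):
            factor = rows[k][col] / rows[r][col]
            rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r


def total_homology_rank(d):
    """dim ker(d) - rank(d) for a square matrix d with d*d = 0."""
    return len(d) - 2 * rank(d)


# Basis order (p1, p2, p3); column k holds the image of the k-th generator.
d_unknot = [[0, 0, 0],
            [1, 0, 1],   # d0(p1) = p2 and d0(p3) = p2 (signs assumed +)
            [0, 0, 0]]
d_trefoil = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]  # d0 = 0 for grading reasons

print(total_homology_rank(d_unknot))   # 1
print(total_homology_rank(d_trefoil))  # 3
```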


Deformations 


In this case, finding an appropriate deformation of
$C_0(D)$ is simple: we just drop the condition that
$n_z(\phi) = 0$ in the definition of the differential. If a
homotopy class $\phi \in \pi_2(x, y)$ contributes nontrivially
to the sum, it must have a holomorphic representative,
which necessarily intersects the divisor in $\mathrm{Sym}^g \Sigma$
defined by $z$ non-negatively. Thus, $n_z(\phi) \ge 0$. From
[22], it follows that $j(x) - j(y) = n_z(\phi) \ge 0$, so this
new differential has the form $d_0 + d_0'$, where $d_0'$
strictly lowers the $j$-grading.

The fact that the homology of $C_0(D)$ with respect
to the perturbed differential is $\mathbb{Z}$ goes back to the
knot Floer homology's roots in Heegaard Floer
homology. By dropping the condition that
$n_z(\phi) = 0$, we have effectively forgotten about the
basepoint $z$, and thus about the knot. The new
complex simply computes the Heegaard Floer group
$\widehat{HF}(S^3)$, which is isomorphic to $\mathbb{Z}$. When $g = 1$, this
can be seen directly: if we remove the basepoint $z$,
any genus-1 Heegaard diagram of $S^3$ can be isotoped
into the standard diagram of Figure 1.


Construction of $H_2$


In this case, the geometric data $D$ needed to define
the chain complex $C_2(D)$ is a planar diagram of
the knot, and the classical model on which the
construction of $C_2(D)$ is based is the Kauffman state
model for the Jones polynomial. There is a related
homology theory $H_2(D)$, known as the unreduced
Khovanov homology, whose graded Euler characteristic
is $(q + q^{-1}) P_2(K)$. This is the original
categorification of the Jones polynomial defined in
Khovanov (2000).

To construct $C_2(D)$, we consider complete resolutions
of the planar diagram $D$. As shown in Figure 4,
there are two different ways to resolve each crossing
of $D$. If $D$ has $n$ crossings, there will be $2^n$ ways to
resolve all $n$, one for each vertex of the cube $[0, 1]^n$.
To a vertex $v$, we associate the crossingless planar
diagram $D_v$ obtained from the corresponding resolution
of $D$. Thus, each vertex of the cube is
decorated by a 1-manifold $D_v$.

If $e$ is an edge joining vertices $v_0$ and $v_1$ (where $v_0$
has one more 0 coordinate than $v_1$), we write
$e : v_0 \to v_1$, and decorate $e$ with a two-dimensional
cobordism $S_e$ from $D_{v_0}$ to $D_{v_1}$. $S_e$ is a product
cobordism outside a neighborhood of a single
crossing, where it is the one-handle cobordism
between the 0-resolution and the 1-resolution. The
resulting cobordism is necessarily composed of
a union of product cobordisms (cylinders) together
with a single nontrivial cobordism (a pair of pants).
Thus, starting from $D$, we have constructed an
$n$-dimensional cube whose vertices are decorated by
1-manifolds and whose edges are decorated by
cobordisms between them. This is the cube of
resolutions of $D$.

The next step in the construction of $C_2(D)$ is to
apply a graded (1 + 1)-dimensional TQFT $\mathcal{A}$ to the
cube of resolutions. $\mathcal{A}$ is a functor which associates
to each 1-manifold $X$ a group $\mathcal{A}(X)$, and to each
two-dimensional cobordism $W : X_1 \to X_2$ a
homomorphism $\mathcal{A}(W) : \mathcal{A}(X_1) \to \mathcal{A}(X_2)$. If we apply $\mathcal{A}$ to
all the manifolds and cobordisms of the cube of
resolutions, we obtain a new cube, decorated with
groups and homomorphisms between them. This process
is summarized in Table 1.

Figure 4 0- and 1-resolutions of a crossing.

Table 1 Summary of cube of resolutions

Vertex $v$  →  1-manifold $D_v$  →  Group $\mathcal{A}(D_v)$
Edge $e : v_0 \to v_1$  →  Cobordism $S_e : D_{v_0} \to D_{v_1}$  →  Homomorphism $\mathcal{A}(S_e) : \mathcal{A}(D_{v_0}) \to \mathcal{A}(D_{v_1})$

We can now describe the chain complex $C_2(D)$.
As a group,

$C_2(D) = \bigoplus_v \mathcal{A}(D_v)$ [23]

where the sum runs over all vertices of the cube of
resolutions. For $x \in \mathcal{A}(D_v)$, the differential is given by

$d_2 x = \sum_{e : v \to v'} (-1)^{\epsilon(e)} \mathcal{A}(S_e)(x)$ [24]

The signs in this sum are determined by assigning a
sign $(-1)^{\epsilon(e)}$ to each edge $e$ in such a way that every
two-dimensional face of the cube has an odd
number of $-$ signs on its edges. (This ensures that
$d_2^2 = 0$.) There are many ways to do this, but they all
result in isomorphic complexes.
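
One common way to meet the sign condition, given here as an illustrative choice rather than the one necessarily intended by the text, is to give the edge that changes coordinate $k$ of $v$ from 0 to 1 the sign $(-1)^{\#\{\text{1s in } v \text{ before position } k\}}$. The Python sketch below checks that, with this rule, the two paths around every two-dimensional face carry opposite signs, which is equivalent to the face having an odd number of $-$ signs. The function names are hypothetical.

```python
from itertools import combinations, product


def edge_sign(v, k):
    """Sign of the edge flipping coordinate k of v from 0 to 1.

    Assumed convention: (-1) to the number of 1s in v before position k.
    """
    return -1 if sum(v[:k]) % 2 else 1


def faces_anticommute(n):
    """True if every 2-dimensional face of the n-cube has an odd number of - signs."""
    for v in product((0, 1), repeat=n):
        zeros = [k for k in range(n) if v[k] == 0]
        for a, b in combinations(zeros, 2):
            va = v[:a] + (1,) + v[a + 1:]   # v with coordinate a set to 1
            vb = v[:b] + (1,) + v[b + 1:]   # v with coordinate b set to 1
            path_ab = edge_sign(v, a) * edge_sign(va, b)
            path_ba = edge_sign(v, b) * edge_sign(vb, a)
            # The product of the four edge signs is path_ab * path_ba; it must be -1.
            if path_ab * path_ba != -1:
                return False
    return True


print(all(faces_anticommute(n) for n in range(2, 6)))  # True
```

Because the cobordism maps around a face commute, opposite signs on the two paths make the two contributions to $d_2^2$ cancel.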

The homological grading $i$ on $C_2(D)$ is easily
determined. For $x \in \mathcal{A}(D_v)$, we set $i(x) = i(v) - c(D)$,
where $i(v)$ is the sum of all the coordinates of $v$, and
$c(D)$ is a constant. Clearly, $i(d_2 x) = i(x) + 1$. In order
to have invariance, it turns out that $c(D)$ must be
chosen to be equal to the number of negative
crossings in $D$.

It remains to specify the TQFT $\mathcal{A}$. At the level of
groups, $\mathcal{A}(S^1)$ is a free abelian group of rank 2:

$\mathcal{A}(S^1) = A = \langle 1, X \rangle$ [25]

General principles then imply that

$\mathcal{A}\left(\coprod^n S^1\right) = A^{\otimes n}$ [26]

To specify the maps induced by cobordisms, it is
enough to describe the maps associated to the two
pairs of pants shown in Figure 5. They are given by

$m : A \otimes A \to A$, $\Delta : A \to A \otimes A$

$m(1 \otimes 1) = 1$, $\Delta(1) = 1 \otimes X + X \otimes 1$ [27]

$m(1 \otimes X) = m(X \otimes 1) = X$, $\Delta(X) = X \otimes X$ [28]

$m(X \otimes X) = 0$ [29]

Figure 5 Maps induced by pairs of pants.

Note that the multiplication $m$ makes $A$ into a
commutative ring isomorphic to $\mathbb{Z}[X]/(X^2)$.

$\mathcal{A}$ is a graded TQFT. In other words, there is a
grading $q$ on $A$ and its tensor products, determined by

$q(1) = 1$, $q(X) = -1$ [30]

$q(a \otimes b) = q(a) + q(b)$ [31]

From eqns [27]-[29], it is easy to see that

$q(m(a \otimes b)) = q(a \otimes b) - 1$, $q(\Delta(a)) = q(a) - 1$ [32]
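
The degree statement for $m$ and the comultiplication can be verified directly on basis elements. The Python sketch below encodes eqns [27]-[29] as lookup tables and checks that every nonzero output term has grading one less than its input. The table names `M` and `DELTA`, the string labels for basis elements, and the helper functions are illustrative assumptions.

```python
# Gradings q(1) = 1 and q(X) = -1; a tensor product has the sum of the gradings.
DEG = {"1": 1, "X": -1}

# Multiplication m on basis elements: (a, b) -> {basis element: coefficient}.
M = {
    ("1", "1"): {"1": 1},
    ("1", "X"): {"X": 1},
    ("X", "1"): {"X": 1},
    ("X", "X"): {},          # m(X (x) X) = 0
}

# Comultiplication on basis elements: a -> {(b, c): coefficient}.
DELTA = {
    "1": {("1", "X"): 1, ("X", "1"): 1},
    "X": {("X", "X"): 1},
}


def degree(basis):
    """Grading of a basis element of A (a string) or of A (x) A (a pair)."""
    if isinstance(basis, str):
        return DEG[basis]
    return sum(DEG[b] for b in basis)


def lowers_degree_by_one(table):
    """True if every nonzero output term has grading equal to input grading - 1."""
    return all(degree(out) == degree(inp) - 1
               for inp, image in table.items()
               for out in image)


print(lowers_degree_by_one(M), lowers_degree_by_one(DELTA))  # True True
```

Since the homological grading rises by 1 along each edge while the $q$-grading drops by 1, a combination of the form $q(x) + i(x)$ plus a constant is preserved by the differential.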


If we define $j(x) = k(D) + q(x) + i(x)$, it follows that
$j(d_2 x) = j(x)$. Taking the graded Euler characteristic
gives

$\chi(C_2(D)) = q^{k(D)} \sum_v (-q)^{i(v) - c(D)} (q + q^{-1})^{n_v}$ [33]

where $n_v$ is the number of components of $D_v$. If we
define $k(D)$ to be the writhe of $D$, this is precisely
Kauffman's formula for the unnormalized Jones
polynomial.
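
To see the state sum in action, the following Python sketch evaluates the graded Euler characteristic $q^{k(D)} \sum_v (-q)^{i(v) - c(D)} (q + q^{-1})^{n_v}$ from a table of circle counts $n_v$. The inputs used for the example are assumptions made for illustration: a two-crossing diagram of the Hopf link with both crossings positive (so $k(D) = 2$, $c(D) = 0$), whose resolutions have 2, 1, 1, and 2 circles at the vertices 00, 01, 10, and 11. The function names are hypothetical, and the overall power of $q$ depends on the normalization chosen for $k(D)$.

```python
from collections import defaultdict


def laurent_mul(p, q):
    """Product of two Laurent polynomials stored as {exponent: coefficient}."""
    out = defaultdict(int)
    for e1, a in p.items():
        for e2, b in q.items():
            out[e1 + e2] += a * b
    return dict(out)


def laurent_pow(p, n):
    """n-th power of a Laurent polynomial, for n >= 0."""
    out = {0: 1}
    for _ in range(n):
        out = laurent_mul(out, p)
    return out


def euler_characteristic(circles, k, c):
    """State sum over cube vertices; circles maps each vertex tuple to n_v."""
    base = {1: 1, -1: 1}  # q + q^{-1}, the graded rank of A
    total = defaultdict(int)
    for v, n_v in circles.items():
        i = sum(v) - c                      # homological grading of the vertex
        sign = -1 if i % 2 else 1           # (-1)**i
        for e, coeff in laurent_pow(base, n_v).items():
            total[e + k + i] += sign * coeff   # j = k(D) + q(x) + i(x)
    return {e: a for e, a in total.items() if a != 0}


hopf_circles = {(0, 0): 2, (0, 1): 1, (1, 0): 1, (1, 1): 2}  # assumed resolution data
chi = euler_characteristic(hopf_circles, k=2, c=0)
print(sorted(chi.items()))  # [(0, 1), (2, 1), (4, 1), (6, 1)]
```

With these inputs the result factors as $(q + q^{-1})(q + q^5)$, so the contributions of the two middle vertices cancel against part of the outer ones, leaving four terms.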

Figure 6 illustrates $C_2(D)$ for a simple two-crossing
link. The figure shows the original link (in
the center), the cube of resolutions, and basis vectors
for $C_2(D)$, together with their $j$-gradings. We leave it
to the reader to check that the homology $H_2(L)$ is
four dimensional, supported in $j$-gradings 1 and 3 at
the vertex labeled 00, and in gradings 5 and 7 at the
vertex labeled 11.

Figure 6 The cube of resolutions for the Hopf link.

To get the reduced chain complex $C_2(D)$, we must
divide the graded Euler characteristic by a factor of
$(q + q^{-1})$. This is accomplished by choosing a
marked point on $K$ and requiring that for each
resolution $D_v$, the vector associated to the circle
containing the marked point lie in the subspace of $A$
spanned by $X$. If $D$ is a diagram of a knot, the
resulting homology $H_2(K)$ is independent of the
choice of marked point. For links, $H_2(L)$ depends on
the component of the link on which the marked
point lies.


Deformations 


Deformations in the $N = 2$ theory are constructed
using a technique introduced by E S Lee. The idea is
to replace the graded TQFT $\mathcal{A}$ with a filtered TQFT
$\mathcal{A}'$. As a group, we still have $\mathcal{A}'(S^1) = A$, but the
multiplication and comultiplication maps are
perturbations of those for $\mathcal{A}$:

$m'(1 \otimes 1) = 1$, $\Delta'(1) = 1 \otimes X + X \otimes 1 - r\, 1 \otimes 1$ [34]

$m'(1 \otimes X) = m'(X \otimes 1) = X$, $\Delta'(X) = X \otimes X + s\, 1 \otimes 1$ [35]

$m'(X \otimes X) = rX + s$ [36]

The new terms involving $r$ and $s$ have $q$-gradings
strictly greater than the terms which are shared with
eqns [27]-[29]. Thus, the differential defined by
replacing $m$ and $\Delta$ by $m'$ and $\Delta'$ will be a
perturbation of the original differential on $C_2(D)$.
The simplicity of the homology with respect to the
new differential depends on the fact that when the
polynomial $X^2 - rX - s$ has simple roots, the TQFT
$\mathcal{A}'$ decomposes as a direct sum of two
one-dimensional TQFTs. This implies that for a knot,
the deformed homology $H_2'(K)$ decomposes as a
direct sum of two copies of $H_1(K)$. This group is
always isomorphic to $\mathbb{Z}$, so $H_2'(K) \cong \mathbb{Z} \oplus \mathbb{Z}$. If $s = 0$,
the same strategy can be used to define deformations
of the reduced chain complex $C_2(D)$. In this case, we
find that the deformed homology is isomorphic to a
single copy of $\mathbb{Z}$.
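
The splitting of the deformed algebra when $X^2 - rX - s$ has simple roots can be made concrete over $\mathbb{Q}$. The Python sketch below represents $a_0 + a_1 X$ as a pair, multiplies using $X \cdot X = rX + s$ from eqn [36], and builds the two orthogonal idempotents attached to the roots. The specific roots 2 and $-1$ (so $r = 1$, $s = 2$), the function names, and the choice to work with rational coefficients are assumptions made for illustration.

```python
from fractions import Fraction


def multiply(a, b, r, s):
    """Product of a0 + a1*X and b0 + b1*X in Q[X]/(X^2 - r*X - s)."""
    a0, a1 = a
    b0, b1 = b
    # X*X is replaced by r*X + s.
    return (a0 * b0 + s * a1 * b1, a0 * b1 + a1 * b0 + r * a1 * b1)


def idempotents(root1, root2):
    """Idempotents (X - root2)/(root1 - root2) and (root1 - X)/(root1 - root2)."""
    d = Fraction(root1 - root2)
    e1 = (Fraction(-root2) / d, Fraction(1) / d)
    e2 = (Fraction(root1) / d, Fraction(-1) / d)
    return e1, e2


root1, root2 = 2, -1                       # assumed simple roots
r, s = root1 + root2, -root1 * root2       # X^2 - r*X - s = (X - root1)(X - root2)
e1, e2 = idempotents(root1, root2)

print(multiply(e1, e1, r, s) == e1)        # True: e1 is idempotent
print(multiply(e1, e2, r, s) == (0, 0))    # True: e1 and e2 are orthogonal
print((e1[0] + e2[0], e1[1] + e2[1]) == (1, 0))  # True: e1 + e2 = 1
```

Each idempotent spans a one-dimensional summand on which $X$ acts by the corresponding root, which is the algebraic counterpart of the decomposition into two one-dimensional TQFTs.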


See also: Floer Homology; Gauge Theory: Mathematical 
Applications; The Jones Polynomial; Knot Theory and 
Physics; Topological Quantum Field Theory: Overview. 
Introduction


As in all other physical theories, one expects that
gravitational phenomena will ultimately be ruled by
quantum mechanics. This requires considering the
quantization of the best available theory of gravity,
namely Einstein’s general relativity. This problem has
been considered since the 1930s (see Loop Quantum
Gravity). The application of the rules of quantum
mechanics to general relativity is immediately problem- 
atic. Unlike other physical interactions, general 
relativity describes gravitational phenomena through a 
distortion of spacetime rather than through a field living 
in spacetime. Therefore, its quantization is bound to be 
very different from that of other physical theories. In 
particular, the well-established framework of perturba- 
tive quantum field theory, used with remarkable success 
in describing electroweak and strong interactions (in the 
latter case at least in certain regimes), runs into trouble 
when applied to general relativity. At present, it is not 
clear if this is a fundamental problem or if there might 
exist an implementation of perturbative quantum field 
theory that works well in the gravitational case. On the 
other hand, there exist examples of field theories where
perturbative methods fail but that nevertheless can be 
quantized. This suggests that the consideration of 
nonperturbative techniques in the quantization of the 
gravitational field could be a promising avenue. 

In particular, canonical quantization methods 
appear attractive for attempting a nonperturbative 
quantization of gravity. Canonical methods force 
the introduction, in a clear way, of a Hilbert space 
of states and definition of the quantum operators of 
interest. The application of canonical methods to 
classical general relativity was pioneered by Dirac 
and Bergmann in the late 1950s. During the 1960s, 
the resulting canonical theories were considered in a 
quantum setting by DeWitt. At the time it appeared 
that making progress in the canonical quantization 
of general relativity was going to be quite a 
challenge. In particular, the canonical theory has 
constraints, which have to be implemented as 
operator identities quantum mechanically. The 
wave functions were functionals of the spatial metric 
of spacetime. One of the operator identities to 
be satisfied implies that the wave functions only 
depend on properties of the spatial metric that 
are invariant under spatial diffeomorphisms. This 
is a direct consequence of general relativity being 
a theory that is independent of coordinate choice 
since a diffeomorphism changes the assignment of 
coordinates to points in the manifold. Finding such 
wave functions already presented a challenge, since 
there is no well-grounded mathematical theory of 
functionals of diffeomorphism-invariant classes of 
metrics. Moreover, the other operator identity to be 
imposed, known as the Hamiltonian constraint or 
Wheeler-DeWitt equation, was a nonpolynomial 
complicated operator equation that does not admit 
a simple geometrical interpretation and needs to be 
regularized. Since one does not have a background 
metric to rely upon, traditional regularization 
techniques of quantum field theory are not suitable 
to deal with the Hamiltonian constraint. 

These difficulties severely hampered development 
of canonical methods for the quantization of general 
relativity for approximately two decades. The 
situation started to change when Ashtekar noticed 
that one could choose a different set of variables 
to describe general relativity canonically. Instead of 
using as variable the spatial metric $q_{ab}$, Ashtekar
chooses to use a set of (densitized) frame fields $E^a_i$.
The relationship between the metric and the
densitized frames is $\det(q)\, q^{ab} = E^a_i E^b_i$, where we are
assuming the Einstein summation convention, that
is, the index $i$ is summed from 1 to 3 (such an index
labels which vector in the triad one is referring to).
The resulting theory has an additional symmetry
with respect to usual general relativity, in the sense
that it is invariant under the choice of frame. This
symmetry operates on the index $i$ as if it were
an SO(3) symmetry. As canonical momenta the
usual choice is to pick the extrinsic curvature of the
3-geometry. Ashtekar chooses a variable related to it
that behaves under frame transformations as an
SO(3) connection, $A_a^i$. The resulting theory is therefore
cast in terms of a canonical pair $(E^a_i, A_a^i)$, with $i$
an SO(3) index. One can therefore consider the
canonical pair as that of a Yang-Mills theory
associated with the SO(3) group. In fact, associated
with the extra symmetry under triad rotations the
theory has a new set of constraints that take
the form of a Gauss law, $D_a E^a_i = 0$, with $D_a$ the
covariant derivative formed with the connection $A_a^i$.
This allows us to view the phase space of a Yang- 
Mills theory as the kinematical arena on which to 
discuss quantum gravity. The theory is of course 
different from the Yang-Mills theory. In particular, 
it still has constraints that imply that it is invariant 
under spacetime diffeomorphisms. In the canonical 
picture, these constraints appear asymmetrically as 
one constraint is associated with time evolution 
(“Hamiltonian constraint”) and a set of three
constraints is associated with spatial diffeomorphisms
(“diffeomorphism constraint”).

If one quantizes the theory starting from the 
Ashtekar formulation, given the resemblance with 
Yang-Mills theory, the natural choice for a representation
of the quantum wave functions is to consider
wave functions of the connection $\Psi[A]$ that are
invariant under SO(3) transformations. Such a representation
is known as the “connection representation.”
There is significant experience in Yang-Mills theory in
constructing such wave functions. In particular, it is 
known that if one considers the parallel transport 
operator defined by a connection around a closed 
curve (holonomy) and one takes its trace (“Wilson 
loop”), the resulting object is invariant under SO(3) 
transformations. What is more important, the set of 
traces of holonomies along all possible closed loops is 
an overcomplete basis for all gauge-invariant func- 
tions. More recently, it has been shown that one can 
construct a less redundant complete basis using 
techniques from spin networks. We will discuss later 
on how to do this. 

Since any gauge-invariant functional can be 
expanded in the basis of Wilson loops, one can 
choose to represent it through the coefficients of 
such an expansion. These coefficients are functions 
of the curve upon which the corresponding element 
of the basis of Wilson loops is based. The
representation of wave functions in terms of such
coefficients is called the “loop representation.” Wave
functions in the loop representation are functions of
a closed curve (more precisely of families of closed
curves, or spin networks, as we will discuss below).

We still have to deal with the diffeomorphism 
and Hamiltonian constraints. The diffeomorphism 
constraint when written in the loop representation 
implies that the wave functions are not functions of 
loops but rather of topologically invariant properties of 
the loops under general diffeomorphisms of the spatial 
manifold containing the loops. Such functions are 
technically known in the mathematical literature as 
“knot invariants.” This is the first point of connection 
between knot invariants and quantum gravity; they 
constitute the kinematical arena of the theory. One still 
has to deal with the Hamiltonian constraint, which has 
to be imposed as an operator equation. We shall see that 
knot theory also seems to have a lot to say about 
solutions of the Hamiltonian constraint. This is quite 
remarkable, since the Hamiltonian constraint embodies 
in detail the specific dynamics of Einstein’s theory of 
gravitation, and to our knowledge this is an input that 
has never gone into the ideas of knot theory. 

In terms of the Ashtekar variables, the Hamiltonian
constraint takes the form

$H = \vec{E}^a \cdot \vec{E}^b \times (\vec{B}^c + \Lambda \vec{E}^c)\, \epsilon_{abc}$ [1]


where we have used a conventional vector notation
for the frame indices and kept explicit the spatial
indices. $\epsilon_{abc}$ is the Levi-Civita totally antisymmetric
tensor. We have included a possible cosmological
constant $\Lambda$. The Ashtekar formulation can be
constructed in different ways. In the original
formulation, the connection $A_a^i$ was a complex
variable and the Hamiltonian took the form we
listed above. However, the resulting theory was only
equivalent to real general relativity if the variables
satisfied certain reality conditions. One can choose
to use a real connection instead, but then the
Hamiltonian constraint has additional terms. At
the moment, we will concentrate on the constraint
as listed above. The constraint has to be implemented
as a quantum operator acting on wave functions.
Since it involves the product of operators, it needs to
be regularized. Most regularization methods are
problematic in this context, since they use a metric,
and here the metric is a quantum operator, not an
external fixed quantity. If we ignore these difficulties,
one observes that, if one were to choose a
quantum state, for instance in the connection
representation, for which

$\Lambda E^a_i \Psi[A] = -B^a_i \Psi[A]$ [2]


the state would be annihilated by the Hamiltonian
constraint, and this would be true no matter what
regularization was chosen. Classically, the condition
$E^a_i \sim B^a_i$ is satisfied for the de Sitter geometry, so one
could envision the state as a quantum state
associated with such geometry. The exact solution
of the above equation is given by a state that is the
exponential of the integral on the spatial slice of the
Chern-Simons form built from the connection


$\Psi_{CS}[A] = \exp\left(k \int d^3x\, \mathrm{tr}\left(A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A\right)\right)$ [3]

and the constant $k$ needs to be chosen as $k = 6/\Lambda$ for
the state to be a solution.

One can ask, “what is the expression of this state
in the loop representation?” To answer this, one
needs to compute the coefficients of its expansion in
the basis of Wilson loops $W_\gamma[A]$, where as we stated
earlier, $\gamma$ should be a collection of (intersecting)
loops (later we will discuss the generalization to spin
networks). The expression for the coefficients will
be a function only of the loops $\gamma$ and is given by

$\Psi_{CS}[\gamma] = \int DA\, W_\gamma[A]\, \Psi_{CS}[A]$ [4]


This expression is invariant under diffeomorphisms
of the manifold or, equivalently, under smooth
deformations of the curve $\gamma$. That is, it is what in the
mathematical literature is called a “knot invariant.” In
fact, this integral has been studied by Witten in the 
context of Chern-Simons theory and has been 
shown to be related to the Kauffman bracket knot 
polynomial, which in turn is related to the cele- 
brated Jones polynomial. Therefore, the implication 
of these results is that the Kauffman bracket knot 
polynomial appears to be the representation in the 
loop representation of a state of quantum gravity 
that solves the quantum Einstein equations (with a 
cosmological constant). The reader may be intrigued
by the word “polynomial” in this context. It should
be noted that the Chern-Simons state $\Psi_{CS}[A]$
depended on a parameter $k$, which had to take a
certain value for it to solve the quantum Einstein
equations. The resulting knot invariant is a polynomial
in $\exp(k)$. If one expands out the result, an
infinite power series in $k$ results. There will be
infinitely many coefficients in the series, but they are just
combinations of the finite number of coefficients of
the polynomial. Knot polynomials are a powerful
tool for analyzing and distinguishing knots. The
coefficients of the polynomials are all knot invariants.
Typically, for “simple” knots, only the first few
coefficients of the knot polynomial are nonzero. As
one considers more complicated knottings, higher
coefficients become nonvanishing. The ultimate goal
of knot theory is to be able to consider two arbitrary
knots and to unambiguously determine if the two
knots are related by a smooth transformation. The
knot polynomials appear as promising tools for
achieving this task, which has remained elusive up
to now.

Returning to quantum gravity, to have a well-known 
knot polynomial as a solution of the quantum Einstein 
equations is a remarkable fact. The first connection we 
outlined between knot theory and quantum gravity was 
less unexpected: if one describes a theory that is 
diffeomorphism invariant in terms of loops, the 
appearance of knots is inevitable. But we are now 
finding that knot invariants from the mathematical 
literature, which were constructed without any knowl- 
edge of the details of the dynamics of the Einstein 
equations, seem to manage to solve such equations. This 
is either a big coincidence or a pointer to some 
unexplained deep connection yet to be understood. 
Notice, for instance, that other theories of gravity would 
not have the Kauffman bracket as a quantum state. 

There is a certain technicality about the Kauffman 
bracket that makes it difficult to argue with precision 
that it is a state of quantum gravity. To understand 
this technicality better, it is perhaps best to concen- 
trate on the form of the quantum state written above 
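if the connection is an abelian connection. In that
case, the integral in question,


$$\Psi^{\rm abelian}_{\rm CS}[\gamma] = \int DA\, \exp\left(i \oint_\gamma dy^a A_a\right) \exp\left(k \int d^3x\, \epsilon^{abc} A_a \partial_b A_c\right) \qquad [5]$$


can be computed by turning it into a Gaussian integral. The result is


$$\Psi^{\rm abelian}_{\rm CS}[\gamma] = \oint_\gamma dx^a \oint_\gamma dy^b\, \epsilon_{abc}\, \frac{(x-y)^c}{|x-y|^3} \qquad [6]$$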


This integral has problems, since the integrand is 
ill-defined when x = y. Notice that the integral
would be well defined if the two contour integrals
were evaluated on different, nonintersecting curves.
The result would be the well-known formula for
the Gauss linking number of the curves, yielding
zero if they are not linked and an integer multiple
of $4\pi$ if they are. So the integral we were trying to
compute was actually the Gauss linking number of 
the curve with itself. Such a quantity is not well 
defined for ordinary curves. To deal with this 
problem, mathematicians introduced the concept of 
framed knots. A framed knot is a curve with a 
prescription to determine a second curve from it. 
One way to see it is to construct another curve that 
is “infinitesimally close” in space to the original
one. It is clear that there is no canonical way to
compute such a second curve. Then, when one 
considers quantities like the self-linking number, 
one makes them well defined by evaluating the two 
integrals on the two curves, the original one and 
the one yielded by the prescription. In reality, the 
notion of framing is a bit more elaborate than what 
we hint at here, since one could consider invariants 
constructed with more than two integrals and could 
still be ill-defined if one only considers two curves. 
The notion has to be extended as well to handle 
intersections in the curves. We will ignore these 
subtleties in this discussion. 
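
The statement that the double contour integral is well defined on two distinct curves can be checked numerically. The following is a minimal sketch, assuming two unit circles discretized as polygons and a simple midpoint rule; the function names and the choice of curves are illustrative and do not come from the article. It evaluates the double integral of equation [6] on two different curves and divides by $4\pi$.

```python
import math

def circle(center, u, v, n=200):
    """Return n points on the unit circle center + cos(t) u + sin(t) v."""
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        pts.append(tuple(center[i] + math.cos(t) * u[i] + math.sin(t) * v[i]
                         for i in range(3)))
    return pts

def gauss_double_integral(curve1, curve2):
    """Midpoint-rule estimate of the double integral of
    eps_abc dx^a dy^b (x - y)^c / |x - y|^3 over two closed polygons."""
    total = 0.0
    n1, n2 = len(curve1), len(curve2)
    for i in range(n1):
        p0, p1 = curve1[i], curve1[(i + 1) % n1]
        dx = [p1[k] - p0[k] for k in range(3)]
        x = [(p1[k] + p0[k]) / 2.0 for k in range(3)]
        for j in range(n2):
            q0, q1 = curve2[j], curve2[(j + 1) % n2]
            dy = [q1[k] - q0[k] for k in range(3)]
            y = [(q1[k] + q0[k]) / 2.0 for k in range(3)]
            r = [x[k] - y[k] for k in range(3)]
            dist3 = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
            # eps_abc dx^a dy^b r^c is the triple product (dx x dy) . r
            cross = (dx[1] * dy[2] - dx[2] * dy[1],
                     dx[2] * dy[0] - dx[0] * dy[2],
                     dx[0] * dy[1] - dx[1] * dy[0])
            total += (cross[0] * r[0] + cross[1] * r[1] + cross[2] * r[2]) / dist3
    return total

# Two circles forming a Hopf link, and a second pair that is far apart.
c1 = circle((0, 0, 0), (1, 0, 0), (0, 1, 0))
c2 = circle((1, 0, 0), (1, 0, 0), (0, 0, 1))
c3 = circle((5, 0, 0), (1, 0, 0), (0, 0, 1))

lk_linked = gauss_double_integral(c1, c2) / (4.0 * math.pi)
lk_unlinked = gauss_double_integral(c1, c3) / (4.0 * math.pi)
print(lk_linked, lk_unlinked)
```

For the linked pair the normalized value should be close to an integer of magnitude one (its sign depends on the orientations chosen), and for the separated pair it should be close to zero. Evaluating the same sum with both arguments equal to the same curve fails at coincident midpoints, which is the ill-definedness that framing is meant to cure.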

The Kauffman bracket knot invariant is an 
invariant of framed knots, just like the self-linking 
number. It is not well defined for a single curve. It 
requires a framing of the knot. In quantum gravity, 
there is no compelling reason to consider framed 
curves. It is true that framed curves arise naturally in 
q-deformed field theories and perhaps a q-deformed 
version of quantum gravity is what needs to be 
considered to accommodate the Chern-Simons state, 
but at the moment there are no proposals along 
these lines that have widespread consensus. 

So, it appears the Kauffman bracket does not have 
a natural role to play as a state of quantum gravity. 
However, it is known that the frame dependence of 
the Kauffman bracket knot polynomial can be 
captured in an overall factor that depends on the 
self-linking number. If one strips the polynomial of 
this factor, one gets the Jones polynomial, which is a 
knot invariant of single curves. Could it be that this 
polynomial has a chance of being a solution of the 
quantum Einstein equations? 

To determine this, the analogy with Chern- 
Simons theory is no longer useful, since there is no 
straightforward way to transform the relation 
between the Kauffman and Jones polynomials into 
relations between states in the connection represen- 
tation. To analyze if the Jones polynomial could be 
a solution of the quantum Einstein equations, one 
needs to write the quantum Einstein equations 
directly in terms of loops. 

There have been several attempts to rewrite the 
quantum Einstein equations directly in the loop 
representation. In one of these attempts, the curvature
that appears in the Hamiltonian constraint was
represented by the “loop derivative.” This is a
differential operator that can be introduced in the 
space of loops by considering that two loops that 
differ by a small element of area are “close.” One 
can build an attractive differential calculus in loop 
space that actually encodes many of the kinematical 
properties that are useful to formulate Yang-Mills 
theory. 


The Hamiltonian constraint in terms of the loop 
derivative is an operator that has an explicit form. 
The coefficients of the Jones polynomial can also be 
given an explicit form by computing perturbatively 
the integral in the Chern-Simons theory. The results 
are generalizations of the types of integrals that arise 
in the self-linking number, but involving a larger 
number of integrals. One can therefore envisage 
carrying out an explicit computation in which one 
checks if the coefficients of the Jones polynomial are 
annihilated or not by the Hamiltonian constraint of 
quantum gravity in the loop representation. Such a 
calculation has been carried out for the first few 
coefficients. It turns out that the second coefficient 
(the first coefficient is normalized to unity, so it 
trivially satisfies the constraint) is indeed annihilated 
by the Hamiltonian constraint of vacuum quantum 
gravity (with zero cosmological constant). It has 
been shown that the third coefficient is not, and 
there are good arguments to indicate that other 
coefficients will not be states of quantum gravity. 

So, a remarkable result has been found in that one 
of the coefficients of the Jones polynomial (related 
to the Arf and Casson invariants) is annihilated by a 
version of the quantum Hamiltonian constraint of 
general relativity. The result is quite nontrivial; it 
requires a fair amount of calculation to actually 
show that the coefficient is annihilated. The mean- 
ing of this quantum state and the deep reason why it 
is annihilated remain at present a mystery. 

The quantum Hamiltonian constraint based on the 
loop derivative makes certain assumptions about the 
space of functions one is using to quantize the theory. 
In quantum field theory, not all classical operators 
have a well-defined quantum counterpart. The choice 
being made is to assume that the curvature $F_{ab}$ is a
well-defined quantum operator defined by the loop 
derivative. Differentiability of knot polynomials is 
not a new idea. It is the core idea of the Vassiliev knot 
invariants, which are defined by a set of identities, 
one of them acting as a “derivative in knot space.” It
can be shown that the loop derivative is a concrete 
implementation of the Vassiliev derivative and, there- 
fore, Vassiliev invariants are the “arena” in which this 
version of quantum gravity takes place. 

The Hamiltonian based on the loop derivative has 
problems, in the sense that it is obtained by a 
regularization procedure that requires extra external 
geometric structures. This is common practice in 
Yang-Mills theory, where one has at hand a fixed 
external background metric. However, in gravity the 
geometry is a dynamical object and, if one con- 
structs expressions that resort to some fixed external 
geometry, one gets inconsistencies. In particular, it is 
expected that the Hamiltonian based on the loop 
derivative will not reproduce the correct Poisson
algebra of canonical general relativity. This sort of 
problem plagued early attempts to construct a 
quantum version of the Hamiltonian constraint in 
the early 1990s. 

A point that we mentioned earlier, but did not
elaborate upon, is that the Wilson loops constitute an
overcomplete basis of states. Therefore, if one takes a
quantum state and expands it on such a basis, one finds
that the coefficients of the expansion satisfy certain
identities, called the Mandelstam identities. These are
nonlinear identities that states in the loop representation
have to satisfy. These identities are very inconvenient
when constructing quantum states. The
identities stem from the fact that if one chooses a 
matrix representation of the group of interest, the fact 
that one is in a given representation is indicated by 
certain identities the matrices satisfy. To break free 
from these constraints, one possibility is to consider 
multiple representations when constructing Wilson 
loops. To do this, one considers piecewise-continuous 
graphs with intersections (the nonintersecting case is 
a trivial subcase). Along the lines connecting the 
intersections one considers holonomies in a given 
representation for a given line. In the case of the group 
SU(2), which is the one of interest in quantum gravity, 
such representations are labeled by a (half-) integer. 
One then considers invariant tensors in the group to 
“tie the holonomies together” at intersections. The
resulting object is a gauge-invariant object for a given
connection based on a “spin network.” The latter
is an embedded piecewise-continuous graph with an 
assignment of integers to each of its lines and an 
assignment of “intertwiners” at each intersection (if 
the intersections are trivalent or lower, one can choose 
canonical intertwiners and forget about them). 
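
As a concrete illustration of the kind of identity involved, traces of SU(2) matrices in the fundamental representation satisfy tr(U) tr(V) = tr(UV) + tr(UV^{-1}), a standard consequence of the Cayley-Hamilton theorem for 2x2 matrices of unit determinant. The sketch below checks this numerically on random SU(2) matrices, which stand in for holonomies around two loops with a common base point. The helper names are illustrative and are not taken from the article.

```python
import random

def random_su2(rng):
    """Random SU(2) matrix [[a, -conj(b)], [b, conj(a)]] with |a|^2 + |b|^2 = 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = sum(v * v for v in x) ** 0.5
    a = complex(x[0], x[1]) / norm
    b = complex(x[2], x[3]) / norm
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def matmul(p, q):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(p):
    """Conjugate transpose, which is the inverse for a unitary matrix."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(p):
    return p[0][0] + p[1][1]

def mandelstam_defect(u, v):
    """|tr(U) tr(V) - tr(UV) - tr(U V^{-1})|, zero when the identity holds."""
    return abs(trace(u) * trace(v)
               - trace(matmul(u, v))
               - trace(matmul(u, dagger(v))))

rng = random.Random(0)
max_defect = max(mandelstam_defect(random_su2(rng), random_su2(rng))
                 for _ in range(100))
print(max_defect)
```

The defect vanishes up to rounding error, so a product of two Wilson loops is not independent of Wilson loops on the composite curves. This is the sense in which the loop basis is overcomplete.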

One can then consider the “spin network representation”
in which one expands gauge-invariant states
in terms of the basis of Wilson nets. Knot polynomials
for these types of graphs have been considered in the
mathematical literature (“polynomials of colored
graphs”). The construction with the Chern-Simons
state can be repeated, and there exist suitable general- 
izations of the Kauffman bracket and Jones polyno- 
mials. The Hamiltonian based on the loop derivative 
can also be introduced in this context; again, its action 
is well defined on suitable generalizations of Vassiliev 
invariants for these kinds of graphs. This opens the 
possibility of encoding the quantum dynamics of 
general relativity as a combinatorial action in the 
space of Vassiliev invariants. 

An alternative Hamiltonian based on assuming that 
the holonomies and the volume operators are well 
defined quantum mechanically (but not the curvature) 
has been introduced that has the advantage of not 
requiring external structures for its regularization. In
fact, it can be explicitly checked that it satisfies the 
correct Poisson algebra without anomalies at the 
quantum level. The exploration of the action of this 
Hamiltonian constraint on knot polynomials has not 
been carried out as systematically as for the one based 
on the loop derivative, but it has been explicitly shown 
that the first coefficient in the expansion of the Jones 
polynomial is annihilated by this Hamiltonian con- 
straint. The first coefficient, written in terms of loops, 
was simply the numeral 1 and was automatically 
annihilated. In terms of spin network states, the first 
coefficient is the “chromatic evaluation” of the network
(the result of computing the Wilson loop on a
connection that is pure gauge). It is somewhat 
nontrivial to show that this quantity is actually 
annihilated by the Hamiltonian constraint in question. 
At the moment, the issue of what the correct 
Hamiltonian constraint is that describes a realistic 
and physically correct theory of quantum gravity is 
still open to debate. There are certain concerns that 
the action of the operators considered up to now is 
too simple to encompass the true dynamics of 
general relativity. Constructing a semiclassical the- 
ory that could confirm or deny the viability of the 
proposals is a complicated task, since one has to 
make contact with physics that is not diffeomor- 
phism invariant in the context of a theory that is. 
Moreover, in canonical quantum gravity, there 
exists the “problem of time.” Since the Hamiltonian 
vanishes, the dynamics implied by it is trivial, and 
one has to disentangle the true dynamics by 
relational constructions among the variables of the 
theory. One then needs to compare the resulting 
predictions with classical general relativity. 
Whether the current proposals are viable and
whether knot theory will play a role at a “kinematical
level” or will actually play a key role in the detailed
dynamics of quantum general relativity is yet to be
seen. It is reassuring that in partial constructions,
celebrated knot polynomials have appeared to have
some knowledge of the dynamics of the Einstein
equations.

Quantum gravity being an unfinished symphony,
we cannot entirely conclude how great an impact
knot theory will have on it in the end. One can only
note that beautiful mathematical results seem to tie
in naturally with the partial constructions that have
been carried out thus far.


Introduction


This article is an introduction to some of the relationships
between knot theory and theoretical physics.
Knots themselves are macroscopic physical phenomena
in three-dimensional space, occurring in rope, vines,
telephone cords, polymer chains, DNA, certain species
of eel, and many other places in the natural and man-made
world. The study of topological invariants of
knots leads to relationships with statistical mechanics
and quantum physics. This is a remarkable and deep
situation where the study of certain (topological)
aspects of the macroscopic world is entwined with
theories developed for the subtleties of the microscopic 
world. The present article is an introduction to the 
mathematical side of these connections, with some 
hints and references to the related physics. 

We begin with a short introduction to knots, 
links, braids, and the bracket polynomial invariant 
of knots and links. The article then discusses 
Vassiliev invariants of knots and links, and how 
these invariants are naturally related to Lie algebras 
and to Witten’s gauge-theoretic approach. This part
of the article is an introduction to how Vassiliev
invariants in knot theory arise naturally in the 
context of Witten's functional integral. 

The article is divided into several sections beyond 
the introduction. Section two is a quick introduction 
to the topology of knots and links. The third one 
discusses Vassiliev invariants and invariants of rigid 
vertex graphs. The fourth section introduces the 
basic formalism and shows how Witten's functional 
integral is related directly to Vassiliev invariants. 
The fifth section discusses the loop transform and 
loop quantum gravity in this context. The final 
section is an introduction to topological quantum 
field theory, and to the use of these techniques in 
producing unitary representations of the braid 
group, a topic of intense interest in quantum 
information theory. 


Knots, Braids, and Bracket Polynomial 


The purpose of this section is to give a quick 
introduction to the diagrammatic theory of knots, 
links, and braids. A knot is an embedding of a circle in 
three-dimensional space, taken up to ambient isotopy. 
That is, two knots are regarded as equivalent if one 
embedding can be obtained from the other through a 
continuous family of embeddings of circles in 3-space. 
A link is an embedding of a disjoint collection of 
circles, taken up to ambient isotopy. Figure 1 illus- 
trates a diagram for a knot. The diagram is regarded 
both as a schematic picture of the knot, and as a plane 
graph with extra structure at the nodes (indicating 
how the curve of the knot passes over or under itself 
by standard pictorial conventions). 

Ambient isotopy is mathematically the same as 
the equivalence relation generated on diagrams by 
the Reidemeister moves. These moves are illustrated 
in Figure 2. Each move is performed on a local part 
of the diagram that is topologically identical to the 
part of the diagram illustrated in this figure (these 
figures are representative examples of the types of 
Reidemeister moves) without changing the rest of 
the diagram. The Reidemeister moves are useful in
doing combinatorial topology with knots and links,
notably in working out the behavior of knot
invariants. A knot invariant is a function defined
from knots and links to some other mathematical
object (such as groups or polynomials or numbers)
such that equivalent diagrams are mapped to
equivalent objects (isomorphic groups, identical
polynomials, identical numbers).

Figure 1 A knot diagram.

Figure 2 The Reidemeister moves.

Another significant structure related to knots and 
links is the Artin braid group. A braid is an 
embedding of a collection of strands that have 
their ends in two rows of points that are set one 
above the other with respect to a choice of vertical. 
The strands are not individually knotted and they 
are disjoint from one another. See Figures 3-5 for 
illustrations of braids and moves on braids. Braids 
can be multiplied by attaching the bottom row of 
one braid to the top row of the other braid. Taken 
up to ambient isotopy, fixing the endpoints, the 
braids form a group under this notion of multi- 
plication. In Figure 3 we illustrate the form of the 
basic generators of the braid group, and the form of 
the relations among these generators. Figure 4 
illustrates how to close a braid by attaching the 
top strands to the bottom strands by a collection of 
parallel arcs. A key theorem of Alexander states that 
every knot or link can be represented as a closed 
braid. Thus, the theory of braids is critical to the
theory of knots and links. Figure 5 illustrates the
famous Borromean rings (a link of three unknotted
loops such that any two of the loops are unlinked)
as the closure of a braid.

Figure 3 Braid generators.

Figure 4 Closing braids to form knots and links.
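
The closure operation can be made concrete with a small sketch. A braid word is encoded here as a list of nonzero integers, with +i for the generator sigma_i and -i for its inverse; this encoding and the function names are assumptions made for illustration. The permutation induced by the braid determines how many components its closure has, although it does not determine the knot or link type. The word (sigma_1 sigma_2^{-1})^3 is a standard braid presentation of the Borromean rings; whether it matches the braid drawn in Figure 5 exactly is an assumption.

```python
def braid_permutation(word, n):
    """Permutation of the n strand positions induced by a braid word.

    Each generator sigma_i or its inverse swaps the strands in positions
    i and i+1 (1-based), so over/under information is ignored here.
    """
    perm = list(range(n))
    for g in word:
        i = abs(g) - 1
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
    return perm

def closure_components(word, n):
    """Number of components of the closed braid = number of cycles of the permutation."""
    perm = braid_permutation(word, n)
    seen = [False] * n
    count = 0
    for start in range(n):
        s = start
        if not seen[s]:
            count += 1
            while not seen[s]:
                seen[s] = True
                s = perm[s]
    return count

trefoil = [1, 1, 1]            # sigma_1^3 on two strands
hopf = [1, 1]                  # sigma_1^2 on two strands
borromean = [1, -2] * 3        # (sigma_1 sigma_2^{-1})^3 on three strands

print(closure_components(trefoil, 2),
      closure_components(hopf, 2),
      closure_components(borromean, 3))
```

The three closures have one, two, and three components respectively, consistent with a knot, a two-component link, and the three loops of the Borromean rings.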

We now discuss a significant example of an 
invariant of knots and links, the bracket polynomial. 
The bracket polynomial can be normalized to 
produce an invariant of all the Reidemeister moves. 
This normalized invariant is known as the Jones 
(1985) polynomial. The Jones polynomial was 
originally discovered by a different method than 
the one given here. 

The bracket polynomial, $\langle K \rangle = \langle K \rangle(A)$, assigns to
each unoriented link diagram K a Laurent polynomial
in the variable A, such that


1. If K and K' are regularly isotopic diagrams (that is,
related by Reidemeister moves II and III only), then
$\langle K \rangle = \langle K' \rangle$.

2. If $K \sqcup O$ denotes the disjoint union of K with an
extra unknotted and unlinked component O (also
called “loop” or “simple closed curve” or
“Jordan curve”), then

$$\langle K \sqcup O \rangle = \delta \langle K \rangle$$

where

$$\delta = -A^{2} - A^{-2}$$

3. $\langle K \rangle$ satisfies the following formulas:

$$\langle \chi \rangle = A \langle \asymp \rangle + A^{-1} \langle )( \rangle$$

$$\langle \bar{\chi} \rangle = A^{-1} \langle \asymp \rangle + A \langle )( \rangle$$

where the small diagrams represent parts of
larger diagrams that are identical except at the
site indicated in the bracket. We take the
convention that the letter chi, $\chi$, denotes a
crossing where the curved line is crossing over
the straight segment. The barred letter $\bar{\chi}$ denotes
the switch of this crossing, where the curved line
is undercrossing the straight segment.


Figure 5 Borromean rings as a braid closure.


In computing the bracket, one finds the following 
behavior under Reidemeister move I: 


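$$\langle \gamma \rangle = -A^{3} \langle \smile \rangle$$
and
$$\langle \bar{\gamma} \rangle = -A^{-3} \langle \smile \rangle$$


where $\gamma$ denotes a curl of positive type as indicated
in Figure 6, and $\bar{\gamma}$ indicates a curl of negative type,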
as also seen in this figure. The type of a curl is the 
sign of the crossing when we orient it locally. Our 
convention of signs is also given in Figure 6. Note 
that the type of a curl does not depend on the 
orientation we choose. The small arcs on the right- 
hand side of these formulas indicate the removal of 
the curl from the corresponding diagram. 

The bracket is invariant under regular isotopy and 
can be normalized to an invariant of ambient 
isotopy by the definition 


$$f_K(A) = (-A^{3})^{-w(K)} \langle K \rangle(A)$$


where we chose an orientation for K, and where
w(K) is the sum of the crossing signs of the oriented
link K. w(K) is called the writhe of K. The
convention for crossing signs is shown in Figure 6.


The State Summation 


In order to obtain a closed formula for the bracket, 
we now describe it as a state summation. Let K be 
any unoriented link diagram. Define a state, S, of K
to be a choice of smoothing for each crossing of K.
There are two choices for smoothing a given
crossing, and thus there are $2^N$ states of a diagram
with N crossings. In a state we label each smoothing
with A or $A^{-1}$ as in the expansion formula for the
bracket. The label is called a vertex weight of the
state. There are two evaluations related to a state.
The first one is the product of the vertex weights,
denoted $\langle K|S \rangle$. The second evaluation is the number
of loops in the state S, denoted $\|S\|$.

Figure 6 Crossing signs and curls.

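Define the state summation, $\langle K \rangle$, by the formula


$$\langle K \rangle = \sum_S \langle K|S \rangle\, \delta^{\|S\|-1}$$

with $\delta = -A^{2} - A^{-2}$ as before. It follows from this
definition that $\langle K \rangle$ satisfies the equations

$$\langle \chi \rangle = A \langle \asymp \rangle + A^{-1} \langle )( \rangle$$

$$\langle K \sqcup O \rangle = \delta \langle K \rangle$$

$$\langle O \rangle = 1$$


The first equation expresses the fact that the entire
set of states of a given diagram is the union, with
respect to a given crossing, of those states with an
A-type smoothing and those with an $A^{-1}$-type
smoothing at that crossing. The second and the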
third equation are clear from the formula defining 
the state summation. Hence, this state summation 
produces the bracket polynomial as we have 
described it at the beginning of the section. 
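
The state summation translates directly into a short program. The following is a minimal sketch under stated assumptions: a diagram is given as a planar diagram (PD) code, each crossing being a 4-tuple of edge labels read counterclockwise starting from the incoming under-strand, and the pairing (i, j), (k, l) is taken to be the A-smoothing. Both the encoding and that choice are conventions assumed here, not taken from the article; the opposite choice yields the bracket of the mirror image. Laurent polynomials in A are stored as dictionaries mapping exponents to integer coefficients.

```python
from itertools import product

DELTA = {2: -1, -2: -1}  # delta = -A^2 - A^-2

def poly_add(p, q):
    """Sum of two Laurent polynomials, dropping zero coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def poly_mul(p, q):
    """Product of two Laurent polynomials, dropping zero coefficients."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def count_loops(pairs):
    """Number of closed loops obtained by joining edge labels pairwise (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    return len({find(x) for x in list(parent)})

def bracket(pd):
    """Bracket polynomial as the sum over all 2^N states of <K|S> delta^(||S|| - 1)."""
    total = {}
    for state in product((+1, -1), repeat=len(pd)):
        pairs = []
        for (i, j, k, l), s in zip(pd, state):
            # +1: A-smoothing (assumed pairing); -1: A^-1-smoothing
            pairs += [(i, j), (k, l)] if s == +1 else [(i, l), (j, k)]
        term = {sum(state): 1}  # product of vertex weights A^(#A - #A^-1)
        for _ in range(count_loops(pairs) - 1):
            term = poly_mul(term, DELTA)
        total = poly_add(total, term)
    return total

# A commonly tabulated PD code for a trefoil diagram (three crossings, six edges).
TREFOIL = [(1, 4, 2, 5), (3, 6, 4, 1), (5, 2, 6, 3)]
# A one-crossing diagram of the unknot (a single curl).
CURL = [(1, 1, 2, 2)]

print(bracket(TREFOIL))
print(bracket(CURL))
```

With these conventions the trefoil diagram evaluates to A^7 - A^3 - A^{-5}, and the curl evaluates to -A^3, in agreement with the behavior of the bracket under Reidemeister move I. The cost grows as 2^N in the number of crossings, so the sketch is only practical for small diagrams.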


Remark By a change of variables one obtains the
original Jones polynomial, $V_K(t)$, for oriented knots
and links from the normalized bracket:


$$V_K(t) = f_K(t^{-1/4})$$
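
As a worked instance of the normalization and of this change of variables, the sketch below starts from the bracket A^7 - A^3 - A^{-5} of a trefoil diagram with writhe -3 (the pairing of this bracket with that writhe is an assumption about the diagram and sign conventions used) and produces the normalized bracket and the Jones polynomial. Polynomials are stored as dictionaries from exponents to coefficients; the function names are illustrative.

```python
from fractions import Fraction

def normalize_bracket(bracket, writhe):
    """f_K(A) = (-A^3)^(-w(K)) <K>(A), with polynomials as {exponent: coeff}."""
    sign = -1 if writhe % 2 else 1          # (-1)^(-w) = (-1)^w
    return {e - 3 * writhe: sign * c for e, c in bracket.items()}

def jones_from_normalized(f):
    """Substitute A = t^(-1/4): each A^e becomes t^(-e/4)."""
    return {Fraction(-e, 4): c for e, c in f.items()}

trefoil_bracket = {7: 1, 3: -1, -5: -1}
f_trefoil = normalize_bracket(trefoil_bracket, writhe=-3)
jones_trefoil = jones_from_normalized(f_trefoil)

print(f_trefoil)       # exponents of A
print(jones_trefoil)   # exponents of t
```

The result is f_K(A) = A^4 + A^12 - A^16 and V_K(t) = t^{-1} + t^{-3} - t^{-4}. Replacing the diagram by its mirror image changes the sign of the writhe and inverts A, which gives t + t^3 - t^4 instead.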


Remark The bracket polynomial provides a con- 
nection between knot theory and physics, in that the 
state summation expression for it exhibits it as a 
generalized partition function defined on the knot 
diagram. Partition functions are ubiquitous in 
statistical mechanics, where they express the sum- 
mation over all states of the physical system of 
probability weighting functions for the individual 
states. Such physical partition functions contain 
large amounts of information about the correspond- 
ing physical system. Some of this information is 
directly present in the properties of the function, 
such as the location of critical points and phase 
transition. Some of the information can be obtained 
by differentiating the partition function, or perform- 
ing other mathematical operations on it. 


In fact, by defining a generalization of the bracket 
polynomial, defined on knot diagrams but not 
invariant under the Reidemeister moves, we can 
capture significant partition functions that are 
physically meaningful. There is no room in this 
survey to detail how this generalization can be used 
to express the Potts model for planar graphical 
configurations, and how it expresses the relationship 
between the Potts model and the Temperley-Lieb
algebra in diagrammatic form. There is much more
in this connection with statistical mechanics in that
the local weights in a partition function are often
expressed in terms of solutions to a matrix equation
called the Yang-Baxter equation, which turns out to
fit perfectly with invariance under the third Reidemeister
move. As a result, there are many ways to define
partition functions of knot diagrams that give rise to 
invariants of knots and links. The subject is 
intertwined with the algebraic structure of Hopf 
algebras and quantum groups, useful for producing 
systematic solutions to the Yang-Baxter equation. 
In fact, Hopf algebras are deeply connected with 
the problem of constructing invariants of three- 
dimensional manifolds in relation to invariants of 
knots. We have chosen, in this survey article, not to 
discuss the details of these approaches, but rather to 
proceed to Vassiliev invariants and the relationships 
with Witten's functional integral. The reader is 
referred to Kauffman (1987, 1994, 2002), Jones 
(1985), and Reshetikhin and Turaev (1991) for 
more information about relationships of knot theory 
with statistical mechanics, Hopf algebras, and 
quantum groups. For topology, the key point is 
that Lie algebras can be used to construct invariants 
of knots and links. This is shown nowhere more 
clearly than in the theory of Vassiliev invariants that 
we take up in the next section. 


Vassiliev Invariants and Invariants 
of Rigid Vertex Graphs 


In this section we study the combinatorial topology 
of Vassiliev invariants. As we shall see, by the end of 
this section, Vassiliev invariants are directly con- 
nected with Lie algebras, and representations of Lie 
algebras can be used to construct them. This aspect 
of link invariants is one of the most fundamental for 
connections with physics. Just as symmetry con- 
siderations in physics lead to a fundamental rela- 
tionship with Lie algebras, topological invariance 
leads to a fundamental relationship of the theory of 
knots and links with Lie algebras. 

If V(K) is a (Laurent polynomial valued or, more 
generally, commutative ring valued) invariant of 
knots, then it can be naturally extended to an 
invariant of rigid vertex graphs by defining the 
invariant of graphs in terms of the knot invariant via 
an “unfolding of the vertex." That is, we can regard 
the vertex as a “black box" and replace it by any 
tangle of our choice. Rigid vertex motions of the 
graph preserve the contents of the black box, and 
hence implicate ambient isotopies of the link 
obtained by replacing the black box by its contents. 
Invariants of knots and links that are evaluated on
these replacements are then automatically rigid vertex 
invariants of the corresponding graphs. If we set up a 
collection of multiple replacements at the vertices 
with standard conventions for the insertions of the 
tangles, then a summation over all possible replace- 
ments can lead to a graph invariant with new 
coefficients corresponding to the different replace- 
ments. In this way, each invariant of knots and links 
implicates a large collection of graph invariants. 

The simplest tangle replacements for a 4-valent 
vertex are the two crossings, positive and negative, and 
the oriented smoothing. Let V(K) be any invariant of 
knots and links. Extend V to the category of rigid 
vertex embeddings of 4-valent graphs by the formula 


$$V(K_*) = aV(K_+) + bV(K_-) + cV(K_0)$$


where $K_+$ denotes a knot diagram K with a specific
choice of positive crossing, $K_-$ denotes a diagram
identical to the first with the positive crossing
replaced by a negative crossing, $K_0$ denotes a diagram
identical to the first with the positive crossing
replaced by the oriented smoothing, and $K_*$ denotes a
diagram identical to the first with the positive
crossing replaced by a graphical node.

There is a rich class of graph invariants that can 
be studied in this manner. The Vassiliev invariants 
(Bar-Natan 1995) constitute the important special 
case of these graph invariants where a = +1, b = -1,
and c = 0. Thus, V(G) is a Vassiliev invariant if


$$V(K_*) = V(K_+) - V(K_-)$$


Call this formula the exchange identity for the 
Vassiliev invariant V. See Figure 7. 
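
For a graph with two nodes, applying the exchange identity once at each node gives a signed sum over the four ways of resolving the nodes into crossings. The double subscripts below are notation introduced here only for illustration; they record the resolution chosen at the first and at the second node.

```latex
\begin{aligned}
V(K_{**}) &= V(K_{+*}) - V(K_{-*}) \\
          &= V(K_{++}) - V(K_{+-}) - V(K_{-+}) + V(K_{--})
\end{aligned}
```

In the same way, a graph with n nodes expands into an alternating sum of $2^n$ values of V on knots or links, with sign given by the parity of the number of negative resolutions.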

V is said to be of finite type k if V(G) = 0
whenever |G| > k, where |G| denotes the number of 
(4-valent) nodes in the graph G. The notion of finite 
type is of extraordinary significance in studying 
these invariants. One reason for this is the following 
basic lemma. 


Lemma If a graph G has exactly k nodes, then the
value of a Vassiliev invariant $v_k$ of type k on G,
$v_k(G)$, is independent of the embedding of G.


Proof Omitted. □
Figure 7 Exchange identity for Vassiliev invariants.


Figure 8 Chord diagrams. 


The upshot of this lemma is that Vassiliev 
invariants of type k are intimately involved with 
certain abstract evaluations of graphs with k nodes. 
In fact, there are restrictions (the four-term relations) 
on these evaluations demanded by the topology and 
it follows from results of Kontsevich (see Bar-Natan
(1995)) that such abstract evaluations actually determine
the invariants. The knot invariants derived from
classical Lie algebras are all built from Vassiliev 
invariants of finite type. All of this is directly related 
to Witten’s functional integral (Witten 1989). 

In the next few figures we illustrate some of these 
main points. In Figure 8 we show how one 
associates a so-called chord diagram to represent 
the abstract graph associated with an embedded 
graph. The chord diagram is a circle with arcs 
connecting those points on the circle that are welded 
to form the corresponding graph. In Figure 9 we 
illustrate how the four-term relation is a conse- 
quence of topological invariance. 

Figure 9 The four-term relation from topology.

Figure 10 The four-term relation from categorical Lie algebra.

In Figure 10 we show how the four-term relation is a
consequence of the abstract pattern of the commutator
identity for a matrix Lie algebra. That is, we show how
a diagrammatic version of the formula


$$T^a T^b - T^b T^a = f^{ab}_{c}\, T^c$$


fits directly with the four-term relation. The formula
we have quoted here states that the commutator of
the matrices $T^a$ and $T^b$ is equal to a sum of the
matrices $T^c$ with coefficients (the structure coefficients
of the Lie algebra) $f^{ab}_{c}$. Such a relation is the
most concrete way to define a matrix Lie algebra.
There are other levels of abstraction that can be
employed here. The same diagrammatics can be
interpreted directly in terms of the Jacobi identity
that defines a Lie algebra. We shall content 
ourselves with this matrix point of view here, and 
add that it is assumed here that the structure 
coefficients are invariant under cyclic permutation, 
an assumption that is not needed in the general case. 
The four-term relation is directly related to a 
categorical generalization of Lie algebras. 

Figure 11 Calculating Lie algebra weights.

Figure 11 illustrates how the weights are assigned
to the chord diagrams in the Lie algebra case: by
inserting Lie algebra matrices into the circle and
taking a trace of a sum of matrix products. The
relationship between Vassiliev invariants and Lie
algebras has been known since Bar-Natan’s thesis
(see also Kauffman (1995)). In Bar-Natan (1995) the
reader will find a good account of Kontsevich’s 
theorem, showing how Lie algebra weight systems, 
and in fact any weight system satisfying the four- 
term relation, can be used to construct knot 
invariants. Conceptually, the ideas behind the 
Kontsevich theorem are directly related to Witten’s 
approach to knot invariants via quantum field 
theory. We give an exposition of this approach in 
the next section of this article. 
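
The commutator relation and the trace prescription for chord diagrams can be tried out on a small example. The sketch below assumes the Lie algebra so(3) in its three-dimensional defining representation, with matrices (T^a)_{bc} = -epsilon_{abc} and structure coefficients epsilon_{abc}; this choice, the encoding of a chord diagram as a word in which each letter appears twice, and the normalization of the weights are all assumptions made for illustration and are not taken from Figure 11.

```python
from itertools import product

N = 3

def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

# Generators of so(3): (T^a)_{bc} = -eps_{abc}
T = [[[-eps(a, b, c) for c in range(N)] for b in range(N)] for a in range(N)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def trace(p):
    return sum(p[i][i] for i in range(N))

def commutator_holds():
    """Check T^a T^b - T^b T^a = sum_c eps_{abc} T^c for all a, b."""
    for a, b in product(range(N), repeat=2):
        ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
        for i, j in product(range(N), repeat=2):
            rhs = sum(eps(a, b, c) * T[c][i][j] for c in range(N))
            if ab[i][j] - ba[i][j] != rhs:
                return False
    return True

def chord_weight(word):
    """Weight of a chord diagram written as a word such as 'aabb' or 'abab'.

    Each letter labels one chord and appears twice, in the order its two
    endpoints are met going around the circle. The weight is the trace of
    the corresponding matrix product, summed over all index assignments.
    """
    letters = sorted(set(word))
    total = 0
    for values in product(range(N), repeat=len(letters)):
        index = dict(zip(letters, values))
        m = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
        for ch in word:
            m = matmul(m, T[index[ch]])
        total += trace(m)
    return total

print(commutator_holds())
print(chord_weight("aa"), chord_weight("aabb"), chord_weight("abab"))
```

For this representation the single chord has weight -6, the diagram with two parallel chords has weight 12, and the diagram with two crossed chords has weight 6. The difference between the last two is produced by the commutator term when adjacent matrices are exchanged, which is the mechanism behind the four-term relation for Lie algebra weights.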


Example Let $P_K(t) = f_K(e^t)$ (that is, $A = e^t$), where $f_K(A)$ is
the normalized bracket polynomial invariant discussed
in the last section. Then $P_K(t)$ is expressed as
a power series in t with coefficients $v_n(K)$,
n = 0, 1, 2, ..., that are invariants of the knot or
link K. It is not hard to show that these coefficient 
invariants (extended to graphs so that the Vassiliev 
exchange identity is satisfied) are Vassiliev invar- 
iants of finite type. In fact, most of the so-called 
polynomial invariants of knots and links (relatives 
of the bracket and Jones polynomials) give rise to 
Vassiliev invariants in just this way. Thus, Vassiliev 
invariants of finite type are ubiquitous in this area 
of knot theory. One can think of Vassiliev
invariants as building blocks for the other invariants
or, equivalently, of these other invariants as sources of
Vassiliev invariants.
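
The substitution $A = e^t$ in the Example can be carried out explicitly for a Laurent polynomial: if $f(A) = \sum_k c_k A^k$, the coefficient of $t^n$ in $f(e^t)$ is $\sum_k c_k k^n / n!$. The sketch below assumes, as an illustration not taken from the article, the normalized bracket of one chirality of the trefoil, $f_K(A) = A^{-4} + A^{-12} - A^{-16}$; the mirror image has the signs of all exponents reversed.

```python
from fractions import Fraction
from math import factorial

# Assumption: normalized bracket polynomial of one trefoil chirality,
# f_K(A) = A^-4 + A^-12 - A^-16, stored as {exponent: coefficient}.
TREFOIL = {-4: 1, -12: 1, -16: -1}

def series_coefficient(laurent, n):
    """Coefficient of t**n in f(e**t), where f(A) = sum_k c_k * A**k."""
    return sum(Fraction(c * k ** n, factorial(n)) for k, c in laurent.items())

coefficients = [series_coefficient(TREFOIL, n) for n in range(4)]
print(coefficients)
```

Under this assumption the constant term is 1 and the linear term vanishes, in line with the absence of a nontrivial first-order invariant for knots; the first nonzero coefficient after the constant term appears at $n = 2$.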


Vassiliev Invariants and Witten’s 
Functional Integral 


Edward Witten (1989) proposed a formulation 
of a class of 3-manifold invariants as generalized 
Feynman integrals taking the form Z(M), where 


\[ Z(M) = \int DA\, e^{(ik/4\pi) S(M, A)} \]


Here M denotes a 3-manifold without boundary and 
$A$ is a gauge field (also called a gauge potential or
gauge connection) defined on M. The gauge field is a 
1-form on a trivial G-bundle over M with values in a 
representation of the Lie algebra of G. The group G 
corresponding to this Lie algebra is said to be the 
gauge group. In this integral, the action S(M, A) is 
taken to be the integral over M of the trace of the 
Chern-Simons 3-form $A \wedge dA + (2/3)\, A \wedge A \wedge A$.
(The product is the wedge product of differential 
forms.) Z(M) integrates over all gauge fields modulo 
gauge equivalence. 

The formalism and internal logic of Witten's
integral support the existence of a large class of
topological invariants of 3-manifolds and associated
invariants of knots and links in these manifolds.
The invariants associated with this integral have
been given rigorous combinatorial descriptions but 
questions and conjectures arising from the integral 
formulation are still outstanding. Specific conjec- 
tures about this integral take the form of just how it 
implicates invariants of links and 3-manifolds, and 
how these invariants behave in certain limits of the 
coupling constant k in the integral. Many conjec- 
tures of this sort can be verified through the 
combinatorial models. On the other hand, the really 
outstanding conjecture about the integral is that it 
exists! At the present time there is no measure 
theory or generalization of measure theory that 
supports it in full generality. Here is a formal 
structure of great beauty. It is also a structure 
whose consequences can be verified by a remarkable 
variety of alternative means. 

The formalism of the Witten integral implicates 
invariants of knots and links corresponding to each 
classical Lie algebra. In order to see this, we need to 
introduce the Wilson loop. The Wilson loop is an 
exponentiated version of integrating the gauge field 
along a loop K in three space that we take to be an 
embedding (knot) or a curve with transversal self- 
intersections. For this discussion, the Wilson loop
will be denoted by


\[ W_K(A) \]


to indicate the dependence on the loop $K$ and the
field $A$. It is usually written as
$\mathrm{tr}(P e^{\oint_K A})$. Thus,

\[ W_K(A) = \mathrm{tr}\left(P e^{\oint_K A}\right) \]


Here the P denotes path ordered integration — we 
are integrating and exponentiating matrix valued 
functions, and so must keep track of the order of the 
operations. The symbol tr denotes the trace of the 
resulting matrix. This Wilson loop integration exists 
by normal means and does not require functional 
integration. 

With the help of the Wilson loop functional on 
knots and links, Witten writes down a functional 
integral for link invariants in a 3-manifold M: 


\[ Z(M, K) = \int DA\, e^{(ik/4\pi) S(M, A)}\, \mathrm{tr}\left(P e^{\oint_K A}\right) \]


Here S(M, A) is the Chern-Simons Lagrangian, as in 
the previous discussion. We abbreviate S(M, A) as S 
and write $W_K(A)$ for the Wilson loop. Unless
otherwise mentioned, the manifold $M$ will be the
three-dimensional sphere $S^3$.

An analysis of the formalism of this functional 
integral reveals quite a bit about its role in knot 
theory. One can determine how the Witten integral 
behaves under a small deformation of the loop K. 


Theorem 


(i) Let $Z(K) = Z(S^3, K)$ and let $\delta Z(K)$ denote the
change of $Z(K)$ under an infinitesimal change in
the loop $K$. Then


\[ \delta Z(K) = (4\pi i/k) \int dA\, e^{(ik/4\pi) S}\, [\mathrm{Vol}]\, T_a T_a W_K(A) \]


where $\mathrm{Vol} = \epsilon_{rst}\, dx_r\, dx_s\, dx_t$.

The sum is taken over repeated indices, and
the insertion is taken of the matrices $T_a T_a$ at the
chosen point $x$ on the loop $K$ that is regarded
as the center of the deformation. The volume
element $\mathrm{Vol} = \epsilon_{rst}\, dx_r\, dx_s\, dx_t$ is taken with regard
to the infinitesimal directions of the loop
deformation from this point on the original
loop.

(ii) The same formula applies, with a different 
interpretation, to the case where x is a double 
point of transversal self-intersection of a loop K, 
and the deformation consists in shifting one of 
the crossing segments perpendicularly to the
plane of intersection so that the self-intersection
point disappears. In this case, one $T_a$ is inserted
into each of the transversal crossing segments so
that $T_a T_a W_K(A)$ denotes a Wilson loop with
a self-intersection at $x$ and insertions of $T_a$ at
$x + \epsilon_1$ and $x + \epsilon_2$, where $\epsilon_1$ and $\epsilon_2$ denote small
displacements along the two arcs of $K$ that
intersect at $x$. In this case, the volume form is
nonzero, with two directions coming from the 
plane of movement of one arc, and the perpen- 
dicular direction is the direction of the other arc. 


Remark One shows that the result of a topological 
variation has an analytic expression that is zero if 
the topological variation does not create a local 
volume. Thus, we have shown that the integral of 
$e^{(ik/4\pi) S}\, W_K(A)$ is topologically invariant as long as
the curve K is moved by the local equivalent of 
regular isotopy. 


In the case of switching a crossing, the key point is
to write the crossing switch as a composition of first 
moving a segment to obtain a transversal intersec- 
tion of the diagram with itself, and then to continue 
the motion to complete the switch. Up to the choice 
of our conventions for constants, the switching 
formula is, as shown below (see Figure 12). 


Figure 12 The difference formula. 
where $K_{**}$ denotes the result of replacing the
crossing by a self-touching crossing. We distinguish 
this from adding a graphical node at this crossing by 
using the double-star notation. 

A key point is to notice that the Lie algebra 
insertion for this difference is exactly what is done 
(in chord diagrams) to make the weight systems for 
Vassiliev invariants (without the framing compensa- 
tion). Thus, the formalism of the Witten functional 
integral takes one directly to these weight systems in 
the case of the classical Lie algebras. In this way, the 
functional integral is central to the structure of the 
Vassiliev invariants. 


The Loop Transform and Quantum 
Gravity 


Suppose that $\psi(A)$ is a (complex-valued) function
defined on gauge fields. Then we define formally the
loop transform $\hat{\psi}(K)$, a function on embedded loops
in three-dimensional space, by the formula


\[ \hat{\psi}(K) = \int DA\, \psi(A)\, W_K(A) \]


If $\Delta$ is a differential operator defined on $\psi(A)$, then
we can use this integral transform to shift the effect
of $\Delta$ to an operator on loops via integration by
parts:


\[ \widehat{\Delta\psi}(K) = \int DA\, \Delta\psi(A)\, W_K(A) = -\int DA\, \psi(A)\, \Delta W_K(A) \]


When $\Delta$ is applied to the Wilson loop, the result can
be an understandable geometric or topological 
operation. One can illustrate this situation with 
operators G and H: 

mu ^d l a 


H = —e,Fi6/6A;6/6A! 
with summation over the repeated indices. Each of
these operators has the property that its action on 
the Wilson loop has a geometric or topological 
interpretation. One has 


\[ \widehat{G\psi}(K) = \delta\hat{\psi}(K) \]


where this variation refers to the effect of varying $K$.
As we saw in the previous section, this means that if
$\hat{\psi}(K)$ is a topological invariant of knots and links,
then $\widehat{G\psi}(K) = 0$ for all embedded loops $K$. This
condition is a transform analog of the equation
$G\psi(A) = 0$. This equation is the differential analog
of an invariant of knots and links. It may happen
that $\delta\hat{\psi}(K)$ is not strictly zero, as in the case of our
framed knot invariants. For example, with


\[ \psi(A) = \exp\left( (ik/4\pi) \int \mathrm{tr}\left(A \wedge dA + (2/3)\, A \wedge A \wedge A\right) \right) \]


we conclude that $\widehat{G\psi}(K)$ is zero for flat deformations
(in the sense of the previous section) of the loop $K$,
but can be nonzero in the presence of a twist or curl.
In this sense, the loop transform provides a subtle
variation on the strict condition $G\psi(A) = 0$.

In Ashtekar et al. (1992) and other publications by 
Ashtekar, Rovelli, Smolin, and their colleagues, the 
loop transform is used to study a reformulation and 
quantization of Einstein gravity. The differential- 
geometric gravity theory of Einstein is reformulated 
in terms of a background gauge connection and in the 
quantization, the Hilbert space consists of functions
$\psi(A)$ that are required to satisfy the constraints
$G\psi = 0$ and $H\psi = 0$. Thus, we see that $\hat{G}(K)$ can be
partially zero in the sense of producing a framed knot
invariant, and that $\hat{H}(K)$ is zero for non-self-intersecting
loops. This means that the loop transforms
of $G$ and $H$ can be used to investigate a subtle
variation of the original scheme for the quantization 
of gravity. This program is being actively pursued by 
a number of researchers. The Vassiliev invariants 
arising from a topologically invariant loop transform 
are of significance to this theory. 


Braiding, Topological Quantum Field 
Theory, and Quantum Computing 


The purpose of this section is to discuss in a very 
general way how braiding is related to topological 
quantum field theory and to the enterprise 
(Freedman et al. 2002) of using this sort of theory 
as a model for anyonic quantum computation. The 
ideas in the subject of topological quantum field 
theory are well expressed by Michael Atiyah (1990) 
and Edward Witten (1989). The simplest case of this
idea is C N Yang's original interpretation of the 
Yang-Baxter equation. Yang articulated a quantum 
field theory in one dimension of space and one 
dimension of time, in which the R-matrix gives the
scattering amplitudes for an interaction of two
particles whose (let us say) spins correspond to
the matrix indices, so that $R^{ab}_{cd}$ is the amplitude for
particles of spin a and spin b to interact and produce 
particles of spin c and d. Since these interactions are 
between particles in a line, one takes the convention 
that the particle with spin a is to the left of the 
particle with spin b, and the particle with spin c is to 
the left of the particle with spin d. If one follows the 
concatenation of such interactions, then there is an 
underlying permutation that is obtained by follow- 
ing strands from the bottom to the top of the 
diagram (thinking of time as moving up the page). 
Yang designed the Yang-Baxter equation for R so 
that the amplitudes for a composite process depend 
only on the underlying permutation corresponding 
to the process and not on the individual sequences of 
interactions. 

In taking over the Yang-Baxter equation for 
topological purposes, we can use the same inter- 
pretation, but think of the diagrams with their 
under- and over-crossings as modeling events in a 
spacetime with two dimensions of space and one 
dimension of time. The extra spatial dimension is 
taken in displacing the woven strands perpendicular 
to the page, and allows the use of braiding operators 
$R$ and $R^{-1}$ as scattering matrices. Taking this picture
to heart, one can add other particle properties to the 
idealized theory. In particular, one can add fusion 
and creation vertices where, in fusion, two particles 
interact to become a single particle and, in creation, 
one particle changes (decays) into two particles. 
Matrix elements corresponding to trivalent vertices 
can represent these interactions (see Figure 13). 

Once one introduces trivalent vertices for fusion 
and creation, there is the question how these 
interactions will behave in respect to the braiding 
operators. There will be a matrix expression for the 
compositions of braiding and fusion or creation as 
indicated in Figure 14. Here we will restrict
ourselves to showing the diagrammatics with the 
intent of giving the reader a flavor of these 


Figure 13 Creation and fusion. 




Figure 14 Braiding. 


structures. It is natural to assume that braiding 
intertwines with creation as shown in Figure 15 
(similarly with fusion). This intertwining identity is 
clearly the sort of thing that a topologist will love, 
since it indicates that the diagrams can be inter- 
preted as embeddings of graphs in three-dimensional 
space. Figure 16 illustrates the Yang-Baxter equa- 
tion. The intertwining identity is an assumption like 
the Yang-Baxter equation itself, which simplifies the 
mathematical structure of the model. 

It is to be expected that there will be an operator 
that expresses the recoupling of vertex interactions 
as shown in Figure 17 and labeled by O. The actual 
formalism of such an operator will parallel the 
mathematics of recoupling for angular momentum 
(see, e.g., Kauffman (1994)). If one just considers 
the abstract structure of recoupling then one sees 
that for trees with four branches (each with a single 
root) there is a cycle of length 5, as shown in 




Figure 15 Intertwining. 
Figure 16 Yang-Baxter equation.

Figure 17 Recoupling.


Figure 18 Pentagon identity. 


Figure 17. One can start with any pattern of three 
vertex interactions and go through a sequence of five 
recouplings that bring one back to the same tree 
from which one started. It is a natural simplifying 
axiom to assume that this composition is the identity 
mapping. This axiom is called the pentagon identity 
(Figure 18). 
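
The cycle of length 5 can be seen combinatorially. The sketch below models a tree with four branches as a nested pair of leaves and a recoupling as a single reassociation $((x\,y)\,z) \leftrightarrow (x\,(y\,z))$ applied anywhere in the tree. This encoding and the function names are assumptions made for illustration; the sketch only exhibits the underlying pentagon of trees and says nothing about the recoupling matrices themselves.

```python
def rotations(tree):
    """Yield every tree obtained from `tree` by one reassociation."""
    if not isinstance(tree, tuple):
        return  # a leaf admits no recoupling
    left, right = tree
    if isinstance(left, tuple):      # ((x y) z) -> (x (y z))
        x, y = left
        yield (x, (y, right))
    if isinstance(right, tuple):     # (x (y z)) -> ((x y) z)
        y, z = right
        yield ((left, y), z)
    for new_left in rotations(left):
        yield (new_left, right)
    for new_right in rotations(right):
        yield (left, new_right)

def recoupling_graph(start):
    """Map each tree reachable from `start` to the set of its neighbours."""
    graph, frontier = {}, [start]
    while frontier:
        tree = frontier.pop()
        if tree in graph:
            continue
        graph[tree] = set(rotations(tree))
        frontier.extend(graph[tree])
    return graph

# Hypothetical leaf labels for a tree with four branches.
START = ((("a", "b"), "c"), "d")
GRAPH = recoupling_graph(START)

print(len(GRAPH), sorted(len(nbrs) for nbrs in GRAPH.values()))
```

Five trees are reachable and each has exactly two neighbours, so the recouplings form a single closed cycle of length 5, which is the shape of the pentagon identity.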

Finally, there is a hexagonal cycle of interactions 
between braiding, recoupling and the intertwining 
identity as shown in Figure 19. One says that the 
interactions satisfy the hexagon identity if this
composition is the identity. 

A three-dimensional topological quantum field 
theory is an algebra of interactions that satisfies the 
Yang-Baxter equation, the intertwining identity, the
pentagon identity and the hexagon identity. There is 
no room in this summary to detail the way that 
these properties fit into the topology of knots and 
three-dimensional manifolds, but a sketch is in 
order. For the case of topological quantum field 
theory related to the group SU(2) there is a 
construction based entirely on the combinatorial 
topology of the bracket polynomial (see the section
"Knots, braids, and bracket polynomial"). For more
information on this approach, the reader is referred
to Kauffman (1994, 2002).

Figure 19 Hexagon identity.

Figure 20 Decomposition of a surface into trinions.

It turns out that the algebraic properties of a
topological quantum field theory give it enough
power to rigorously model the 3-manifold invariants
described by the Witten integral. This is done
by regarding the 3-manifold as a union of two
handlebodies with boundary an orientable surface
$S_g$ of genus $g$. The surface is divided up into
trinions as illustrated in Figure 20. A trinion is a 
surface with boundary that is topologically equiva- 
lent to a sphere with three punctures. In Figure 20 
we illustrate two trinions, the second shown as a 
neighborhood of a trivalent vertex, and a surface 
of genus 3 that is decomposed into three trinions. 
It turns out that there is a way to associate a
vector space $V(S_g)$ to a surface with a trinion
decomposition, defined in terms of the associated
topological quantum field theory, such that the
isomorphism class of the vector space $V(S_g)$ does
not depend upon the choice of decomposition.
This independence is guaranteed by the braiding,
hexagon, and pentagon identities in such a way
that one can associate a well-defined vector $|M\rangle$ in
$V(S_g)$ whenever $M$ is a 3-manifold whose boundary is
$S_g$. Furthermore, if a closed 3-manifold $M^3$ is decomposed
along a surface $S_g$ into the union of $M_-$ and $M_+$,
where these parts are otherwise disjoint 3-manifolds
with boundary $S_g$, then the inner product $I(M) =
\langle M_- | M_+ \rangle$ is, up to normalization, an invariant of the
3-manifold $M^3$. With the definition of topological
quantum field theory given above, knots and links can
be incorporated as well, so that one obtains a source of
invariants $I(M^3, K)$ of knots and links in orientable
3-manifolds.

The invariant $I(M^3, K)$ can be formally compared
with the Witten integral


\[ Z(M^3, K) = \int DA\, e^{(ik/4\pi) S(M, A)}\, W_K(A) \]

It can be shown that up to limits of the heuristics,
$Z(M, K)$ and $I(M^3, K)$ are essentially equivalent for
appropriate choice of gauge groups.

This point of view leads to more abstract 
formulations of topological quantum field theories 
as ways to associate vector spaces and linear 
transformations to manifolds and cobordisms of 
manifolds. (A cobordism of surfaces is a 3-manifold 
whose boundary consists of these surfaces.) 

As the reader can see, a three-dimensional TQFT is, 
at base, a highly simplified theory of point-particle 
interactions in (2 + 1)-dimensional spacetime. It can be 
used to articulate invariants of knots and links and 
invariants of 3-manifolds. The reader interested in the 
SU(2) case of this structure and its implications for 
invariants of knots and 3-manifolds can consult 
Kauffman (1994, 2002) and Crane (1991). One expects 
that physical situations involving 2 + 1 spacetime will 
be approximated by such an idealized theory. It is 
thought, for example, that aspects of the quantum Hall 
effect will be related to topological quantum field 
theory (Wilczek 1990). One can imagine a physics 
where the geometrical space is two dimensional and the 
braiding of particles corresponds to their interactions 
through circulating around one another in the plane. 
Anyons are particles that do not just change their wave 
functions by a sign under interchange, but rather by a 
complex phase or even a linear combination of states. It 
is hoped that TQFT models will describe applicable 
physics. One can think about the possible applications 
of anyons to quantum computing. The TQFTs then 
provide a class of anyonic models where the braiding is 
essential to the physics and to the quantum 
computation. 

Figure 21 A more complex braiding operator.

The key point in the application and relationship
of TQFT and quantum information theory is, in our
opinion, contained in the structure illustrated in
Figure 21. There we show a more complex braiding
operator, based on the composition of recoupling
with the elementary braiding at a vertex. (This
structure is implicit in the hexagon identity of
Figure 19.) The new braiding operator is a source of
unitary representations of the braid group in situations
(which exist mathematically) where the recoupling 
transformations are themselves unitary. This kind of 
pattern is utilized in the work of Freedman et al. 
(2002) and in the case of classical angular momentum 
formalism has been dubbed a “spin-network quantum 
simulator" by Rasetti and collaborators (see, e.g.,
Marzuoli and Rasetti (2002)). Kauffman and Lomonaco
(2006) show how certain natural deformations
(Kauffman 1994) of Penrose (1969) spin networks can
be used to produce the Freedman-Kitaev model
for anyonic topological quantum computation. It is 
legitimate to speculate that networks of this kind are 
present in physical reality. 

Quantum computing can be regarded as a study of 
the structure of the preparation, evolution, and 
measurement of quantum systems. In the quantum 
computation model, an evolution is a composition of 
unitary transformations (usually finite-dimensional 
over the complex numbers). The unitary transforma- 
tions are applied to an initial state vector that has been 
prepared prior to this process. Measurements are 
projections to elements of an orthonormal basis of 
the space upon which the evolution is applied. The 
result of measuring a state $|\psi\rangle$, written in the given
basis, is probabilistic. The probability of obtaining a 
given basis element from the measurement is equal to 
the absolute square of the coefficient of that basis 
element in the state being measured. 
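
The preparation, evolution, and measurement scheme just described can be illustrated for a single qubit. The sketch below is a minimal example under stated assumptions: the initial state, the two unitary matrices (a Hadamard matrix followed by a phase matrix), and the function names are chosen for illustration and do not come from the article.

```python
import math

def apply(unitary, state):
    """Apply a unitary matrix (nested lists) to a state vector (list)."""
    return [sum(unitary[i][j] * state[j] for j in range(len(state)))
            for i in range(len(unitary))]

def probabilities(state):
    """Measurement probabilities: absolute squares of the coefficients."""
    return [abs(c) ** 2 for c in state]

s = 1 / math.sqrt(2)
HADAMARD = [[s, s], [s, -s]]      # assumed example unitary
PHASE = [[1, 0], [0, 1j]]         # assumed example unitary

prepared = [1, 0]                                   # preparation
state = apply(PHASE, apply(HADAMARD, prepared))     # evolution
probs = probabilities(state)                        # measurement statistics

print(probs)
```

Both basis elements are obtained with probability one half here; the phase applied in the second step changes the state but not these probabilities, since only absolute squares enter.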

It is remarkable that the above lines constitute an 
essential summary of quantum theory. All applications 
of quantum theory involve filling in details of unitary 
evolutions and specifics of preparations and measure- 
ments. Such unitary evolutions can be seen as approxi- 
mated arbitrarily closely by representations of the Artin 
braid group. The key to the anyonic models of quantum 
computation via topological quantum field theory, or 
via deformed spin networks, is that all unitary evolu- 
tions can be approximated by a single coherent method 
for producing representations of the braid group. This 
beautiful mathematical fact points to a deep role for 
topology in the structure of quantum physics. 

The future of knots, links, and braids in relation 
to physics will be very exciting. There is no question 
that unitary representations of the braid group and 
quantum invariants of knots and links play a 
fundamental role in the mathematical structure of 
quantum mechanics, and we hope that time will 
show us the full meaning of this relationship. 
Introduction


The Kontsevich integral was invented by Kontsevich 
(1993) as a tool to prove the fundamental theorem of 
the theory of finite-type (Vassiliev) invariants (see Bar- 
Natan (1995a)). It provides an invariant exactly as 
strong as the totality of all Vassiliev knot invariants. 

The Kontsevich integral is defined for oriented 
tangles (either framed or unframed) in $\mathbb{R}^3$; therefore,
it is also defined in the particular cases of knots, 
links, and braids (see Figure 1). 

As a starter, we give two examples where simple 
versions of the Kontsevich integral have a
straightforward geometrical meaning. In these
examples, as well as in the general construction of
the Kontsevich integral, we represent 3-space $\mathbb{R}^3$ as
the product of a real line R with coordinate t and a 
complex plane C with complex coordinate z. 


Example 1 The number of twists in a braid with
two strings $z_1(t)$ and $z_2(t)$ placed in the slice $0 \le t \le 1$
(see Figure 2) is equal to


\[ \frac{1}{2\pi i} \int_0^1 \frac{dz_1 - dz_2}{z_1 - z_2} \]
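
The integral in Example 1 can be approximated numerically, since its integrand is the differential of $\log(z_1 - z_2)$. The sketch below sums small increments of this logarithm along the braid. The two strands used in the example (antipodal points rotating about the origin) and the function names are assumptions made for illustration.

```python
import cmath
import math

def twist_number(z1, z2, steps=2000):
    """Approximate (1 / 2 pi i) * integral over [0, 1] of d log(z1 - z2)."""
    total = 0j
    prev = z1(0.0) - z2(0.0)
    for k in range(1, steps + 1):
        cur = z1(k / steps) - z2(k / steps)
        # Principal-branch log of the ratio: valid while each step turns
        # the difference z1 - z2 by less than half a revolution.
        total += cmath.log(cur / prev)
        prev = cur
    return total / (2j * math.pi)

def strand(turns, sign):
    """Hypothetical strand: a point at distance 1/2 from the origin."""
    return lambda t: sign * 0.5 * cmath.exp(2j * math.pi * turns * t)

full = twist_number(strand(2, 1), strand(2, -1))
half = twist_number(strand(0.5, 1), strand(0.5, -1))
print(full, half)
```

With these strands the result counts full turns of one string about the other, so an elementary crossing that exchanges the two strings contributes one half.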


Figure 1 A tangle, a braid, a link, and a knot. 


Figure 2 Counting the number of twists. 


Figure 3 Counting the linking number. 


Example 2 The linking number of two spatial
curves $K$ and $K'$ (see Figure 3) can be computed as


\[ lk(K, K') = \frac{1}{2\pi i} \int_{m < t < M} \sum_j \varepsilon_j\, \frac{d(z_j(t) - z'_j(t))}{z_j(t) - z'_j(t)} \]


where $m$ and $M$ are the minimum and the maximum
values of $t$ on the link $K \cup K'$, $j$ is the index that
enumerates all possible choices of a pair of strands
of the link as functions $z_j(t)$, $z'_j(t)$ corresponding to $K$
and $K'$, respectively, and $\varepsilon_j = \pm 1$ according to the
parity of the number of chosen strands that are
oriented downwards.

The Kontsevich integral can be regarded as a far- 
going generalization of these formulas. It aims at 
encoding all information about how the horizontal 
chords on the knot (or tangle) rotate when moved in 
the vertical direction. From a more general view- 
point, the Kontsevich integral represents the mono- 
dromy of the Knizhnik-Zamolodchikov connection 
in the complement to the union of diagonals in $\mathbb{C}^n$
(see Bar-Natan (1995a) and Ohtsuki (2002)). 


Chord Diagrams and Weight Systems 
Algebras $\mathcal{A}(p)$


The Kontsevich integral of a tangle T takes values in 
the space of chord diagrams supported on T. 

Let X be an oriented one-dimensional manifold, 
that is, a collection of p numbered oriented lines and 
q numbered oriented circles. A chord diagram of 
order $n$ supported on $X$ is a collection of $n$ pairs
of unordered points in $X$, considered up to an
orientation- and component-preserving diffeomorphism.
In the vector space formally generated
by all chord diagrams of order n, we distinguish the 
subspace spanned by all four-term relations 


where thin lines designate chords, while thick lines are 
pieces of the manifold X. Apart from the fragments 
shown, all the four diagrams are identical. The 
quotient space over all such combinations is denoted
by $\mathcal{A}_n(X) = \mathcal{A}_n(p, q)$. Let $\mathcal{A}(p, q) = \bigoplus_{n=0}^{\infty} \mathcal{A}_n(p, q)$
and let $\hat{\mathcal{A}}(p, q)$ be the graded completion of $\mathcal{A}(p, q)$
(i.e., the space of formal infinite series $\sum_{i=0}^{\infty} a_i$ with
$a_i \in \mathcal{A}_i(p, q)$). If, moreover, we divide $\mathcal{A}(p, q)$ by all
"framing independence" relations (any diagram with
an isolated chord, i.e., a chord joining two adjacent
points of the same connected component of $X$, is set to
0), then the resulting space is denoted by $\mathcal{A}'(p, q)$, and
its graded completion by $\hat{\mathcal{A}}'(p, q)$.

The spaces $\mathcal{A}(p, 0) = \mathcal{A}(p)$ have the structure of an
algebra (the product of chord diagrams is defined by
concatenation of underlying manifolds in agreement
with the orientation). Closing a line component into a
circle, we get a linear map $\mathcal{A}(p, q) \to \mathcal{A}(p - 1, q + 1)$
which is an isomorphism when $p = 1$. In particular,
$\mathcal{A}(S^1) \cong \mathcal{A}(\mathbb{R}^1)$ has the structure of an algebra; this
algebra is denoted simply by $\mathcal{A}$; the Kontsevich integral
of knots takes its values in its graded completion
$\hat{\mathcal{A}}$. Another algebra of special importance is
$\hat{\mathcal{A}}(3) = \hat{\mathcal{A}}(3, 0)$, because it is where the Drinfeld
associators live.


Hopf Algebra Structure 


The algebra $\mathcal{A}(p)$ has a natural structure of a Hopf
algebra with the coproduct $\delta$ defined by all ways to
split the set of chords into two disjoint parts. To give
a convenient description of its primitive space, one
can use generalized chord diagrams. We now allow
trivalent vertices not belonging to the supporting
manifold and use STU relations (Bar-Natan 1995a)


[diagram: the STU relation]


to express the generalized diagrams as linear combinations
of conventional chord diagrams, for example,


[diagram: a generalized chord diagram expanded into chord diagrams]


Then the primitive space coincides with the subspace
of $\mathcal{A}(p)$ spanned by all connected generalized
chord diagrams ("connected" means that they remain
connected when the supporting manifold $X$ is
disregarded).


Weight Systems 


A "weight system" of degree $n$ is a linear function
on the space $\mathcal{A}_n$. Every Vassiliev invariant $v$ of
degree n defines a weight system symb(v) of the 
same degree called its “symbol.” 


Algebras $\mathcal{B}(p)$


Apart from the spaces of chord diagrams modulo four- 
term relations, there are closely related spaces of Jacobi 
diagrams. A Jacobi diagram is defined as a unitrivalent 
graph, possibly disconnected, having at least one 
vertex of valency 1 in each connected component and 
supplied with two additional structures: a cyclic order 
of edges in each trivalent vertex and a labeling of 
univalent vertices taking values in the set {1,2,..., p}. 
The space $\mathcal{B}(p)$ is defined as the quotient of the vector
space formally generated by all $p$-colored Jacobi
diagrams modulo the two types of relations, antisymmetry and IHX:


[diagrams: the antisymmetry relation and the IHX relation]


The disjoint union of Jacobi diagrams makes the 
space B(p) into an algebra. 

The symmetrization map $\chi_p : \mathcal{B}(p) \to \mathcal{A}(p)$, defined
as the average over all ways to attach the legs of color $i$
to the $i$th connected component of the underlying manifold


[diagram: the symmetrization map]


is an isomorphism of vector spaces (the formal
PBW isomorphism (Bar-Natan 1995a, Le and
Murakami 1995)) which is not compatible with
the multiplication. The relation between $\mathcal{A}(p)$ and
$\mathcal{B}(p)$ very much resembles the relation between
the universal enveloping algebra and the symmetric
algebra of a Lie algebra. The algebra
$\mathcal{B} = \mathcal{B}(1)$ is used to write out the explicit formula
for the Kontsevich integral of the unknot (see 
Bar-Natan et al. (2003) and below). 


The Construction 
Kontsevich's Formula 


We will explain the construction of the Kontsevich 
integral in the classical case of (closed) oriented 
knots; for an arbitrary tangle T, the formula is the 
same; only the result is interpreted as an element of 
$\mathcal{A}(T)$. As above, represent three-dimensional space
$\mathbb{R}^3$ as a direct product of a complex line $\mathbb{C}$ with
coordinate $z$ and a real line $\mathbb{R}$ with coordinate $t$.
The integral is defined for Morse knots, that is,
knots $K$ embedded in $\mathbb{R}^3 = \mathbb{C}_z \times \mathbb{R}_t$ in such a way
that the coordinate $t$ restricted to $K$ has only
nondegenerate (quadratic) critical points. (In fact, 
this condition can be weakened, but the class of 
Morse knots is broad enough and convenient to 
work with.) 

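The Kontsevich integral $Z(K)$ of the knot $K$ is the
following element of the completed algebra $\hat{\mathcal{A}}'$:


\[ Z(K) = \sum_{m=0}^{\infty} \frac{1}{(2\pi i)^m} \int_{\substack{t_{\min} < t_1 < \cdots < t_m < t_{\max} \\ t_j \text{ are noncritical}}} \sum_{P = \{(z_j, z'_j)\}} (-1)^{\downarrow_P} D_P \bigwedge_{j=1}^{m} \frac{dz_j - dz'_j}{z_j - z'_j} \]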


Explanation of the Constituents 


The real numbers $t_{\min}$ and $t_{\max}$ are the minimum and
the maximum of the function $t$ on $K$.

The integration domain is the $m$-dimensional
simplex $t_{\min} < t_1 < \cdots < t_m < t_{\max}$ divided by the
critical values into a certain number of "connected
components." For example, Figure 4 shows an
embedding of the unknot where, for $m = 2$, the
integration domain has six connected components.

The number of summands in the integrand is
constant in each connected component of the
integration domain, but can be different for different
components. In each plane $\{t = t_j\} \subset \mathbb{R}^3$ choose an
unordered pair of distinct points $(z_j, t_j)$ and $(z'_j, t_j)$ on
$K$, so that $z_j(t_j)$ and $z'_j(t_j)$ are continuous branches of
the knot. We denote by $P = \{(z_j, z'_j)\}$ the collection of
such pairs for $j = 1, \ldots, m$. The integrand is the sum
over all choices of the pairing $P$. In the example
above for the component $\{t_{\min} < t_1 < c_1,\ c_2 < t_2 < t_{\max}\}$,
we have only one possible pair of points on
the levels $\{t = t_1\}$ and $\{t = t_2\}$. Therefore, the sum
over $P$ for this component consists of only one
summand. Unlike this, in the component $\{t_{\min} <
t_1 < c_1,\ c_1 < t_2 < c_2\}$, we still have only one
possibility for the level $\{t = t_1\}$, but the plane $\{t = t_2\}$
intersects our knot $K$ in four points. So we have
$\binom{4}{2} = 6$ possible pairs $(z_2, z'_2)$, and the total number
of summands is six (see Figure 5).

Figure 4 Connected components.
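
The count of summands described above is a product of binomial coefficients, one factor per level. The sketch below takes, for each level $\{t = t_j\}$, the number of points in which that plane meets the knot; the function name and the input format are assumptions made for illustration.

```python
from math import comb, prod

def summands(points_per_level):
    """Number of pairings P: one unordered pair of points on every level."""
    return prod(comb(k, 2) for k in points_per_level)

# Levels meeting the knot in 2 and 4 points, as in the second component above.
print(summands([2, 4]))
# Both levels meeting the knot in 4 points.
print(summands([4, 4]))
```

A level that meets the knot in two points contributes a factor of 1, and a level that meets it in four points contributes a factor of 6.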

For a pairing $P$, the symbol "$\downarrow_P$" denotes the
number of points $(z_j, t_j)$ or $(z'_j, t_j)$ in $P$, where the
coordinate $t$ decreases along the orientation of $K$.

Fix a pairing $P$. Consider the knot $K$ as an oriented
circle and connect the points $(z_j, t_j)$ and $(z'_j, t_j)$ by a
chord. Up to a diffeomorphism, this chord does not
depend on the value of $t_j$ within a connected
component. We obtain a chord diagram with $m$
chords. The corresponding element of the algebra $\mathcal{A}'$
is denoted by $D_P$. Figure 5, for each connected
component in our example, shows one of the possible
pairings, the corresponding chord diagram with
the sign $(-1)^{\downarrow_P}$, and the number of summands of the
integrand (some of which are equal to zero in $\mathcal{A}'$ due
to the framing independence relation).

Over each connected component, $z_j$ and $z'_j$ are
smooth functions of $t_j$. By

\[ \bigwedge_{j=1}^{m} \frac{dz_j - dz'_j}{z_j - z'_j} \]

we mean the pullback of this form to the integration
domain of variables $t_1, \ldots, t_m$. The integration
domain is considered with the orientation of the
space $\mathbb{R}^m$ defined by the natural order of the
coordinates $t_1, \ldots, t_m$.

By convention, the term in the Kontsevich integral
corresponding to $m = 0$ is the (only) chord diagram
of order 0 with coefficient 1. It represents the unit of
the algebra $\hat{\mathcal{A}}'$.




Framed Version of the Kontsevich Integral 


Let $K$ be a framed oriented Morse knot with writhe number $w(K)$. Denote the corresponding knot without framing by $\bar{K}$. The framed version of the Kontsevich integral can be defined by the formula

$$Z^{\mathrm{fr}}(K) = e^{\frac{w(K)}{2}\Theta} \cdot Z(\bar{K}) \in \hat{\mathcal{A}}$$

where $\Theta$ is the chord diagram with one chord and the integral $Z(\bar{K}) \in \hat{\mathcal{A}}'$ is understood as an element of the completed algebra $\hat{\mathcal{A}}$ (without one-term relations) by virtue of a natural inclusion $\mathcal{A}' \to \mathcal{A}$ defined as identity on the primitive subspace of $\mathcal{A}'$ (see Goryunov (1999) and Le and Murakami (1996)).


Basic Properties 
Constructing the Universal Vassiliev Invariant 
The Kontsevich integral Z(K) 


1. converges for any Morse knot K, 

2. is invariant under deformations of the knot in the 
class of Morse knots, and 

3. behaves in a predictable way under the deforma- 
tion that adds a pair of new critical points to a 
Morse knot: 


$$Z(\text{[fragment with a hump]}) = Z(H) \cdot Z(\text{[straight fragment]})$$

Here the first and the third pictures depict two embeddings of an arbitrary knot, differing only in the shown fragment, $H$ is the "hump" (the unknot embedded in $\mathbb{R}^3$ in the specified way), and the product is the product in the completed algebra $\hat{\mathcal{A}}'$ of chord diagrams. The last equality allows one to define a genuine knot invariant by the formula

$$I(K) = \frac{Z(K)}{Z(H)^{c/2}}$$

where $c$ denotes the number of critical points of $K$ and the ratio means the division in the algebra $\hat{\mathcal{A}}'$ according to the rule $(1 + a)^{-1} = 1 - a + a^2 - a^3 + \cdots$.

Figure 5 Pairings and chord diagrams.

The expression $I(K)$ is sometimes referred to as the "final" Kontsevich integral as opposed to the "preliminary" Kontsevich integral $Z(K)$. It represents a universal Vassiliev invariant in the following sense: Let $w$ be a weight system, that is, a linear functional on the algebra $\mathcal{A}'$. Then the composition $w(I(K))$ is a numerical Vassiliev invariant, and any Vassiliev invariant can be obtained in this way.

The final Kontsevich integral for framed knots is 
defined in the same way, using the hump H with 
zero writhe number. 


Is Universal Vassiliev Invariant Universal? 


At present, it is not known whether the Kontsevich integral separates knots, or even if it can tell the orientation of a knot. However, the corresponding problem is solved, in the affirmative, in the case of braids and string links (theorem of Kohno-Bar-Natan; see Bar-Natan (1995b) and Kohno (1987)).


Omitting Long Chords 


We will state a technical lemma which is highly 
important in the study of the Kontsevich integral. It 
is used in the proof of the multiplicativity, in the 
combinatorial construction, etc. 

Suppose we have a Morse knot $K$ with a distinguished tangle $T$ (Figure 6). Let $m$ and $M$ be the minimal and maximal values of $t$ on the tangle $T$. In the horizontal planes between the levels $m$ and $M$, we can distinguish two kinds of chords: "short" chords that lie either inside $T$ or inside $K \setminus T$, and "long" chords that connect a point in $T$ with a point in $K \setminus T$. Denote by $Z_T(K)$ the expression defined by the same formula as the Kontsevich integral $Z(K)$ where only short chords are taken into consideration. More exactly, if $C$ is a connected component of the integration domain whose projection on the coordinate axis $t_j$ is entirely contained in the segment $[m, M]$, then in the sum over the pairings $P$ we include only those pairings that include short chords.

Figure 6 Short and long chords.


Lemma "Long" chords can be omitted when computing the Kontsevich integral: $Z_T(K) = Z(K)$.


Kontsevich's Integral and Operations on Knots 


The Kontsevich integral behaves in a nice way with 
respect to the natural operations on knots, such as 
mirror reflection, changing the orientation of the 
knot, mutation of knots (see Chmutov and Duzhin 
(2001)), cabling (see Willerton (2002)). We give 
some details regarding the first two items. 


Fact 1 Let $R$ be the operation that sends a knot to its mirror image. Define the corresponding operation $R$ on chord diagrams as multiplication by $(-1)^n$, where $n$ is the order of the diagram. Then the Kontsevich integral commutes with the operation $R$: $Z(R(K)) = R(Z(K))$, where by $R(Z(K))$ we mean simultaneous application of $R$ to all the chord diagrams participating in $Z(K)$.


Corollary The Kontsevich integral $Z(K)$ and the universal Vassiliev invariant $I(K)$ of an amphicheiral knot $K$ consist only of even order terms. (A knot $K$ is called "amphicheiral" if it is equivalent to its mirror image: $K = R(K)$.)


Fact 2 Let $S$ be the operation on knots which inverts their orientation. The same letter will also denote the analogous operation on chord diagrams (inverting the orientation of the outer circle or, which is the same thing, axial symmetry of the diagram). Then the Kontsevich integral commutes with the operation $S$ of inverting the orientation: $Z(S(K)) = S(Z(K))$.


Corollary The following two assertions are equivalent:

(i) Vassiliev invariants do not distinguish the orientation of knots and

(ii) all chord diagrams are symmetric: $D = S(D)$ modulo four-term relations.


The calculations of Kneissler (1997) show that up 
to order 12 all chord diagrams are symmetric. For 
bigger orders, the problem is still open. 


Multiplicative Properties 
The Kontsevich integral for tangles is multiplicative:

$$Z(T_1) \cdot Z(T_2) = Z(T_1 \cdot T_2)$$

whenever the product $T_1 \cdot T_2$, defined by vertical concatenation of tangles, exists. Here, the product on the left-hand side is understood as the image of the element $Z(T_1) \otimes Z(T_2)$ under the natural map $\mathcal{A}(T_1) \otimes \mathcal{A}(T_2) \to \mathcal{A}(T_1 \cdot T_2)$.

This simple fact has two important corollaries: 


1. For any knot $K$, the Kontsevich integral $Z(K)$ is a group-like element of the Hopf algebra $\hat{\mathcal{A}}'$, that is,

$$\delta(Z(K)) = Z(K) \otimes Z(K)$$

where $\delta$ is the comultiplication in $\mathcal{A}$ defined above.

2. The final Kontsevich integral, taken in a different normalization

$$I'(K) = Z(H)\, I(K) = \frac{Z(K)}{Z(H)^{c/2 - 1}}$$

is multiplicative with respect to the connected sum of knots:

$$I'(K_1 \# K_2) = I'(K_1)\, I'(K_2)$$


Arithmetical Properties 


For any knot K the coefficients in the expansion of 
Z(K) over an arbitrary basis consisting of chord 
diagrams are rational (see Kontsevich (1993), Le 
and Murakami (1996), and below). 


Combinatorial Construction of the 
Kontsevich Integral 


Sliced Presentation of Knots 


The idea is to cut the knot into a number of 
standard simple tangles, compute the Kontsevich 
integral for each of them, and then recover the 
integral of the whole knot from these simple 
pieces. 

More exactly, we represent the knot by a family of plane diagrams continuously depending on a parameter $\varepsilon \in (0, \varepsilon_0)$ and cut by horizontal planes into a number of slices with the following properties.


1. At every boundary level of a slice (dashed lines in the pictures below), the distances between various strings are asymptotically proportional to different whole powers of the parameter $\varepsilon$.

2. Every slice contains exactly one special event and several strictly vertical strings which are farther away (at lower powers of $\varepsilon$) from any string participating in the event than its width.

3. There are three types of special events:

   min/max: $m$, $M$
   braiding: $B_+$, $B_-$
   associativity: $A_+$, $A_-$

   where, in the two last cases, the strings may be replaced by bunches of parallel strings which are closer to each other than the width of this event.


Recipe of Computation of the Kontsevich Integral 


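Given such a sliced representation of a knot, the combinatorial algorithm to compute its Kontsevich integral consists in the following:


1. Replace each special event by a series of chord diagrams supported on the corresponding tangle according to the rule

$$m,\ M \mapsto 1, \qquad B_+ \mapsto R,\quad B_- \mapsto R^{-1}, \qquad A_+ \mapsto \Phi,\quad A_- \mapsto \Phi^{-1}$$

where

$$R = X \cdot \exp\!\left(\frac{H}{2}\right) = X\left(1 + \frac{H}{2} + \frac{H^2}{2 \cdot 2^2} + \frac{H^3}{3! \cdot 2^3} + \cdots\right)$$

(here $X$ denotes the crossing of the two strands and $H$ the chord connecting them) and

$$\Phi = 1 - \frac{\zeta(2)}{(2\pi i)^2}[a, b] - \frac{\zeta(3)}{(2\pi i)^3}\big([a,[a,b]] + [b,[a,b]]\big) + \cdots$$

($\Phi \in \hat{\mathcal{A}}(3)$ is the Knizhnik-Zamolodchikov Drinfeld associator defined below; it is an infinite series in two variables $a$ and $b$, the chords connecting the first and second strings and the second and third strings, respectively).

2. Compute the product of all these series from top to bottom taking into account the connection of the strands of different tangles, thus obtaining an element of the algebra $\hat{\mathcal{A}}'$.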


To accomplish the algorithm, we need two 
auxiliary operations on chord diagrams: 


1. $S_i : \mathcal{A}(p) \to \mathcal{A}(p)$ defined as multiplication by $(-1)^k$ on a chord diagram containing $k$ endpoints of chords on the string number $i$. This is the correction term in the computation of $R$ and $\Phi$ in the case when the tangle contains some strings oriented downwards (the upwards orientation is considered as positive).

2. $\Delta_i : \mathcal{A}(p) \to \mathcal{A}(p+1)$ acts on a chord diagram $D$ by doubling the $i$th string of $D$ and taking the sum over all possible lifts of the endpoints of chords of $D$ from the $i$th string to one of the two new strings. The strings are counted by their bottom points from left to right. This operation can be used to express the combinatorial Kontsevich integral of a generalized associativity tangle (with strings replaced by bunches of strings) in terms of the combinatorial Kontsevich integral of a simple associativity tangle.


Example 


Using the combinatorial algorithm, we compute the Kontsevich integral of the trefoil knot $3_1$ to the terms of degree 2. A sliced presentation for this knot shown in Figure 7 implies that $Z(3_1) = S_3(\Phi)\, R^{-3}\, S_3(\Phi^{-1})$ (here the product from left to right corresponds to the multiplication of tangles from top to bottom).

Figure 7 A sliced presentation of the trefoil.

Up to degree 2, we have

$$\Phi = 1 + \frac{1}{24}[a, b] + \cdots$$

$$R = X\left(1 + \frac{1}{2}a + \frac{1}{8}a^2 + \cdots\right)$$

where $X$ means that the two strands in each term of the series must be crossed over at the top. The operation $S_3$ changes the orientation of the third strand, which means that $S_3(a) = a$ and $S_3(b) = -b$. Therefore,

$$S_3(\Phi) = 1 - \frac{1}{24}[a,b] + \cdots, \qquad S_3(\Phi^{-1}) = 1 + \frac{1}{24}[a,b] + \cdots, \qquad R^{-3} = X\left(1 - \frac{3}{2}a + \frac{9}{8}a^2 + \cdots\right)$$

and

$$Z(3_1) = \left(1 - \frac{1}{24}[a,b] + \cdots\right) X\left(1 - \frac{3}{2}a + \frac{9}{8}a^2 + \cdots\right)\left(1 + \frac{1}{24}[a,b] + \cdots\right)$$

$$= 1 - \frac{3}{2}Xa - \frac{1}{24}abX + \frac{1}{24}baX + \frac{1}{24}Xab - \frac{1}{24}Xba + \frac{9}{8}Xa^2 + \cdots$$

Closing these diagrams into the circle, we see that in the algebra $\mathcal{A}'$ we have $Xa = 0$ (by the framing independence relation), then $baX = Xab = 0$ (by the same relation, because these diagrams consist of two parallel chords) and $abX = Xba = Xa^2 = \otimes$, the chord diagram with two intersecting chords. The result is $Z(3_1) = 1 + (25/24)\otimes + \cdots$. The final Kontsevich integral of the trefoil (in the multiplicative normalization) is thus equal to $I'(3_1) = Z(3_1)/Z(H)^{c/2-1}$.
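
The degree-2 bookkeeping in this example can be checked mechanically. The following Python sketch is only an illustration, not part of the original construction: it multiplies the three truncated series as noncommutative polynomials in the letters X, a, b and then applies the closure rules stated in the text. The string encoding of diagrams and the helper name `multiply` are assumptions made for this sketch.

```python
from collections import defaultdict
from fractions import Fraction


def multiply(p, q, max_degree=2):
    """Multiply noncommutative polynomials stored as {word: coefficient}.

    Words are strings over the letters X, a, b; the degree of a word is the
    number of chord letters (a or b). Words above max_degree are dropped.
    """
    out = defaultdict(Fraction)
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            word = w1 + w2
            if sum(ch in "ab" for ch in word) <= max_degree:
                out[word] += c1 * c2
    return {w: c for w, c in out.items() if c != 0}


c = Fraction(1, 24)
# S_3(Phi) = 1 - (1/24)[a, b] + ...
s3_phi = {"": Fraction(1), "ab": -c, "ba": c}
# R^{-3} = X(1 - (3/2)a + (9/8)a^2 + ...)
r_inv3 = {"X": Fraction(1), "Xa": Fraction(-3, 2), "Xaa": Fraction(9, 8)}
# S_3(Phi^{-1}) = 1 + (1/24)[a, b] + ...
s3_phi_inv = {"": Fraction(1), "ab": c, "ba": -c}

z = multiply(multiply(s3_phi, r_inv3), s3_phi_inv)

# Closure rules taken from the text: Xa, baX and Xab vanish in A',
# while abX, Xba and Xaa all close up to the diagram with two crossing chords.
crossed = {"abX", "Xba", "Xaa"}
coefficient = sum((z.get(w, Fraction(0)) for w in crossed), Fraction(0))
print(coefficient)  # 25/24
```

The three surviving contributions are $-1/24$, $-1/24$ and $9/8$, which add up to the coefficient $25/24$ quoted in the example.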


Drinfeld Associator and Rationality 


The Drinfeld associator used as a building block in the combinatorial construction of the Kontsevich integral can be defined as the limit

$$\Phi_{KZ} = \lim_{\varepsilon \to 0} \varepsilon^{-b}\, Z(AT_\varepsilon)\, \varepsilon^{a}$$

where $a$ and $b$ are the chords on three strings used in the recipe above, and $AT_\varepsilon$ is the positive associativity tangle (special event $A_+$, shown above) with the distance between the vertical strands constant 1 and the distance between the close endpoints equal to $\varepsilon$. An explicit formula for $\Phi_{KZ}$ was found by Le and Murakami (1996); it is written as a nested summation over four variable multi-indices and therefore does not provide an immediate insight into the structure of the whole series; we confine ourselves to quoting the beginning of the series (note that $\Phi_{KZ}$ is a group-like element in the free associative algebra with two generators; hence, its logarithm belongs to the corresponding free Lie algebra):


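$$\begin{aligned}
\log(\Phi_{KZ}) = {} & -\zeta(2)[x,y] - \zeta(3)\big([x,[x,y]] + [y,[x,y]]\big) \\
& - \frac{\zeta(2)^2}{10}\big(4[x,[x,[x,y]]] + [y,[x,[x,y]]] + 4[y,[y,[x,y]]]\big) \\
& - \zeta(5)\big([x,[x,[x,[x,y]]]] + [y,[y,[y,[x,y]]]]\big) \\
& + \big(\zeta(2)\zeta(3) - 2\zeta(5)\big)\big([y,[x,[x,[x,y]]]] + [y,[y,[x,[x,y]]]]\big) \\
& + \big(\tfrac{1}{2}\zeta(2)\zeta(3) - \tfrac{1}{2}\zeta(5)\big)[[x,y],[x,[x,y]]] \\
& + \big(\tfrac{1}{2}\zeta(2)\zeta(3) - \tfrac{3}{2}\zeta(5)\big)[[x,y],[y,[x,y]]] + \cdots
\end{aligned}$$

where $x = (1/2\pi i)a$ and $y = (1/2\pi i)b$. In general, $\Phi_{KZ}$ is an infinite series whose coefficients are "multiple zeta values" (Le and Murakami 1996, Zagier 1994)

$$\zeta(a_1, \dots, a_n) = \sum_{0 < k_1 < k_2 < \cdots < k_n} k_1^{-a_1} \cdots k_n^{-a_n}$$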
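
To make the definition of multiple zeta values concrete, here is a small numerical sketch in Python. It assumes the summation convention $0 < k_1 < k_2 < \cdots < k_n$ with exponents $a_1, \dots, a_n$ attached in that order; the function name `mzv_partial` and the truncation parameter are hypothetical choices for this illustration. It evaluates truncated sums only, so the results are approximations.

```python
import math


def mzv_partial(exponents, n_terms):
    """Truncated multiple zeta value: sum over 0 < k_1 < ... < k_n <= n_terms
    of k_1^(-a_1) * ... * k_n^(-a_n), with exponents = (a_1, ..., a_n)."""
    # below[k] holds the sum over all partial chains whose last index is < k;
    # for the empty chain this is 1 for every k.
    below = [1.0] * (n_terms + 2)
    for a in exponents:
        running = 0.0
        new_below = [0.0] * (n_terms + 2)
        for k in range(1, n_terms + 1):
            new_below[k] = running
            running += below[k] * k ** (-a)
        new_below[n_terms + 1] = running
        below = new_below
    return below[n_terms + 1]


zeta2 = mzv_partial((2,), 100_000)       # approximates pi^2 / 6
zeta12 = mzv_partial((1, 2), 200_000)    # approximates zeta(3), by Euler's identity
lead = -(math.pi ** 2 / 6) / (2j * math.pi) ** 2  # -zeta(2)/(2 pi i)^2
print(zeta2, zeta12, lead.real)
```

The last line illustrates why the coefficient of $[a,b]$ in the associator is rational: $-\zeta(2)/(2\pi i)^2 = 1/24$, the value used in the trefoil example.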


There are other equivalent definitions of $\Phi_{KZ}$, in particular one in terms of the asymptotical behavior of solutions of the simplest Knizhnik-Zamolodchikov equation

$$\frac{dG}{dz} = \left(\frac{a}{z} + \frac{b}{z - 1}\right) G$$

where $G$ is a function of a complex variable taking values in the algebra of series in two noncommuting variables $a$ and $b$ (see Drinfeld (1991)).

It turns out (theorem of Le and Murakami (1996)) that the combinatorial Kontsevich integral does not change if $\Phi_{KZ}$ is replaced by another series in $\hat{\mathcal{A}}(3)$ provided it satisfies certain axioms (among which the pentagon and hexagon relations are the most important, see Drinfeld (1991) and Le and Murakami (1996)).

Drinfeld (1991) proved the existence of an associator $\Phi_{\mathbb{Q}}$ with rational coefficients. Using it instead of $\Phi_{KZ}$ in the combinatorial construction, we obtain the following:


Theorem (Le and Murakami 1996). The coeffi- 
cients of the Kontsevich integral of any knot (tangle) 
are rational when Z(K) is expanded over an 
arbitrary basis consisting of chord diagrams. 


Explicit Formulas for the Kontsevich 
Integral 


The Wheels Formula 


Let $O$ be the unknot; the expression $I(O) = Z(H)^{-1}$ is referred to as the "Kontsevich integral of the unknot." A closed form formula for $I(O)$ was proved in Bar-Natan et al. (2003):


Theorem

$$I(O) = \exp \sum_{n=1}^{\infty} b_{2n} w_{2n} = 1 + \Big(\sum_{n=1}^{\infty} b_{2n} w_{2n}\Big) + \frac{1}{2}\Big(\sum_{n=1}^{\infty} b_{2n} w_{2n}\Big)^2 + \cdots$$

Here $b_{2n}$ are modified Bernoulli numbers, that is, the coefficients of the Taylor series

$$\sum_{n=1}^{\infty} b_{2n} x^{2n} = \frac{1}{2} \ln \frac{e^{x/2} - e^{-x/2}}{x}$$

($b_2 = 1/48$, $b_4 = -1/5760$, $b_6 = 1/362880, \dots$), and $w_{2n}$ are the "wheels," that is, Jacobi diagrams consisting of a circle with $2n$ legs attached ($w_2$, $w_4$, $w_6$, ...).

The sums and products are understood as operations in the algebra of Jacobi diagrams $\mathcal{B}$, and the result is then carried over to the algebra of chord diagrams $\mathcal{A}$ along the isomorphism $\chi$.
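
The modified Bernoulli numbers quoted in the theorem can be recomputed exactly. The sketch below is a hedged illustration that assumes the generating function $\sum_n b_{2n} x^{2n} = \tfrac{1}{2}\ln\big(\sinh(x/2)/(x/2)\big)$, which is the same function as $\tfrac{1}{2}\ln\big((e^{x/2} - e^{-x/2})/x\big)$. The function name `modified_bernoulli` is introduced here for the example only.

```python
from fractions import Fraction
from math import factorial


def modified_bernoulli(n_max):
    """Return [b_2, b_4, ..., b_{2 n_max}] as exact fractions."""
    # sinh(x/2)/(x/2) = sum_k c[k] * x^(2k), with c[k] = 1 / (4^k (2k+1)!)
    c = [Fraction(1, 4 ** k * factorial(2 * k + 1)) for k in range(n_max + 1)]
    # If g = ln(f) as power series in u = x^2, then f' = g' f, which gives
    # k*c[k] = sum_{j=1..k} j*g[j]*c[k-j]; solve this for g[k] (c[0] = 1).
    g = [Fraction(0)] * (n_max + 1)
    for k in range(1, n_max + 1):
        tail = sum((j * g[j] * c[k - j] for j in range(1, k)), Fraction(0))
        g[k] = c[k] - tail / k
    # b_{2k} is half of the coefficient of x^(2k) in ln(sinh(x/2)/(x/2)).
    return [g[k] / 2 for k in range(1, n_max + 1)]


b = modified_bernoulli(3)
print(b)
```

The first three values returned are $1/48$, $-1/5760$ and $1/362880$, in agreement with the numbers listed above.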


Generalizations 


There are several generalizations of the wheels 
formula. 


1. Rozansky's rationality conjecture (Rozansky 2003) proved by Kricker (2000) affirms that the Kontsevich integral of any (framed) knot can be written in a form resembling the wheels formula. Let us call the "skeleton" of a Jacobi diagram the regular 3-valent graph obtained by "shaving off" all univalent vertices. Then the wheels formula says that all diagrams in the expansion of $I(O)$ have one and the same skeleton (circle), and the generating function for the coefficients of diagrams with $n$ legs is a certain analytic function, more or less rational in $e^x$. In the same way, the theorem of Rozansky and Kricker states that the terms in $I(K) \in \hat{\mathcal{B}}$, when arranged by their skeleta, have the generating functions of the form $p(e^x)/A_K(e^x)$, where $A_K$ is the Alexander polynomial of $K$ and $p$ is some polynomial function. Although this theorem does not give an explicit formula for $I(K)$, it provides a lot of information about the structure of this series.

2. Marché gives a closed form formula for the Kontsevich integral of torus knots $T(p, q)$.


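The formula of Marché, although explicit, is rather intricate, and here, by way of example, we only write out the first several terms of the final Kontsevich integral $I'$ for the trefoil (torus knot of type (2,3)), following Willerton (2002):

$$I'(3_1) = O - D_2 + D_3 - \frac{31}{24} D_{4,1} + \frac{5}{24} D_{4,2} + \frac{1}{2} D_{4,3} + \cdots$$

where $D_2$, $D_3$ and $D_{4,1}$, $D_{4,2}$, $D_{4,3}$ denote chord diagrams of degrees 2, 3, and 4, respectively (drawn as pictures in the formula).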


First Terms of the Kontsevich Integral 


A Vassiliev invariant $v$ of degree $m$ is called "canonical" if it can be recovered from the Kontsevich integral by applying a homogeneous weight system, that is, if $v = \mathrm{symb}(v) \circ I$. Canonical invariants define a grading in the filtered space of Vassiliev invariants which is consistent with the filtration. If the Kontsevich integral is expanded over a fixed basis in the space of chord diagrams $\mathcal{A}'$, then the coefficient of every diagram is a canonical invariant. According to Stanford (2001) and Willerton (2002), the expansion of the final Kontsevich integral up to degree 4 can be written as follows:

$$\begin{aligned}
I'(K) = {} & O - c_2(K)\, D_2 - \frac{1}{6} j_3(K)\, D_3 \\
& + \frac{1}{48}\big(4 j_4(K) + 36 c_4(K) - 36 c_2^2(K) + 3 c_2(K)\big) D_{4,1} \\
& + \frac{1}{24}\big({-12} c_4(K) + 6 c_2^2(K) - c_2(K)\big) D_{4,2} + \frac{1}{2} c_2^2(K)\, D_{4,3} + \cdots
\end{aligned}$$

where $D_2$, $D_3$ and $D_{4,1}$, $D_{4,2}$, $D_{4,3}$ denote chord diagrams of degrees 2, 3, and 4 (drawn as pictures in the formula), $c_n$ are coefficients of the Conway polynomial $\nabla_K(t) = \sum c_n(K) t^n$ and $j_n$ are modified coefficients of the Jones polynomial $J_K(e^t) = \sum j_n(K) t^n$. Therefore, up to degree 4, the basic canonical Vassiliev invariants of unframed knots are $c_2$, $j_3$, $j_4$, $c_4 + (1/12)c_2$, and $c_2^2$.
Modulation equations are simplified equations
used to model complicated physical systems. Typi- 
cally they are derived from the fundamental partial 
differential equations that describe the system via 
asymptotic analysis. Furthermore, the modulation 
equations are in a sense “universal” in that many 


different physical systems are described by the same 
modulation equation. This comes about because 
the form of the modulation equation depends on 
only a very few, qualitative features of the original 
partial differential equation. Thus, they serve as a sort
of “normal form” for these partial differential
equations and as such justify greater study than 
their apparently special character might otherwise 
merit. 
The Korteweg-de Vries (KdV) equation

$$\partial_t u = \partial_x^3 u + 6u\,\partial_x u, \qquad u = u(x,t),\ x \in \mathbb{R},\ t > 0 \qquad [1]$$

was one of the earliest modulation equations to be
intensively studied. It was derived in an attempt to 
understand the propagation of solitary waves on the 
surface of water in a channel of finite depth. The 
KdV equation was first derived by Boussinesq but 
then independently rederived and studied in detail 
by Korteweg and de Vries. (For an interesting 
discussion of the early history of the KdV equation 
see Pego and Weinstein (1997).) 


Derivation of the KdV Equation 


As mentioned above, the KdV equation is a sort of 
normal form describing the propagation of small- 
amplitude, long-wavelength disturbances in a variety 
of different physical systems. In this section we 
describe in detail how it arises as an approximation 
to the Fermi-Pasta-Ulam (FPU) model of coupled, 
nonlinear oscillators. Although the KdV equation is 
most commonly encountered as an approximation 
to water waves, its study as an approximation to the 
FPU model was extremely important historically 
because it was in this context that its complete 
integrability was discovered by Miura (1968) and 
Gardner et al. (1974). 

Consider an infinite set of particles of mass $m = 1$ at positions $q_j(t)$, $j \in \mathbb{Z}$, interacting with their nearest neighbors via a potential $\mathcal{V}(q)$. Newton's equations for the motion of such particles are:

$$\frac{d^2 q_j}{dt^2} = \mathcal{V}'\big(q_{j+1}(t) - q_j(t)\big) - \mathcal{V}'\big(q_j(t) - q_{j-1}(t)\big), \qquad j \in \mathbb{Z} \qquad [2]$$

If we rewrite these equations in terms of the difference variables $r(j,t) = q_{j+1}(t) - q_j(t)$, then [2] becomes

$$\frac{\partial^2 r(j,t)}{\partial t^2} = \mathcal{V}'\big(r(j+1,t)\big) + \mathcal{V}'\big(r(j-1,t)\big) - 2\mathcal{V}'\big(r(j,t)\big), \qquad j \in \mathbb{Z} \qquad [3]$$

We are interested in small-amplitude, long-wavelength solutions of [3]. One way of studying
such motions is to change the lattice spacing in [3]
from 1 to $h$ and then let $h$ tend to zero. A nice
derivation of the KdV equation from that point of 
view is contained in Ablowitz and Segur (1981). 
Here, following Schneider and Wayne (1999), we 
will keep the lattice spacing fixed at 1 and rescale 
the spatial variable in the KdV equation. This is 
closer to the approximation method used in the 
water wave problem. 

Since we want to focus on small-amplitude, long-wavelength solutions of [3], we begin by making the hypothesis that there exists some real-valued function $R(x,t)$ such that the solution of [3] can be written as

$$r(j,t) = \varepsilon^2 R(\varepsilon j, \varepsilon t) \qquad [4]$$

The prefactor $\varepsilon^2$ ensures that the solution is of small amplitude while rescaling $j \to \varepsilon j$ means that phenomena that occur on length scales of $O(1)$ in the equation for $R$ will occur on length scales of $O(1/\varepsilon)$ in the original equation; that is, they will be long-wavelength solutions. The differing powers of $\varepsilon$ chosen for rescaling the amplitude and the spatial scale are chosen so that the dispersive and nonlinear effects will balance each other. Inserting [4] into [3] and expanding to lowest order in $\varepsilon$ we find that the nonvanishing terms of lowest order in $\varepsilon$ are


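$$\frac{\partial^2 R}{\partial t^2} = \mathcal{V}''(0)\, \frac{\partial^2 R}{\partial x^2} \qquad [5]$$

This is just the wave equation and thus to leading order we expect solutions of [3] to split into left- and right-moving waves, each moving with speed $c_\pm = \pm\sqrt{\mathcal{V}''(0)}$. (We assume that $c^2 = \mathcal{V}''(0) > 0$.) Thus, we make a refinement of the hypothesized form of the solution and replace [4] by

$$r(j,t) = \varepsilon^2 U\big(\varepsilon(j + ct), \varepsilon^3 t\big) + \varepsilon^2 V\big(\varepsilon(j - ct), \varepsilon^3 t\big) + \varepsilon^4 \phi(\varepsilon j, \varepsilon t) \qquad [6]$$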


The presence of the term $\varepsilon^4 \phi$ may be somewhat surprising. We will discuss the reason for its appearance in more detail below, but for the moment we mention merely that its presence does not affect the fact that to leading order the solution is approximated by the left- and right-moving waves represented by the $\varepsilon^2 U$ and $\varepsilon^2 V$ terms, respectively. We also note that the additional time dependence $\varepsilon^3 t$ in $U$ and $V$ is chosen, as is typical in the multiscale method, to incorporate the higher-order terms omitted in [5] into the evolution.

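Substituting [6] into [3] and expanding the resulting equation in $\varepsilon$ we find that the lowest order in $\varepsilon$ that occurs is $O(\varepsilon^4)$ and these terms all cancel exactly because of the form of our hypothesized solution. The terms of $O(\varepsilon^6)$ are:

$$2c\,\partial_X \partial_T U - 2c\,\partial_X \partial_T V + \partial_\tau^2 \phi = c^2\Big(\tfrac{1}{12}\partial_X^4 U + \tfrac{1}{12}\partial_X^4 V + \partial_\xi^2 \phi\Big) + \tfrac{1}{2}\mathcal{V}'''(0)\,\partial_X^2\big(U^2 + V^2 + 2UV\big) \qquad [7]$$

Here, $X$, $T$, $\xi$, and $\tau$ represent the rescaled independent variables, that is, $U = U(X,T)$, $V = V(X,T)$, and $\phi = \phi(\xi,\tau)$.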
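
The orders of $\varepsilon$ appearing in this expansion can be traced with three elementary Taylor expansions. The following is a sketch for the left-moving part only, assuming that the potential (written $\mathcal{V}$ here) is smooth near $0$ and that $F$ stands for any smooth function of the rescaled variable; it is meant as a reading aid, not as a replacement for the full computation.

```latex
\begin{align*}
% Taylor expansion of the force; the constant term drops out of [3] since 1 + 1 - 2 = 0
\mathcal{V}'(r) &= \mathcal{V}'(0) + \mathcal{V}''(0)\,r
   + \tfrac{1}{2}\mathcal{V}'''(0)\,r^{2} + O(r^{3}),\\
% second difference of a slowly varying function
F(\varepsilon(j+1)) + F(\varepsilon(j-1)) - 2F(\varepsilon j)
   &= \varepsilon^{2} F''(\varepsilon j)
   + \tfrac{\varepsilon^{4}}{12} F''''(\varepsilon j) + O(\varepsilon^{6}),\\
% two time derivatives of the left-moving term
\partial_t^{2}\big[\varepsilon^{2} U(\varepsilon(j+ct),\varepsilon^{3}t)\big]
   &= \varepsilon^{4} c^{2}\,\partial_X^{2}U
   + 2c\,\varepsilon^{6}\,\partial_X\partial_T U
   + \varepsilon^{8}\,\partial_T^{2}U .
\end{align*}
```

With $c^2 = \mathcal{V}''(0)$, the $\varepsilon^4$ terms $c^2 \partial_X^2 U$ appear on both sides and cancel. At order $\varepsilon^6$ the left side contributes $2c\,\partial_X\partial_T U$, the linear part contributes $\tfrac{c^2}{12}\partial_X^4 U$, and the quadratic part of the force contributes $\tfrac{1}{2}\mathcal{V}'''(0)\,\partial_X^2(U^2)$, which are the $U$-terms of [7].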
Note that if it were not for the presence of the term $2UV$ on the right-hand side of this last equation the equations for $U$ and $V$ would completely decouple, that is, there would be no interaction between the left- and right-moving parts of the solution to this order. At this point, we can take advantage of the (heretofore) arbitrary function $\phi$. If we assume that $U$ and $V$ are given, we can choose $\phi$ to satisfy the inhomogeneous wave equation:

$$\partial_\tau^2 \phi = c^2\,\partial_\xi^2 \phi + \mathcal{V}'''(0)\,\partial_\xi^2 (UV) \qquad [8]$$

Then, provided $\phi$ remains of $O(1)$ over the timescales of interest (which one can verify a posteriori), we see that all terms of $O(\varepsilon^6)$ in the expansion of [3] will vanish provided

$$2c\,\partial_T U = \frac{c^2}{12}\,\partial_X^3 U + \frac{1}{2}\mathcal{V}'''(0)\,\partial_X (U^2), \qquad -2c\,\partial_T V = \frac{c^2}{12}\,\partial_X^3 V + \frac{1}{2}\mathcal{V}'''(0)\,\partial_X (V^2) \qquad [9]$$


This means that the left- and right-moving parts of the 
solution satisfy a pair of uncoupled KdV equations. 


Remark 1 To rewrite [9] in the standard form [1] one can make a simple rescaling; for instance, choose $X = \alpha x$, $T = t$ and $u(x,t) = \beta U(\alpha x, t)$, with $\alpha = (c/24)^{1/3}$ and $\beta = \mathcal{V}'''(0)/(12 c \alpha)$.
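
The rescaling in Remark 1 can be verified in a few lines. The sketch below takes the first equation of [9] in the form $2c\,\partial_T U = \tfrac{c^2}{12}\partial_X^3 U + \tfrac{1}{2}\mathcal{V}'''(0)\,\partial_X(U^2)$, with $\mathcal{V}$ denoting the interaction potential; this reading of [9] is an assumption of the sketch.

```latex
\begin{align*}
% divide [9] by 2c and expand the derivative of U^2
\partial_T U &= \frac{c}{24}\,\partial_X^{3}U
   + \frac{\mathcal{V}'''(0)}{2c}\,U\,\partial_X U,\\
% chain rule for u(x,t) = \beta U(\alpha x, t)
\partial_t u &= \beta\,\partial_T U,\qquad
\partial_x^{3} u = \alpha^{3}\beta\,\partial_X^{3}U,\qquad
u\,\partial_x u = \alpha\beta^{2}\,U\,\partial_X U,\\
% substitute
\partial_t u &= \frac{c}{24\,\alpha^{3}}\,\partial_x^{3}u
   + \frac{\mathcal{V}'''(0)}{2c\,\alpha\beta}\,u\,\partial_x u .
\end{align*}
```

Matching with [1] requires $c/(24\alpha^3) = 1$ and $\mathcal{V}'''(0)/(2c\alpha\beta) = 6$, that is, $\alpha = (c/24)^{1/3}$ and $\beta = \mathcal{V}'''(0)/(12c\alpha)$, as stated in the remark.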


We can now comment on the reasons we chose the particular scalings of the amplitude and of the independent variables used in [6]. The terms $\partial_X^2 U^2$ and $\partial_X^2 V^2$ are the lowest-order contributions from the nonlinear part of [3], while the terms $\partial_X^4 U$ and $\partial_X^4 V$ represent the lowest-order contributions from the linear part of the equation, except for the "trivial" translation that comes from [5]. In particular, in the absence of nonlinear effects the terms $\partial_X^4 U$ and $\partial_X^4 V$ (or equivalently, the terms $\partial_X^3 U$ and $\partial_X^3 V$ in [9]) would cause traveling waves to "disperse" and thus, the KdV equation represents a balance between nonlinear and dispersive effects. It is this balance between dispersion and nonlinearity which permits traveling-wave solutions to propagate without change of form (see the section "Integrability of the KdV equation").

More generally, we expect the KdV equation to 
arise as a modulation equation whenever a small- 
amplitude, long-wavelength linear wave is simulta- 
neously perturbed by dispersive and nonlinear 
effects of the same order of magnitude. This is, of course, oversimplified. For instance, the original equation may have no quadratic terms in the nonlinearity, which means that the term $\partial_X U^2$ in the modulation equation will be replaced by a term like $\partial_X U^p$, for $p > 2$; this leads to the modified KdV equation as the appropriate modulation equation. Or, for certain parameter values in the original equation the coefficient
in front of the leading-order dispersive term may 
vanish, in which case a fifth-order modulation 
equation known as the Kawahara equation is more 
appropriate. However, both of these cases are in 
some sense nongeneric and the relatively weak 
hypotheses needed to obtain the KdV equation as 
the appropriate modulation equation indicate why it 
is encountered in so many diverse circumstances. We 
note, however, that the multiscale method used 
above to derive the KdV equation does not give a 
unique choice for the appropriate modulation 
equation at any given order of approximation and 
we discuss in a later section some other equations 
that could be used as models in the situation above. 


Validity of the KdV Approximation 


While the above derivation of the KdV equation is 
simple and intuitive one may wonder how accurate 
an approximation it actually provides to the true 
solutions of [3] (or to the evolution of water waves, 
probably the most important physical situation in 
which the KdV approximation is used). In particular, note that in the notation of [9] the phenomena intrinsic to the KdV equation occur on timescales $T = O(1)$. However, this corresponds to a very long timescale $t = O(1/\varepsilon^3)$ in the original FPU model and it could easily be the case that although the error made in the derivation of the KdV approximation at any given time is quite small, over these very long timescales the errors could accumulate in such a way as to destroy the accuracy of the approximation.

The KdV and other modulation equations have 
been used since the nineteenth century but only 
relatively recently have rigorous estimates of the 
accuracy of this approximation been proved. In 
fact, the first estimates demonstrating that the 
KdV equation actually provided an accurate 
approximation to the true motion of water 
waves over the timescales expected from the 
heuristic derivation were not proved until Craig 
(1985). More recently, powerful general methods 
have been developed to justify not just the KdV
equation but other modulation equations like the
nonlinear Schrödinger equation and Ginzburg-Landau
equation as well.

For instance, the following method, introduced in Kirrmann et al. (1992), has been used to justify the use of modulation equations in the water-wave problem, the evolution of Taylor-Couette patterns in viscous fluids, and a number of other circumstances. We will explain it in the context of a general, abstract evolution equation to indicate its generality. Suppose that one wishes to approximate the small-amplitude solutions of a general evolution equation (or system of such equations) of the form

$$\partial_t u = Lu + N(u) \qquad [10]$$

where $L$ is a linear operator and $N$ represents the nonlinear terms. Suppose that via some formal analysis like that in the previous section we have derived a function $\varepsilon^\alpha \psi$ that is believed to be a good approximation to a true solution of [10]. In that example, for instance, $\varepsilon^\alpha \psi$ would be the sum of the solutions of the two KdV equations in [9], and in general it will be given by the solution of the modulation equation that is expected to approximate [10]. We must show that the difference between $\varepsilon^\alpha \psi$ and a true solution of [10] remains small over the timescales of interest. We write this difference as $u - \varepsilon^\alpha \psi = \varepsilon^\beta R$ so that if $\beta > \alpha$, and if $R = O(1)$, $\varepsilon^\alpha \psi$ does provide the leading-order approximation to the true solution. We can make $R|_{t=0}$ small by choosing the initial conditions of our modulation equation appropriately and thus we need to follow how $R$ evolves in time. If we use the equation satisfied by $u$ we see that $R$ evolves as

$$\partial_t R = LR + \varepsilon^{-\beta}\big[N(\varepsilon^\alpha \psi + \varepsilon^\beta R) - N(\varepsilon^\alpha \psi)\big] + \varepsilon^{-\beta}\,\mathrm{Res}(\varepsilon^\alpha \psi) \qquad [11]$$

where $\mathrm{Res}(\varepsilon^\alpha \psi) = L(\varepsilon^\alpha \psi) + N(\varepsilon^\alpha \psi) - \partial_t(\varepsilon^\alpha \psi)$, the "residual" of our approximation, is simply the amount by which the approximation fails to satisfy the original equation at any given time. In the example in the previous section the residual would include the terms $O(\varepsilon^8)$ that we ignored in our expansion.

One must now, in any given example, consider three points:


1. The linear evolution of $R$:

$$\partial_t R = LR + DN(\varepsilon^\alpha \psi)R \qquad [12]$$

Controlling the solutions of this linear, but nonconstant coefficient partial differential equation is often the most difficult step in proving that solutions of the modulation equation give accurate approximations to the true solution. One can frequently find norms that are preserved by solutions of the leading-order equation $\partial_t R = LR$. However, the term $DN(\varepsilon^\alpha \psi) = O(\varepsilon^\alpha)$ if $N$ is a quadratic nonlinearity. Over the very long timescales (i.e., $O(\varepsilon^{-3})$) of interest in these approximation problems this $O(\varepsilon^\alpha)$ term can cause uncontrolled growth of $R$, leading to a breakdown in the approximation. In order to control [12] one must typically make use of some special features of the problem under consideration. For instance, it is sometimes possible to make a coordinate transformation which eliminates the terms of $O(\varepsilon^\alpha)$ on the right-hand side of [12], after which relatively standard methods suffice to control the solutions of [12].

2. The nonlinear terms in [11]: these terms are of the form $\varepsilon^{-\beta}\big[N(\varepsilon^\alpha \psi + \varepsilon^\beta R) - N(\varepsilon^\alpha \psi)\big] - DN(\varepsilon^\alpha \psi)R$. From Taylor's theorem we see that, if the nonlinear term is reasonably smooth, these terms are of $O(\varepsilon^\beta)$. If $\beta > 3$, these terms are small and can be controlled over the timescales of interest by a straightforward application of Gronwall's inequality or standard "energy estimates."

3. Finally, one must consider the influence of the inhomogeneous terms $\varepsilon^{-\beta}\,\mathrm{Res}(\varepsilon^\alpha \psi)$. Note that if this term is small enough, say $O(\varepsilon^\beta)$ with $\beta > 3$, this term can also be controlled over the relevant timescales by an application of the Gronwall inequality. In order to make this term small, we need to be sure that our approximation $\varepsilon^\alpha \psi$ fails to solve the true equation at any given time by only a small amount. In doing so, we can exploit the fact that we can add to our leading-order approximation terms of higher order without affecting the fact that to leading order the true solution is still approximated by the solution of the modulation equation. This is the role of the term $\varepsilon^4 \phi$ in the approximation [6] in the previous section. The leading-order approximation is given by the functions $U$ and $V$ which solve the KdV equations but by adding the additional term $\varepsilon^4 \phi$ to the approximation we cancel the remaining terms of $O(\varepsilon^6)$ in [7], thereby reducing the size of the residual in that example to $O(\varepsilon^8)$. This method works in other examples as well so that the inhomogeneous term in [11] can usually be treated by this means. However, in each case, we must prove that the additional terms one adds to the approximation remain bounded over the timescales of interest and demonstrating this fact may not be as easy as it was in the case of the FPU model where the additional term satisfied a simple wave equation.


Using this approach one can show that the 
approximation derived heuristically in the previous 
section does accurately model the behavior of 
solutions of the FPU model over the expected 
timescales. More precisely, if $r(j,t)$ is the solution of [3] and if $U$ and $V$ are the solutions of the modulation equations [9] (with appropriately chosen, small-amplitude, long-wavelength initial conditions), one can prove (see Schneider and Wayne (1999)) that for any $T_0 > 0$ there is an $\varepsilon_0 > 0$ and $C > 0$ such that for all $0 < \varepsilon < \varepsilon_0$,

$$\sup_{t \in [0,\, T_0/\varepsilon^3]} \Big\| r(\cdot, t) - \Big(\varepsilon^2 U\big(\varepsilon(\cdot + ct), \varepsilon^3 t\big) + \varepsilon^2 V\big(\varepsilon(\cdot - ct), \varepsilon^3 t\big)\Big) \Big\|_{\ell^\infty} \le C \varepsilon^{7/2}$$


One can also use this method to show that the 
solution of the water-wave problem with general 
small-amplitude, long-wavelength, initial data can 
be approximated by the sum of the solutions of a 
pair of uncoupled KdV equations (Schneider and 
Wayne 2000), one representing the left-moving part 
of the solution and one representing the right- 
moving part of the solution, though in this context 
the technical difficulties associated with the exis- 
tence theory for the water-wave problem mean the 
details are quite a lot more complicated. 


Integrability of the KdV Equation 


One reason that normal forms for systems of 
ordinary differential equations are so useful is that 
they are frequently integrable — that is, they possess 
sufficiently many integrals, or constants of motion, 
that essentially explicit formulas for their solutions 
can be obtained. Remarkably, the same is true for 
the KdV equation and for many other modulation 
equations. An argument for why this is so has been put
forth by Calogero and Eckhaus based on the univer- 
sality of these equations — see Calogero and Eckhaus 
(1987) and references therein, as well as the article 
Integrable Systems: Overview for more on this point. 

Recall that Boussinesq and Korteweg and de Vries 
introduced the KdV equation to study solitary 
traveling waves on a fluid surface. For [1], one has 
an explicit family of such solutions given by: 


$$u(x,t) = 2A^2 \operatorname{sech}^2\!\left(A(x + 4A^2 t)\right), \qquad A > 0$$


Note that from this formula one sees that waves of 
large amplitude are narrower and travel faster than 
waves of small amplitude. 
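
As a quick numerical sanity check, the following sketch evaluates the residual of the KdV equation on the solitary wave using centered finite differences. It assumes the normalization $\partial_t u = 6u\,\partial_x u + \partial_x^3 u$ (the form appearing on the left-hand side of [15]); the step size `h` and all function names are illustrative choices, not part of the original text.

```python
import math

def soliton(x, t, A):
    """Solitary wave u = 2 A^2 sech^2(A (x + 4 A^2 t))."""
    return 2.0 * A**2 / math.cosh(A * (x + 4.0 * A**2 * t))**2

def kdv_residual(x, t, A, h=1e-3):
    """Finite-difference value of u_t - 6 u u_x - u_xxx at (x, t)."""
    u = lambda xx, tt: soliton(xx, tt, A)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    # Standard five-point centered stencil for the third derivative.
    u_xxx = (u(x + 2*h, t) - 2*u(x + h, t)
             + 2*u(x - h, t) - u(x - 2*h, t)) / (2 * h**3)
    return u_t - 6.0 * u(x, t) * u_x - u_xxx

# Amplitude, speed, and a width scale read off from the formula.
def amplitude(A): return 2.0 * A**2
def speed(A): return 4.0 * A**2
def width(A): return 1.0 / A

# Largest residual on a grid of points at time t = 0.3 for A = 1.
max_residual = max(abs(kdv_residual(0.1 * k, 0.3, 1.0)) for k in range(-50, 51))
```

The residual is small compared with the individual terms of the equation, and the three helper functions make the scaling explicit: increasing A raises the amplitude and speed while shrinking the width.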

In a famous numerical study, Zabusky and 
Kruskal made a remarkable discovery. They con- 
sidered solutions of the KdV equation in which a 
solitary wave of large amplitude overtook one of 
smaller amplitude. They found that after a highly 
nonlinear interaction the two solitary waves re- 
emerged with their original amplitudes and speeds 
and the only reminder of their interaction was a 
phase shift in their relative positions. Their discov- 
ery began a search for a mathematical explanation 
of this remarkable “nonlinear superposition princi- 
ple" which culminated with the solution of the KdV 


equation via the method of inverse scattering and 
the identification of the KdV equation as an infinite- 
dimensional, completely integrable Hamiltonian 
system. 

We begin by describing how a transformation 
discovered by Miura (1968) and then generalized by 
Gardner et al. (1974) leads very easily to the 
conclusion that there are infinitely many conserved 
quantities for the KdV equation. The basic idea is 
that given a transformation which maps solutions of 
one equation to solutions of a second, the existence 
of simple or “obvious” conserved quantities for the 
first equation may lead, via the transformation, to 
more complicated conserved quantities for the 
second. 

Given u=u(x,t), define w(x,t) implicitly via the 
formula 


$$u(x,t) = w(x,t) + i\varepsilon\,\partial_x w(x,t) + \varepsilon^2 (w(x,t))^2 \qquad [13]$$


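Note that if w is smooth enough and $\varepsilon$ is small, we
can invert this relation recursively to obtain w in
terms of u via the formula

$$\begin{aligned} w = {} & u - i\varepsilon\,\partial_x u - \varepsilon^2\left(u^2 + \partial_x^2 u\right) + i\varepsilon^3\left(\partial_x^3 u + 4u\,\partial_x u\right) \\ & + \varepsilon^4\left(2u^3 + 5(\partial_x u)^2 + 6u\,\partial_x^2 u + \partial_x^4 u\right) + O(\varepsilon^5) \qquad [14] \end{aligned}$$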
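
The recursion behind this expansion can be made explicit. Rewriting [13] as $w = u - i\varepsilon\,\partial_x w - \varepsilon^2 w^2$ and inserting the ansatz $w = \sum_n \varepsilon^n w_n$ (the coefficient functions $w_n$ are introduced here only for illustration) gives $w_n = -i\,\partial_x w_{n-1} - \sum_{k=0}^{n-2} w_k w_{n-2-k}$, so that order by order:

```latex
\begin{aligned}
w_0 &= u, \\
w_1 &= -i\,\partial_x w_0 = -i\,\partial_x u, \\
w_2 &= -i\,\partial_x w_1 - w_0^2 = -\left(u^2 + \partial_x^2 u\right), \\
w_3 &= -i\,\partial_x w_2 - 2 w_0 w_1 = i\left(\partial_x^3 u + 4u\,\partial_x u\right), \\
w_4 &= -i\,\partial_x w_3 - \left(2 w_0 w_2 + w_1^2\right)
     = 2u^3 + 5(\partial_x u)^2 + 6u\,\partial_x^2 u + \partial_x^4 u .
\end{aligned}
```

Each step uses only the previously computed coefficients, which is why the inversion is possible whenever w is smooth enough for the derivatives to make sense.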


Now compute 


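$$\begin{aligned} \partial_t u - \partial_x^3 u - 6u\,\partial_x u = {} & \left\{\partial_t w - 6w\,\partial_x w - 6\varepsilon^2 w^2\,\partial_x w - \partial_x^3 w\right\} \\ & + 2\varepsilon^2 w \left\{\partial_t w - 6w\,\partial_x w - 6\varepsilon^2 w^2\,\partial_x w - \partial_x^3 w\right\} \\ & + i\varepsilon\,\partial_x \left\{\partial_t w - 6w\,\partial_x w - 6\varepsilon^2 w^2\,\partial_x w - \partial_x^3 w\right\} \qquad [15] \end{aligned}$$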


From this we see immediately that if w satisfies the 
modified KdV equation 


$$\partial_t w = 6\left(w\,\partial_x w + \varepsilon^2 w^2\,\partial_x w\right) + \partial_x^3 w \qquad [16]$$


then u, defined by [13], satisfies the KdV equation.
However, one also sees immediately that the integral
of w is a conserved quantity of [16] for all values of
$\varepsilon$; that is, if we define $I_\varepsilon(t) = \int w(x,t)\,dx$, then $I_\varepsilon$ is a
constant for all values of $\varepsilon$. (We will assume here
that w is defined on the real line, and that w and its
derivatives go to zero as |x| tends to infinity. Similar
results hold for x running over a finite interval with
periodic boundary conditions.) But this in turn
immediately implies that if we use [14] to expand
$I_\varepsilon$ in powers of $\varepsilon$, the coefficients in this expansion
must also be constants in time. Since these coefficients
will be expressed as integrals of u and its
derivatives, they will give us (infinitely many)
conserved quantities for the KdV equation! Looking
at the first few of these we find:


1. $K_0 = \int u(x,t)\,dx$. The conservation of this quantity
follows immediately from the form of the
KdV equation.

2. $K_1 = \int \partial_x u(x,t)\,dx = 0$, if we assume that u and
its derivatives tend to zero as |x| tends to infinity.
Thus, we gain no new information from this
quantity and in fact, all the integrals coming
from the odd powers of $\varepsilon$ turn out to be “trivial”
so we ignore them and focus just on the even
powers of $\varepsilon$.

3. $K_2 = \int (u^2 + \partial_x^2 u)\,dx = \int u^2\,dx$. That this is a conserved
quantity is again easy to see directly from
the KdV equation, just by multiplying the
equation by u and integrating with respect to x.

4. $K_4 = \int (2u^3 + 5(\partial_x u)^2 + 6u\,\partial_x^2 u + \partial_x^4 u)\,dx = \int (2u^3 - (\partial_x u)^2)\,dx$.
The origin of this integral is not so obvious
and we comment further on its meaning below.
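
The two simplifications used in items 3 and 4 are short integrations by parts. The sketch below assumes the KdV equation in the form $\partial_t u = 6u\,\partial_x u + \partial_x^3 u$ (the form on the left-hand side of [15]) and that u and its derivatives vanish as $|x| \to \infty$:

```latex
\begin{aligned}
\frac{d}{dt}\int u^2\,dx
  &= 2\int u\left(6u\,\partial_x u + \partial_x^3 u\right)dx
   = \int \partial_x\!\left(4u^3 + 2u\,\partial_x^2 u - (\partial_x u)^2\right)dx = 0, \\[4pt]
\int \left(5(\partial_x u)^2 + 6u\,\partial_x^2 u + \partial_x^4 u\right)dx
  &= \int \left(5(\partial_x u)^2 - 6(\partial_x u)^2\right)dx
   = -\int (\partial_x u)^2\,dx .
\end{aligned}
```

The first line shows why the integral of $u^2$ is conserved; the second shows how the fourth-order coefficient in [14] reduces to the compact two-term form.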


Clearly by continuing this procedure we can generate 
an infinite number of conserved quantities for the KdV 
equation. Indeed, if one chose another conserved 
quantity for the modified KdV equation, [16], say 
$\int w^2(x,t)\,dx$, one could generate another sequence of
conserved quantities via this same procedure. However,
Kruskal, Miura, Gardner, and Zabusky proved
that in fact, all of the conserved quantities that can be
written as polynomials in u and its derivatives are
already obtained by the procedure above. 

The constant of the motion $K_4$ found above is of
particular interest because one can write the KdV
equation as

$$\partial_t u = \frac{1}{2}\,\partial_x\!\left(\frac{\delta K_4}{\delta u}\right) \qquad [17]$$

where $\delta K_4/\delta u$ denotes the variational derivative of $K_4$
with respect to u(x). One can interpret this equation
as a Hamiltonian system where $\partial_x$ defines the
(nonstandard) symplectic structure and remarkably,
Zakharov and Faddeev (1971) proved that the KdV
equation is actually a completely integrable Hamil- 
tonian system. In particular, there exists a canonical 
transformation such that with respect to the new 
coordinates the Hamiltonian is a function only of 
the action variables (and hence in particular, the 
action variables remain constant in time). The 
transform which brings the Hamiltonian into its 
action-angle form is known as the inverse spectral 
transform and its details would take us beyond the 
limits of this article. However, very briefly, by
observing that the Miura transformation [13]
defines a Riccati differential equation, and using
the transformation that converts the Riccati
equation to a linear ordinary differential equation,
one can relate the solution of the KdV equation to
an eigenvalue problem for a linear Schrödinger
operator. The potential term in the Schrödinger
operator is given by the solution u(x,t) of the KdV
equation. Remarkably, it turns out that the eigenvalues
of this Schrödinger operator are constants of
the motion if u is a solution of the KdV equation
and are very closely related to the action variables
for the Hamiltonian system. For more details on the
inverse-scattering method and its use in solving the
KdV equation we refer the reader to the monographs
of Ablowitz and Segur (1981), Newell
(1985), or the recent book by Kappeler and Pöschel
(2003), which develops the theory for the KdV
equation on a finite interval with periodic boundary
conditions in a particularly elegant fashion.
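
To make the Riccati remark concrete, here is one way the linearization can be carried out. This is a sketch under assumptions: the substitution $w = (i/\varepsilon)\,\partial_x\psi/\psi$ and the gauge factor $e^{ix/(2\varepsilon)}$ are choices made here for illustration, and the auxiliary functions $\psi$ and $\varphi$ do not appear in the original text.

```latex
\begin{aligned}
u &= w + i\varepsilon\,\partial_x w + \varepsilon^2 w^2,
  \qquad w = \frac{i}{\varepsilon}\,\frac{\partial_x \psi}{\psi} \\
\Longrightarrow\quad
u &= \frac{i}{\varepsilon}\,\frac{\partial_x\psi}{\psi} - \frac{\partial_x^2\psi}{\psi}
  \quad\Longleftrightarrow\quad
  \partial_x^2\psi - \frac{i}{\varepsilon}\,\partial_x\psi + u\,\psi = 0, \\
\psi &= e^{ix/(2\varepsilon)}\,\varphi
  \quad\Longrightarrow\quad
  -\partial_x^2\varphi - u\,\varphi = \frac{1}{4\varepsilon^2}\,\varphi .
\end{aligned}
```

In this form u plays the role of (minus) the potential of a Schrödinger operator and the parameter $\varepsilon$ enters only through the spectral parameter $1/(4\varepsilon^2)$.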


Other Mathematical Aspects of the 
KdV Equation 


In addition to the inverse-scattering transform 
approach, more traditional approaches to the exis- 
tence and uniqueness of solutions have also been 
studied, starting with Temam's proof of the well- 
posedness of solutions of the KdV equation with 
periodic boundary conditions in the Sobolev space 
H^. Noting that the Hamiltonian for the 
KdV equation described in the preceding section 
is closely related to the $H^1$ norm, this might seem a
natural space in which to study well-posedness, but
surprisingly Kenig, Ponce, and Vega, and Bourgain
showed that the equation is also well posed in
Sobolev spaces $H^s$ with $s < 1$, and more recent
work has extended the global well-posedness results 
to Sobolev spaces of small negative order. Aside from 
their intrinsic interest, these results have other 
physical implications. If one wishes to study statis- 
tical aspects of the behavior of ensembles of solutions 
of these equations, statistical mechanics suggests that 
the natural invariant measure for these equations is 
given by the Gibbs’ measure. However, the Gibbs’ 
measure is typically supported on functions less 
regular than $H^1$, so that in order to define and
study this measure one needs to know that solutions 
of the equation are well behaved in such spaces. 
Another natural mathematical question arises 
from the fact that the KdV equation is only an 
approximation to the original physical equation. 
Viewed from another perspective, the original 
system can be seen as a perturbation of the KdV 
equation. It then becomes natural to ask whether the 
special features of the KdV equation are preserved 
under perturbation. Viewing the KdV equation as a 
completely integrable Hamiltonian system, this is
very analogous to the questions studied by the
Kolmogorov-Arnol’d-Moser (KAM) theory and
has led to a development of KAM-like results for a 
number of different partial differential equations 
like the KdV equation. The results are somewhat 
technical in nature but roughly speaking they say 
that if one considers the KdV equation with periodic 
boundary conditions, temporally periodic or quasi- 
periodic solutions will persist under small perturba- 
tions. The situation is more complicated and less 
well understood for the equation on the whole line 
due to the presence of a continuum of scattering 
states. For a very thorough review of the problem 
with periodic boundary conditions see Kappeler and
Pöschel (2003).


Other Modulation Equations 


As we stressed in its derivation, the KdV equation is 
an appropriate modulation equation for small- 
amplitude, long-wavelength solutions in dispersive 
nonlinear partial differential equations. However, as 
mentioned in the section “Derivation of the KdV 
equation” the method of multiple scales does not give 
a unique modulation equation even in this specific 
physical regime. Already in his original studies 
Boussinesq derived at least three different model 
equations for small-amplitude, long-wavelength 
water waves and a variety of such models continue 
to be studied today. For instance, an easy variation in 
the derivation of the KdV equation leads to the 
regularized long wave, or Benjamin-Bona-Mahony
equation, in which the $\partial_x^3 u$ term in the KdV equation
is replaced by the term $\partial_x^2 \partial_t u$. The validity of these
alternatives to the KdV equation can also be studied
with the aid of the methods described in the section
“Validity of the KdV approximation.”

There have been many discussions of which of these 
modulation equations is the “correct” one. While they
may all yield equivalent approximations to the original
physical problem, the KdV equation has at least two
advantages: it is independent of the expansion parameter
$\varepsilon$, and it is completely integrable. None of the
other equations that have been proposed as approx- 
imations to these small-amplitude, long-wavelength 
phenomena share both of these properties. 

If we think in terms of the Fourier transforms of 
the long-wavelength functions studied above they 
are solutions whose Fourier transform is concen- 
trated near zero. One can also ask about modulation 
equations for solutions whose Fourier transform is 
concentrated about nonzero wave numbers. Such 
solutions represent a wave train with some fixed 
underlying wavelength, $\lambda_0$, modulated on a much
longer length scale, $\lambda_0/\varepsilon$.


If we make the ansatz that the solution has the 
form 


$$u(x,t) \approx \varepsilon A\!\left(\varepsilon(x - c_g t), \varepsilon^2 t\right) e^{i(k_0 x - \omega_0 t)} + \text{complex conjugate} \qquad [18]$$


and insert this hypothesized form of the solution into 
the original equation, then under mild assumptions 
on the form and properties of the original equation, 
similar to those under which we derived the KdV 
equation in an earlier section we find that to the 
lowest, nontrivial order in $\varepsilon$, the amplitude A evolves
according to the nonlinear Schrödinger equation

$$-i\,\partial_T A = c_1\,\partial_X^2 A + c_2\,A|A|^2 \qquad [19]$$

If $c_1$ and $c_2$ are both real, the nonlinear Schrödinger
equation can also be solved via the inverse-scattering 
method and it represents another completely integr- 
able modulation equation. 

In this article, we have discussed modulation 
equations only for Hamiltonian, or conservative 
systems. However, similar equations have also played 
an important role in the study of dissipative 
equations like the Navier-Stokes equation. The 
most common modulation equation in that setting is
the Ginzburg-Landau equation, which can be derived
as a modulation equation for Taylor-Couette rolls or
for the convection rolls in the Rayleigh-Bénard
problem. Like the nonlinear Schrödinger equation,
the Ginzburg-Landau equation describes how slow
variations of the amplitude of an underlying periodic
pattern evolve, and as such it arises in a host of other
situations in addition to the fluid dynamics examples 
mentioned above. For an extensive review of the 
applications of the Ginzburg-Landau equation, as 
well as its mathematical properties and some special 
solutions, see the recent article of Mielke (2002). 


See also: Bi-Hamiltonian Methods in Soliton Theory; 
Central Manifolds, Normal Forms; Hamiltonian Fluid 
Dynamics; Infinite-Dimensional Hamiltonian Systems; 
Integrable Systems and the Inverse Scattering Method; 
Integrable Systems: Overview; KAM Theory and 
Celestial Mechanics; Multiscale Approaches; Partial 
Differential Equations: Some Examples; WDVV 
Equations and Frobenius manifolds.

K-theory was invented in the category of algebraic
vector bundles over algebraic varieties by 
A Grothendieck, who was directly motivated by 
the Hirzebruch-Riemann-Roch theorem which he 
subsequently greatly generalized. He also defined 
K-homology in terms of coherent sheaves and 
established the basic properties of K-theory 
and K-homology including Poincaré duality for 
nonsingular varieties. The origin for the choice of 
the letter K in K-theory was apparently the German 
word “Klasse.” 

Using the formalism of Grothendieck, M F Atiyah
and F Hirzebruch (cf. Karoubi 1978) developed
topological K-theory in the category of topological
(complex) vector bundles over topological spaces. It
is this theory that will be the first principal focus of
this article. A topological (complex) vector bundle
over a compact topological space X is a topological
space E together with a continuous map $p: E \to X$
that is onto, such that $p^{-1}(x)$ is a vector space that is
isomorphic to $\mathbb{C}^n$ for all $x \in X$, and there is an open
cover $\{U\}$ of X together with homeomorphisms
$h_U: p^{-1}(U) \to U \times \mathbb{C}^n$ called “local trivializations”
with the property that $h_U \circ h_V^{-1}: (U \cap V) \times \mathbb{C}^n \to (U \cap V) \times \mathbb{C}^n$
is of the form $(\mathrm{Id}, g_{UV})$, where $g_{UV}: U \cap V \to GL(n, \mathbb{C})$
are continuous maps satisfying the
following cocycle condition on triple overlaps,
$g_{UV}\, g_{VW}\, g_{WU} = 1$. $X \times \mathbb{C}^n$ is called the trivial vector
bundle. Two vector bundles $p: E \to X$ and $q: F \to X$
over X are said to be isomorphic if there is a
homeomorphism $\phi: E \to F$ with the property that
$p = q \circ \phi$, and which is a linear isomorphism when
restricted to each fiber. The direct sum and tensor
product of vector spaces carry over to vector
bundles. There are canonical isomorphisms $E \oplus F \cong F \oplus E$
and $E \otimes F \cong F \otimes E$, making the set Vect(X) of
isomorphism classes of complex vector bundles over
X into a commutative semiring. Vect(X) can be
made into the commutative ring $K^0(X)$ as follows.
$K^0(X)$ is generated by pairs ([E], [F]), together with
the relation $([E], [F]) = ([E'], [F'])$ if $E \oplus F' \oplus G \cong E' \oplus F \oplus G$
for some $[G] \in \mathrm{Vect}(X)$. Also $K^1(X)$ is
defined to be the group of homotopy classes of
continuous maps from X to the infinite unitary
group. Around the same time, R Bott proved his
celebrated periodicity theorem, which says that the
odd homotopy groups of the (infinite) unitary group
are the integers, whereas the even homotopy groups
are all trivial. Incorporating Bott’s periodicity
theorem for the unitary group into K-theory, Atiyah
and Hirzebruch proved that topological K-theory
$K^*(X) = K^0(X) \oplus K^1(X)$ is a periodic generalized
cohomology theory, and in what follows, the
notation $K^n(X)$ means n modulo 2. If M is not
compact, then we can compactify M by adding to it
a point + “at infinity,” and denote it by $M^+$. Let
$\iota: + \to M^+$ be the inclusion, inducing the pullback
map $\iota^!: K^*(M^+) \to K^*(+) = \mathbb{Z}$. Then $K^*(M)$ is defined
to be $\ker(\iota^!)$, also called the reduced K-theory. If $X_1$
is a closed subset of X, the K-theory of the pair
$(X, X_1)$ is defined as the reduced K-theory of the
quotient space $X/X_1$. A fundamental computation
of Bott is the computation of the K-theory of
Euclidean space, $K^n(\mathbb{R}^n) \cong \mathbb{Z}$ with canonical generator
called the Bott class $b \in K^n(\mathbb{R}^n)$, and
$K^{n-1}(\mathbb{R}^n) = \{0\}$.
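
The passage from the semiring Vect(X) to $K^0(X)$ described above is an instance of the Grothendieck completion of a commutative monoid. The following minimal sketch illustrates the defining relation on pairs; the function name `equivalent`, the finite search range `small`, and the two toy monoids are assumptions chosen for illustration. The first monoid, the natural numbers under addition, models ranks of vector bundles over a point.

```python
def equivalent(p, q, add, candidates):
    """([E],[F]) ~ ([E'],[F']) iff E + F' + G == E' + F + G for some G."""
    (e, f), (e2, f2) = p, q
    return any(add(add(e, f2), g) == add(add(e2, f), g) for g in candidates)

small = range(0, 5)  # finite search range for the auxiliary element G

# Monoid (N, +): the pair (a, b) behaves like the integer a - b.
nat_add = lambda a, b: a + b
same_nat = equivalent((3, 1), (5, 3), nat_add, small)   # 3 - 1 == 5 - 3
diff_nat = equivalent((3, 1), (4, 3), nat_add, small)   # 2 != 1

# Monoid (N, max): not cancellative, so the auxiliary G really matters.
same_max = equivalent((0, 0), (4, 0), max, small)       # G = 4 identifies them
```

In the cancellative case the auxiliary element G is unnecessary, but without cancellation (as can happen for vector bundles, where $E \oplus G \cong E' \oplus G$ does not force $E \cong E'$) it is needed for the relation to be transitive.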

Some of the basic properties of K-theory are listed 
as follows. Details can be found in Karoubi (1978). 


1. Pullback If $f: N \to M$ is a continuous map, then
given a vector bundle $\pi: E \to M$ over M, the
pullback vector bundle is defined as $f^*(E) = \{(x, v) \in N \times E : f(x) = \pi(v)\}$
over N. This induces a pullback
homomorphism, $f^!: K^*(M) \to K^*(N)$.

2. Push-forward Let $f: N \to M$ be a smooth proper
map between compact manifolds which is
K-oriented, that is, $TN \oplus f^*TM$ is a spin$^c$ vector
bundle over N. Then there is a pushforward
homomorphism, also called a Gysin map,
$f_!: K^*(N) \to K^{*+d}(M)$, where $d = \dim M - \dim N$,
whose construction will be explained in the next
section.

3. Homotopy If $f: N \to M$ and $g: N \to M$ are
homotopic maps, then the pullback maps $f^! = g^!$
are equal. If in addition, f and g are K-oriented,
proper maps which are homotopic via proper
maps, then the Gysin maps $f_! = g_!$ are equal.

4. Excision Let $M_1$ be a closed subset of M and U
be an open subset of M such that U is contained
in the interior of $M_1$. Then the inclusion of pairs
$(M \setminus U, M_1 \setminus U) \to (M, M_1)$ induces an isomorphism
in K-theory, $K^*(M, M_1) \cong K^*(M \setminus U, M_1 \setminus U)$.

5. Exactness Let $M_1$ be a closed subset of M. Then
there is a six-term exact sequence in K-theory,

$$\begin{array}{ccccc} K^0(M, M_1) & \longrightarrow & K^0(M) & \longrightarrow & K^0(M_1) \\ \uparrow & & & & \downarrow \\ K^1(M_1) & \longleftarrow & K^1(M) & \longleftarrow & K^1(M, M_1) \end{array}$$

6. Cup product There is a canonical map given by
external tensor product, $K^i(M) \otimes K^j(N) \to K^{i+j}(M \times N)$.
When N = M, one can compose this
with the homomorphism induced by the diagonal
map $M \to M \times M$ given by $x \mapsto (x, x)$, to get a cup
product, $K^i(M) \otimes K^j(M) \to K^{i+j}(M)$.

7. Bott periodicity This is arguably the most important
property of K-theory. It says that the zero-section
embedding $\iota^M: M \to M \times \mathbb{R}^n$ induces a
Gysin isomorphism, $\iota^M_!: K^*(M) \cong K^{*+n}(M \times \mathbb{R}^n)$,
which is given as follows. Let $\pi_M: M \times \mathbb{R}^n \to M$
and $\pi_{\mathbb{R}^n}: M \times \mathbb{R}^n \to \mathbb{R}^n$ denote the projections
onto the factors, and $b = \iota_! 1 \in K^n(\mathbb{R}^n)$ the Bott
element, where $\iota: \{0\} \to \mathbb{R}^n$ is the inclusion of the
origin. Then the Bott periodicity isomorphism is
given by $\iota^M_!(x) = \pi_M^!(x) \cup \pi_{\mathbb{R}^n}^!(b) \in K^{*+n}(M \times \mathbb{R}^n)$
for all $x \in K^*(M)$.


Using the fact that any vector bundle over a 
contractible space is trivial, together with Bott's 
periodicity theorem, one deduces the calculation 
of the K-theory of spheres. The calculation for the
odd-dimensional spheres gives $K^0(S^{2n-1}) \cong \mathbb{Z} \cong K^1(S^{2n-1})$,
and for the even-dimensional spheres
$K^0(S^{2n}) \cong \mathbb{Z}^2$ and $K^1(S^{2n}) \cong \{0\}$, for all $n \ge 1$.

There is a natural homomorphism of rings called
the Chern character, $\mathrm{Ch}: K^*(X) \to H^*(X, \mathbb{Q})$, which
is characterized by the following axioms:


1. Naturality If $f: N \to M$ is a smooth map, and if
E is a vector bundle over M, then $\mathrm{Ch}(f^!(E)) = f^*(\mathrm{Ch}(E))$.

2. Additivity $\mathrm{Ch}(E \oplus F) = \mathrm{Ch}(E) + \mathrm{Ch}(F)$.

3. Normalization If L is the canonical line bundle
over $\mathbb{CP}^n$ which restricts to the Hopf line bundle
over $\mathbb{CP}^1$, then $\mathrm{Ch}(L) = \exp(x)$, where x is the
generator of $H^2(\mathbb{CP}^n, \mathbb{Z}) \cong \mathbb{Z}$.


Atiyah and Hirzebruch, cf. Karoubi (1978), also
proved that the Chern character induces an isomorphism
of the rings $K^*(X) \otimes \mathbb{Q}$ and $H^*(X, \mathbb{Q})$. The
Chern-Weil representative of the Chern character is
$\mathrm{tr}(\exp((i/2\pi)\Omega_E))$, where $\Omega_E$ is the curvature of a
Hermitian connection on E.

There are many variants of K-theory, such as 
KO-theory, where the unitary group is replaced 
by the orthogonal group, which is periodic of 
order eight, and G-equivariant K-theory, where G 
is a compact Lie group. K-theory and its variants 
have many interesting applications such as deter- 
mining the maximum number of linearly inde- 
pendent vector fields on spheres, which is due to 
Adams, cf. Karoubi (1978). We will content 
ourselves with the description of two important 
applications. 




Grothendieck-Riemann-Roch Theorem 
for Smooth Manifolds 


Recall that an oriented real vector bundle E over M is
said to be a spin$^c$ vector bundle if the bundle of
oriented frames on E, SO(E), has a circle bundle
$\mathrm{Spin}^c(E)$ such that the restriction to each fiber yields
the central extension $0 \to U(1) \to \mathrm{Spin}^c(n) \to SO(n) \to 0$
that defines the group $\mathrm{Spin}^c(n)$, where n
is the rank of E. It turns out that the obstruction to the
existence of a spin$^c$ structure on E is the third integral
Stiefel-Whitney class of E, $W_3(E) \in H^3(M, \mathbb{Z})$.
A generalization of Bott periodicity is the Thom
isomorphism in K-theory. It says that if $\pi: E \to M$ is
a rank-n spin$^c$ vector bundle over M, then the zero-section
embedding $\iota^M: M \to E$ induces a Gysin isomorphism,
$\iota^M_!: K^*(M) \cong K^{*+n}(E)$, which is given as
follows. There is a canonical element $\iota^M_! 1 \in K^n(E)$
called the Thom class in K-theory, which is characterized
by the property that $\iota^M_! 1$ restricts to give the Bott
class on each fiber. Then the Thom isomorphism in
K-theory is given by $\iota^M_!(x) = \pi^!(x) \cup \iota^M_! 1 \in K^{*+n}(E)$ for
all $x \in K^*(M)$. For canonical representatives of the
Thom class, cf. Mathai-Quillen Formalism, or Mathai
and Quillen (1986).

Recall the definition of the Gysin map for smooth
embeddings. Let X be a smooth, compact manifold,
and Y a smooth manifold. Let $h: X \to Y$ be a smooth
embedding that is K-oriented. Since $TX \oplus TX$ has a
canonical almost-complex structure, it follows that
the normal bundle $N_Y X = h^*(TY)/TX$ is a spin$^c$
vector bundle. If $\iota^X: X \to N_Y X$ is the zero-section
embedding, then we have the Thom isomorphism
$\iota^X_!: K^*(X) \cong K^{*+n}(N_Y X)$, where $n = \dim(Y) - \dim(X)$
is the codimension of the embedding. Upon choosing a
Riemannian metric on Y, there is a diffeomorphism $\Phi$
from a tubular neighborhood U of h(X) onto a
neighborhood of the zero section in the normal bundle
$N_Y X$. That is, $\Phi^!: K^*(N_Y X) \cong K^*(U)$. For any open
subset $j: U \to Y$, the extension by zero defines a
homomorphism $j_!: K^*(U) \to K^*(Y)$. Then the Gysin
map of the embedding h is defined as $h_! = j_! \circ \Phi^! \circ \iota^X_!: K^*(X) \to K^{*+n}(Y)$,
which turns out to be independent
of the choices made.

Next recall the definition of the Gysin map for
smooth submersions. Let $\pi: Y \to Z$ be a smooth
submersion of smooth manifolds, which is K-oriented
and a proper map. Since every smooth
compact manifold can be smoothly embedded in
$\mathbb{R}^{2q}$ for q sufficiently large, a parametrized version
yields an embedding $k: Y \to Z \times \mathbb{R}^{2q}$ that is spin$^c$.
Therefore the Gysin map is a homomorphism
$k_!: K^*(Y) \to K^{*+a}(Z \times \mathbb{R}^{2q})$, where $a = \dim(Z) + 2q - \dim(Y)$.
Let $\iota^Z: Z \to Z \times \mathbb{R}^{2q}$ denote the zero-section
embedding. Then we have the Thom isomorphism
$\iota^Z_!: K^*(Z) \cong K^{*+2q}(Z \times \mathbb{R}^{2q})$. Then the Gysin map
of the submersion $\pi$ is defined as $\pi_! = (\iota^Z_!)^{-1} \circ k_!: K^*(Y) \to K^{*+b}(Z)$,
where $b = \dim(Y) - \dim(Z)$, and
turns out to be independent of the choices made.

Let $f: N \to M$ be a smooth proper map that is
K-oriented. Then f can be canonically factored, first
into the smooth embedding $\mathrm{gr}(f): N \to N \times M$,
which is the graph of the function, that is,
$\mathrm{gr}(f)(x) = (x, f(x))$, and which is K-oriented. The
Gysin map is $\mathrm{gr}(f)_!: K^*(N) \to K^{*+\dim(M)}(N \times M)$.
Second, the projection $p_M: N \times M \to M$ is a
K-oriented proper submersion, when restricted to
the image of gr(f). The Gysin map is $(p_M)_!: K^*(N \times M) \to K^{*+b}(M)$,
where $b = \dim(N)$. The Gysin map
of f is defined as $f_! = (p_M)_! \circ \mathrm{gr}(f)_!: K^*(N) \to K^{*+d}(M)$,
where $d = \dim(M) + \dim(N)$.

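Given such a smooth proper map $f: N \to M$ that
is K-oriented, there are Gysin maps in
cohomology, $f_*: H^*(N, \mathbb{Q}) \to H^{*+d}(M, \mathbb{Q})$ (where
we consider the $\mathbb{Z}_2$-grading given by even and odd
degree), and in K-theory, $f_!: K^*(N) \to K^{*+d}(M)$,
which increases the degree by $d = \dim(M) + \dim(N)$.
The Grothendieck-Riemann-Roch theorem
due to Atiyah and Hirzebruch, cf. Karoubi 1978, in
the smooth category can be phrased as the commutativity
of the diagram,

$$\begin{array}{ccc} K^*(N) & \xrightarrow{\;f_!\;} & K^{*+d}(M) \\ \big\downarrow{\scriptstyle \mathrm{Todd}(TN)\,\cup\,\mathrm{Ch}} & & \big\downarrow{\scriptstyle \mathrm{Todd}(TM)\,\cup\,\mathrm{Ch}} \\ H^*(N, \mathbb{Q}) & \xrightarrow{\;f_*\;} & H^{*+d}(M, \mathbb{Q}) \end{array}$$

That is,

$$\mathrm{Ch}(f_!(\xi)) \cup \mathrm{Todd}(TM) = f_*\left(\mathrm{Ch}(\xi) \cup \mathrm{Todd}(TN)\right)$$

for all $\xi \in K^*(N)$, where Todd(E) is the Todd genus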


characteristic class of a Hermitian vector bundle E
over M. The Chern-Weil representative of the Todd
genus is

$$\det\left(\frac{(i/2\pi)\,\Omega_E}{1 - e^{-(i/2\pi)\,\Omega_E}}\right)$$

where $\Omega_E$ is the curvature of a Hermitian connection
on E. There are many useful variants of this 
beautiful formula. 
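
For a line bundle with first Chern class x, the Todd class is given by the power series $x/(1 - e^{-x})$; this is the standard normalization and is assumed here rather than taken from the text above. The sketch below computes its first Taylor coefficients exactly by inverting the series for $(1 - e^{-x})/x$. The function name `todd_coefficients` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial

def todd_coefficients(order):
    """First `order` Taylor coefficients of x / (1 - exp(-x)) about x = 0."""
    # (1 - exp(-x)) / x = sum_k c_k x^k with c_k = (-1)^k / (k + 1)!
    c = [Fraction((-1) ** k, factorial(k + 1)) for k in range(order)]
    t = []
    for n in range(order):
        if n == 0:
            t.append(Fraction(1))  # the series c starts with c_0 = 1
        else:
            # Coefficient of x^n in (sum c_k x^k)(sum t_k x^k) must vanish.
            t.append(-sum(c[k] * t[n - k] for k in range(1, n + 1)))
    return t

coeffs = todd_coefficients(5)
```

The resulting list corresponds to the expansion $1 + x/2 + x^2/12 - x^4/720 + \cdots$, whose low-order terms are the ones that enter most Riemann-Roch computations on low-dimensional manifolds.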


The Atiyah-Singer Index Theorem 


The 2004 Abel Prize citation mentions the Atiyah- 
Singer (1971) index theorem as being one of the 
greatest achievements of twentieth-century mathe- 
matics. It has stimulated considerable interaction 
between mathematicians and mathematical physi- 
cists. We content ourselves here with a rudimentary 
description of the results. 

Let F be the space of all Fredholm operators on 
an infinite-dimensional complex Hilbert space H. 
Recall that an operator A is said to be Fredholm if 
both the kernel and cokernel of A are finite
dimensional. The index of such a Fredholm operator
is $\mathrm{index}(A) = \dim(\ker(A)) - \dim(\mathrm{coker}(A)) \in \mathbb{Z}$. The
index map is continuous, so it induces a map on the 
connected components of F, which turns out to be 
an isomorphism. 
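
A finite-dimensional analogue may help fix ideas. For a linear map between finite-dimensional spaces, given by a matrix with m rows and n columns, the index equals n - m no matter which matrix is chosen, which is the simplest instance of the stability of the index. The sketch below checks this with exact rational arithmetic; the helper names `rank` and `index` are illustrative and the example is not an infinite-dimensional Fredholm computation.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def index(matrix):
    """dim ker - dim coker for the map x -> matrix * x."""
    m, n = len(matrix), len(matrix[0])
    rk = rank(matrix)
    return (n - rk) - (m - rk)

# Three very different 2 x 3 matrices, all with the same index.
indices = [index([[1, 0, 0], [0, 0, 0]]),
           index([[1, 2, 3], [4, 5, 6]]),
           index([[0, 0, 0], [0, 0, 0]])]
```

In infinite dimensions the index is no longer forced by the dimensions of the spaces; for example, the unilateral shift on square-summable sequences has trivial kernel and one-dimensional cokernel, hence index -1.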


K-theory is naturally related to the space of all 
Fredholm operators F endowed with the norm 
topology. Any continuous map $A: X \to F$ from a
compact space to F has an index in $K^0(X)$, which
is given by $\mathrm{index}(A) = \ker(A) - \mathrm{coker}(A)$ in the
special case when $\dim(\ker(A(x)))$ is constant in $x \in X$.
In general, one uses the fact that the index is
stable under compact perturbation, and shows that
one can always achieve the special case after a
compact perturbation. It is again the case that the
index map is continuous, and so induces a map,
$\mathrm{index}: [X, F] \to K^0(X)$, which turns out to be an
isomorphism, thanks to a fundamental theorem 
of Kuiper which proves that the group of all 
invertible operators on an  infinite-dimensional 
complex Hilbert space is contractible in the norm 
topology. 

Now let $\pi: N \to Z$ be a fiber bundle with typical
fiber a smooth compact manifold M, where N and Z
are also smooth compact manifolds. Consider a
smooth family of elliptic operators $D = \{D_z\}_{z \in Z}$
along the fibers of $\pi$, parametrized by Z, where
$D_z: C^\infty(\pi^{-1}(z), E|_{\pi^{-1}(z)}) \to C^\infty(\pi^{-1}(z), F|_{\pi^{-1}(z)})$ and
E, F are vector bundles over N. Such a family of
elliptic operators has a symbol


$$\sigma(D): p^*(E) \to p^*(F)$$


where $p: T^*(N/Z) \to N$ is the projection and
$T^*(N/Z)$ is the vertical cotangent bundle. Ellipticity
for the family is the condition that $\sigma(D)$ is an
isomorphism outside the zero section, so that the
triple $(p^*(E), p^*(F), \sigma(D))$ determines an element in
$K^0(T^*(N/Z))$ denoted by $\sigma(D)$.

The analytic index of the family D is $\mathrm{index}(D) \in K^0(Z)$,
and it turns out that it only depends on the
class of the symbol $\sigma(D) \in K^0(T^*(N/Z))$, so the
analytic index can be viewed as a homomorphism,


$$\mathrm{index}: K^0(T^*(N/Z)) \to K^0(Z)$$


Consider an embedding $\iota: N \to Z \times \mathbb{R}^n$ that is
compatible with the projection $\pi: N \to Z$. The
fiberwise differential is an embedding $d\iota: T(N/Z) \to Z \times \mathbb{R}^{2n}$,
which induces a Gysin map


$$d\iota_!: K^0(T(N/Z)) \to K^0(Z \times \mathbb{R}^{2n})$$


upon identifying $T^*(N/Z)$ with $T(N/Z)$. Let
$j: Z \to Z \times \mathbb{R}^{2n}$ be the inclusion $j(z) = (z, 0)$. It
induces the Bott isomorphism $j_!: K^0(Z) \cong K^0(Z \times \mathbb{R}^{2n})$.
The topological index of the family D is, by
definition,


$$\mathrm{index}_t = j_!^{-1} \circ d\iota_!: K^0(T^*(N/Z)) \to K^0(Z)$$


The Atiyah-Singer (1971) index theorem
for families of elliptic operators D asserts the
equality of the analytic index and the topological
index,


$$\mathrm{index}(D) = \mathrm{index}_t(\sigma(D)) \in K^0(Z)$$


Combined with the Grothendieck-Riemann-Roch
theorem, one has the following exquisite formula in
$H^*(Z, \mathbb{Q})$:

$$\mathrm{Ch}(\mathrm{index}(D)) = \pi_* p_* \left\{\mathrm{Todd}(T_{\mathbb{C}}(N/Z)) \cup \mathrm{Ch}(\sigma(D))\right\}$$

where $p: T^*(N/Z) \to N$ is the projection.

The map sending a complex vector bundle E over
Z to its determinant line bundle $\det(E) = \Lambda^{\max} E$
induces a homomorphism, $\det: K^0(Z) \to \pi_0(\mathrm{Pic}(Z))$,
where $\pi_0(\mathrm{Pic}(Z))$ denotes the isomorphism classes of
complex line bundles over Z. Then

$$c_1(\det(\mathrm{index}(D))) = \left\{\pi_* p_* \left\{\mathrm{Todd}(T_{\mathbb{C}}(N/Z)) \cup \mathrm{Ch}(\sigma(D))\right\}\right\}^{[2]}$$

where $\{\cdot\}^{[2]}$ denotes the degree-2 component, and the
left-hand side denotes the first Chern class of the
determinant line bundle of the index class. This
formula is often used in the study of anomalies in
physics.


K-Theory of C*-Algebras 


The Gelfand-Naimark theorem asserts that unital 
abelian C*-algebras A can be identified with the 
space of continuous functions C(X), where X is the 
compact Hausdorff space known as the spectrum of 
A, consisting of characters of A. Conversely, given a 
compact Hausdorff space X, the characters of C(X) 
consist of the evaluation maps at points of X. 

Let E be a vector bundle over X. Then there is a
vector bundle F over X such that $E \oplus F \cong X \times \mathbb{C}^n$.
Setting $A = C(X)$, $\mathcal{M} = C(X, E)$, $\mathcal{N} = C(X, F)$, we
see that $\mathcal{M} \oplus \mathcal{N} \cong A^n$, showing that each vector
bundle E over X determines a canonical finite
projective module $\mathcal{M}$ over A. The converse is also
true and is a result of Serre and Swan, cf. Blackadar 
(1986), which asserts that every finite projective 
module M over A is the space of all continuous 
sections of a vector bundle over X. So we have an 
equivalence of the category of vector bundles over X 
and the category of finite projective modules over A. 

This motivates the following generalization of
topological K-theory for a general unital C*-algebra
A. Let Proj(A) denote the isomorphism classes of
finite projective modules over A. It is a commutative
semigroup under the operation of direct sum, which
can be made into the commutative group $K_0(A)$ as
follows: $K_0(A)$ is generated by pairs $([\mathcal{M}], [\mathcal{N}])$,
together with the relation $([\mathcal{M}], [\mathcal{N}]) = ([\mathcal{M}'], [\mathcal{N}'])$
if $\mathcal{M} \oplus \mathcal{N}' \oplus \mathcal{G} \cong \mathcal{M}' \oplus \mathcal{N} \oplus \mathcal{G}$ for some $[\mathcal{G}] \in \mathrm{Proj}(A)$.
Also $K_1(A) = \pi_0(GL(\infty, A))$, where $GL(\infty, A)$
denotes the direct limit of $GL(n, A)$, where
$GL(n, A)$ embeds in $GL(n+1, A)$ as $1 \oplus GL(n, A)$.
Then, defining $K_j(A) = \pi_{j-1}(GL(\infty, A))$ for $j \ge 1$,
together with generalized Bott periodicity which
asserts that there is a canonical isomorphism
$\pi_{j+1}(GL(\infty, A)) \cong \pi_{j-1}(GL(\infty, A))$, we see that
$K_\bullet(A) = K_0(A) \oplus K_1(A)$ is a generalized periodic
cohomology theory. If A is a C*-algebra without
unit, then consider $A^+ = A \oplus \mathbb{C}$, with product given
by $(a, \lambda)(b, \mu) = (ab + a\mu + \lambda b, \lambda\mu)$, with unit (0, 1).
The projection $p: A^+ \to \mathbb{C}$ defined as $p(a, \lambda) = \lambda$
induces a map $p_*: K_\bullet(A^+) \to K_\bullet(\mathbb{C})$. In the nonunital
case, $K_\bullet(A)$ is defined as $\ker(p_*)$. Observe that
$K_1(A) = K_1(A^+)$, but this is often not the case with
$K_0$. It is easy to see that when A has a unit, then the
two definitions of $K_0$ agree. An important caveat in
the case of noncommutative C*-algebras is that the
K-theory is often not a ring as there is no analog of
the tensor product operation.
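
The product on the unitization can be checked directly on a toy example. The sketch below represents elements of $A \oplus \mathbb{C}$ as pairs and, purely for illustration, takes A to be the complex numbers themselves (which already has a unit, so this only exercises the algebraic formula); the names `mul`, `UNIT`, and `p` are assumptions, not notation from the text.

```python
def mul(x, y):
    """Product (a, lam)(b, mu) = (ab + a*mu + lam*b, lam*mu) on A (+) C."""
    (a, lam), (b, mu) = x, y
    return (a * b + a * mu + lam * b, lam * mu)

UNIT = (0, 1)  # the adjoined unit

def p(x):
    """Projection onto the scalar component."""
    return x[1]

x = (2 + 1j, 3)
y = (1 - 1j, -2)
z = (4j, 5)

xy = mul(x, y)
left_unit_ok = mul(UNIT, y) == y
right_unit_ok = mul(y, UNIT) == y
associative_ok = mul(mul(x, y), z) == mul(x, mul(y, z))
multiplicative_ok = p(xy) == p(x) * p(y)
```

The last check reflects the fact that p is an algebra homomorphism, which is what allows it to induce the map on K-theory used in the definition for the nonunital case.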

Some of the basic properties of K-theory are listed
as follows. Details can be found in Blackadar (1986).

1. Cup product A continuous bilinear map of
C*-algebras, $A \times B \to C$, induces a cup product,
$K_i(A) \otimes K_j(B) \to K_{i+j}(C)$.

In particular, the continuous product $A \times A \to A$
induces a cup product homomorphism,
$K_i(A) \otimes K_j(A) \to K_{i+j}(A)$.

2. Induced homomorphism If $f: A \to B$ is a
homomorphism of C*-algebras, then there is an
induced homomorphism, $f_*: K_\bullet(A) \to K_\bullet(B)$.

3. Homotopy If $f: A \to B$ and $g: A \to B$ are
homomorphisms of C*-algebras that are homotopic,
then the induced homomorphisms on K-theory
are equal, $f_* = g_*$.

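4. Excision If I is a closed two-sided ideal in A,
then there is a six-term exact sequence in
K-theory,

$$\begin{array}{ccccc}
K_0(I) & \longrightarrow & K_0(A) & \longrightarrow & K_0(A/I) \\
\uparrow & & & & \downarrow \\
K_1(A/I) & \longleftarrow & K_1(A) & \longleftarrow & K_1(I)
\end{array}$$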


5. Morita invariance The inclusion homomorphism
of A into the top left of the diagonal in
$M_n(A)$ induces an isomorphism in K-theory,
$K_\bullet(A) \cong K_\bullet(M_n(A))$.

6. Continuity Let $A = \lim_{n\to\infty} A_n$ be a C*-direct
limit. Then $K_\bullet(A) = \lim_{n\to\infty} K_\bullet(A_n)$.

7. Stability Let $\mathcal{K}$ be the C*-algebra of all compact
operators on an infinite-dimensional complex
Hilbert space. Then since $\mathcal{K} = \lim_{n\to\infty} M_n(\mathbb{C})$ is
a C*-direct limit, we see that $K_\bullet(A \otimes \mathcal{K}) \cong
\lim_{n\to\infty} K_\bullet(A \otimes M_n(\mathbb{C})) \cong K_\bullet(A)$.

8. Bott periodicity The continuous product $A \times
\mathbb{C} \to A$ induces the cup product $K_i(A) \otimes K_j(\mathbb{C}) \to
K_{i+j}(A)$. The computation by Bott asserts
that there is a canonical element $b \in K_2(\mathbb{C})$ that
gives an isomorphism $K_2(\mathbb{C}) \cong \mathbb{Z}$, and
Bott periodicity asserts that the cup product
with b gives rise to an isomorphism $K_j(A) \cong
K_{j+2}(A)$.


We mention in passing that Connes has defined a
Chern character homomorphism, $Ch: K_\bullet(A) \to
HE_\bullet(A)$, mapping into the entire cyclic homology
of A, with properties similar to those of the ordinary
Chern character. Due to space constraints, it will
not be defined here.


A C*-Algebra Generalization of the 
Atiyah-Singer Index Theorem and 
the Baum-Connes Conjecture 


We content ourselves here with a rudimentary
account of the C*-algebra generalization of the
Atiyah-Singer index theorem and the Baum-Connes
conjecture, and its relevance to the quantum Hall
effect and strict deformation quantization. Let A be
a C*-algebra.

Let $\mathcal{H}_A = A \otimes \mathcal{H}$, which is the analog of a Hilbert
space. Let $\mathcal{F}_A$ be the space of all A-Fredholm
operators on $\mathcal{H}_A$. Recall that an operator T is said to
be A-Fredholm if both the kernel and cokernel of T +
K are closed and finitely generated projective modules,
where K is an A-compact operator. The space of
A-compact operators is by definition the closure of
the A-finite rank operators. The index of T is

$$\mathrm{index}(T) = [\ker(T + K)] - [\mathrm{coker}(T + K)] \in K_0(A)$$

The index map turns out to be well defined and
independent of the choice of A-compact perturbation
K. It is continuous, so it induces a map on the
connected components of $\mathcal{F}_A$, which turns out to
be an isomorphism, by a theorem of Mingo
(cf. Rosenberg (1983, 1989)).

Now let M be a smooth compact manifold. An
A-vector bundle over M is a locally trivial Banach
vector bundle E over M whose fibers have the
structure of finitely generated left A-modules, with
morphisms respecting the A-module structure. The
isomorphism classes of A-vector bundles over M
form a commutative semigroup under direct sums,
and the associated commutative group is easily
identified with $K_0(C(M) \otimes A)$. Let $D: C^\infty(M, E) \to
C^\infty(M, F)$ be an elliptic A-operator acting between
smooth sections of A-vector bundles E, F over M. It
turns out that by elliptic regularity, such an operator
is A-Fredholm, and has an analytic index,

$$\mathrm{index}(D) \in K_0(A)$$

Associated to each such operator is a symbol

$$\sigma(D): \pi^*(E) \to \pi^*(F)$$

where $\pi: T^*M \to M$ is the projection. Ellipticity is
the condition that $\sigma(D)$ is an isomorphism outside
the zero section, so that the triple $(\pi^*(E), \pi^*(F), \sigma(D))$
determines an element in $K_0(C_0(T^*M) \otimes A)$, also denoted
by $\sigma(D)$. It turns out that the analytic index of D
depends only on the class $\sigma(D) \in K_0(C_0(T^*M) \otimes A)$.
Therefore, the analytic index can be viewed as a
homomorphism,

$$\mathrm{index}: K_0(C_0(T^*M) \otimes A) \to K_0(A)$$


Consider an embedding $\iota: M \to \mathbb{R}^n$, which induces
an embedding $d\iota: T^*M \to \mathbb{R}^{2n}$. The associated Gysin
map is $d\iota_!: K_0(C_0(T^*M) \otimes A) \to K_0(C_0(\mathbb{R}^{2n}) \otimes A)$.
Let $j: \{0\} \to \mathbb{R}^{2n}$ denote inclusion of the origin in $\mathbb{R}^{2n}$.
It induces a Gysin map $j_!: K_0(A) \to K_0(C_0(\mathbb{R}^{2n}) \otimes A)$,
which is the Bott periodicity isomorphism. Then the
topological index is the homomorphism

$$\mathrm{index}_t = j_!^{-1} \circ d\iota_!: K_0(C_0(T^*M) \otimes A) \to K_0(A)$$

The C*-generalization of the Atiyah-Singer
index theorem due to Mishchenko-Fomenko, cf.
Kasparov (1988), asserts the equality of the
analytic index and the topological index,

$$\mathrm{index}(D) = \mathrm{index}_t(\sigma(D)) \in K_0(A)$$


Now let M be a compact even-dimensional
spin$^c$ manifold. Then there is a spin$^c$ Dirac
operator $D: C^\infty(M, S^+) \to C^\infty(M, S^-)$, where $S^\pm$ are
the bundles of half-spinors on $T^*M \otimes L$, where L is
a line bundle over M with the property that the
first Chern class of L modulo 2, $c_1(L) \bmod 2$, is
equal to the second Stiefel-Whitney class of M,
$w_2(M)$. Let $\Gamma$ be a torsion-free discrete group, and
$B\Gamma$ be its classifying space. It is a paracompact
space with the property that it is the quotient of $\Gamma$
acting freely on a contractible space $E\Gamma$. Let $C_r^*(\Gamma)$
denote the reduced group C*-algebra, and consider
the canonical flat $C_r^*(\Gamma)$-bundle $\mathcal{V}$ over $B\Gamma$ defined
as follows:

$$\mathcal{V} = (E\Gamma \times C_r^*(\Gamma))/\Gamma$$

where $\Gamma$ acts on the left on $C_r^*(\Gamma)$ and on the right on
$E\Gamma$. Let $f: M \to B\Gamma$ be a continuous map. Then $f^*\mathcal{V}$
is a flat $C_r^*(\Gamma)$-bundle over M. Upon choosing a flat
connection on $f^*\mathcal{V}$, we can couple the spin$^c$ Dirac
operator to it, obtaining an operator $D_{\mathcal{V}}$ acting on
sections of $S^\pm \otimes f^*\mathcal{V}$. The
ellipticity of $D_{\mathcal{V}}$ ensures that it is a $C_r^*(\Gamma)$-Fredholm
operator, so it has an analytic index, $\mathrm{index}(D_{\mathcal{V}}) \in
K_0(C_r^*(\Gamma))$ by the earlier discussion, which is
also equal to the topological index $\mathrm{index}_t(\sigma(D_{\mathcal{V}})) \in
K_0(C_r^*(\Gamma))$.

By Baum, Connes, and Douglas, the K-homology
of $B\Gamma$, $K_0(B\Gamma)$, is generated by the triples (M, E, f) as
described above, modulo relations that we will not
present here because of space constraints. The
assembly map

$$\mu: K_0(B\Gamma) \to K_0(C_r^*(\Gamma))$$

is a homomorphism given by $\mu([(M, E, f)]) =
\mathrm{index}(D_{\mathcal{V}})$. The Baum-Connes conjecture asserts
that $\mu$ is an isomorphism. There are variants of
this conjecture when $\Gamma$ has torsion. The Baum-Connes
conjecture has been verified when $\Gamma$ is an
amenable group or, for instance, a word hyperbolic
group. There are also variants of this conjecture for
certain foliations and groupoids, and this is an extremely
active area of research. The injectivity of the
assembly map is related to the Novikov conjecture
on the homotopy invariance of the higher signatures
(Kasparov 1988), and the obstructions to the
existence of Riemannian metrics of positive scalar
curvature on compact spin manifolds (Rosenberg
1983, 1989). A variant of the Baum-Connes
conjecture, where the reduced group C*-algebra is
replaced by the twisted reduced group C*-algebra, is
used in the analysis of the noncommutative geometry
approach to the integer and fractional quantum
Hall effect, and also the gaps in the spectrum of
magnetic Schrödinger operators (Bellissard et al.
1994, Marcolli and Mathai 2001).


Twisted K-theory and the Chern 
Character 


We begin by reviewing some results due to Dixmier
and Douady (1963). Let M be a smooth manifold, let
$\mathcal{H}$ denote an infinite-dimensional, separable Hilbert
space and let $\mathcal{K}$ be the C*-algebra of compact
operators on $\mathcal{H}$. Let $U(\mathcal{H})$ denote the group of
unitary operators on $\mathcal{H}$ endowed with the strong
operator topology and let $PU(\mathcal{H}) = U(\mathcal{H})/U(1)$ be the
projective unitary group with the quotient space
topology, where U(1) consists of scalar multiples of
the identity operator on $\mathcal{H}$ of norm equal to 1. Since
$U(\mathcal{H})$ is contractible in the operator norm topology, it
follows that $PU(\mathcal{H}) = BU(1)$ is an Eilenberg-MacLane
space $K(\mathbb{Z}, 2)$. Therefore, $BPU(\mathcal{H})$ is an Eilenberg-MacLane
space $K(\mathbb{Z}, 3)$. That is, principal $PU(\mathcal{H})$
bundles P over X are classified up to isomorphism by
the Dixmier-Douady class DD(P) in $H^3(X, \mathbb{Z})$ and
conversely.

For $g \in U(\mathcal{H})$, let Ad(g) denote the automorphism
$T \mapsto gTg^{-1}$ of $\mathcal{K}$. As is well known, Ad is a
continuous homomorphism of $U(\mathcal{H})$, given the
strong operator topology, onto $\mathrm{Aut}(\mathcal{K})$ with kernel
the circle of scalar multiples of the identity, where
$\mathrm{Aut}(\mathcal{K})$ is given the point-norm topology. Under this
homomorphism we may identify $PU(\mathcal{H})$ with
$\mathrm{Aut}(\mathcal{K})$. Define an Azumaya bundle to be a locally
trivial bundle $\mathcal{E}$ over X with fiber $\mathcal{K}$ and structure
group $\mathrm{Aut}(\mathcal{K})$. They are of the form $\mathcal{K}_P = (P \times \mathcal{K})/
PU(\mathcal{H})$, and isomorphism classes of Azumaya bundles
are also parametrized by their Dixmier-Douady
class DD(P) in $H^3(X, \mathbb{Z})$ and conversely.

Since $\mathcal{K} \otimes \mathcal{K} \cong \mathcal{K}$, the isomorphism classes of
locally trivial bundles over X with fiber $\mathcal{K}$ and
structure group $\mathrm{Aut}(\mathcal{K})$ form a group under the
tensor product, where the inverse of such a bundle
is the conjugate bundle. This group is known as
the infinite Brauer group and is denoted by $Br^\infty(X)$.
So, a restatement of the Dixmier-Douady theorem
is that $Br^\infty(X) \cong H^3(X, \mathbb{Z})$. $H^3(X, \mathbb{Z})$ can also
be described in terms of bundle gerbes (Murray
1996).

The twisted K-theory, $K^\bullet(X, P)$, is defined as the
K-theory of the C*-algebra of continuous sections of
the Azumaya bundle $\mathcal{K}_P$, that is, $K_\bullet(C(X, \mathcal{K}_P))$. It was
studied in the torsion case by Donovan and Karoubi,
where one can replace the compact operators $\mathcal{K}$ by
finite-dimensional matrices, and was studied in the
general case by Rosenberg (1983, 1989). Let $\mathcal{F}$ be
the space of all Fredholm operators endowed with
the norm topology. Then, one can form the bundle
of Fredholm operators $\mathcal{F}_P = (P \times \mathcal{F})/PU(\mathcal{H})$, where
$PU(\mathcal{H})$ acts on $\mathcal{F}$ via the adjoint action. Consider the
fibration $\mathcal{K}_P \to \mathcal{F}_P \to GL(\mathcal{C}_P)$, where $\mathcal{C}_P = (P \times \mathcal{C})/
PU(\mathcal{H})$ and $\mathcal{C} = B(\mathcal{H})/\mathcal{K}$ is the Calkin algebra. Since
$\pi_0(C(X, \mathcal{K}_P)) = \{0\}$, we see that $\pi_0(C(X, \mathcal{F}_P)) \cong
\pi_0(C(X, GL(\mathcal{C}_P)))$. Consider the short exact sequence
of C*-algebras,

$$0 \to C(X, \mathcal{K}_P) \to C(X, \mathcal{B}_P) \to C(X, \mathcal{C}_P) \to 0$$

where $\mathcal{B}_P = (P \times B(\mathcal{H}))/PU(\mathcal{H})$ and where $PU(\mathcal{H})$
acts on $B(\mathcal{H})$ via the adjoint action. It gives rise to
a six-term exact sequence


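$$\begin{array}{ccccc}
K_0(C(X, \mathcal{K}_P)) & \longrightarrow & K_0(C(X, \mathcal{B}_P)) & \longrightarrow & K_0(C(X, \mathcal{C}_P)) \\
{\scriptstyle \mathrm{index}}\uparrow & & & & \downarrow \\
K_1(C(X, \mathcal{C}_P)) & \longleftarrow & K_1(C(X, \mathcal{B}_P)) & \longleftarrow & K_1(C(X, \mathcal{K}_P))
\end{array}$$

By definition, $K_1(C(X, \mathcal{C}_P)) \cong \pi_0(C(X, GL(\infty, \mathcal{C}_P)))$
and a standard argument shows that this is also
equal to $\pi_0(C(X, GL(\mathcal{C}_P)))$. By Kuiper's theorem, it is
not difficult to see that

$$K_\bullet(C(X, \mathcal{B}_P)) = \{0\}$$

Therefore,

$$\mathrm{index}: \pi_0(C(X, \mathcal{F}_P)) \to K^0(X, P)$$

is an isomorphism. Let $X_1$ be a closed subset of X,
and $I_{X_1}$ be the closed ideal of sections of $\mathcal{K}_P$ that
vanish on $X_1$. Then $K^\bullet(X, X_1, P)$ is by definition
$K_\bullet(I_{X_1})$. A geometric description of twisted K-theory
in terms of modules for bundle gerbes is described in
Bouwknegt et al. (2002).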

Some of the basic properties of twisted K-theory
are listed as follows. Many of these properties
follow from the corresponding properties for the
K-theory of C*-algebras. See Atiyah and Segal and
Bouwknegt et al. (2002).


1. Normalization If P is trivial, then $K^\bullet(M, P) =
K^\bullet(M)$.

2. Module property $K^\bullet(M, P)$ is a module over
$K^0(M)$.

3. Pullback If $f: N \to M$ is a continuous map,
and P a principal $PU(\mathcal{H})$ bundle over M, then
there is a pullback homomorphism $f^*: K^\bullet(M, P) \to
K^\bullet(N, f^*(P))$.

4. Push-forward Let $f: N \to M$ be a smooth proper
map between compact manifolds which is K-oriented,
that is, $TN \oplus f^*TM$ is a spin$^c$ vector
bundle over N. Let P be a principal $PU(\mathcal{H})$ bundle
over M. Then there is a pushforward homomorphism,
also called a Gysin map, $f_!: K^\bullet(N, f^*(P)) \to
K^{\bullet+d}(M, P)$, where $d = \dim M - \dim N$.

5. Homotopy If $f: N \to M$ and $g: N \to M$ are
homotopic maps, then the pullback maps $f^* = g^*$
are equal. If in addition, f and g are K-oriented,
then the pushforward maps $f_! = g_!$ are equal.

6. Excision Let $M_1$ be a closed subset of M and U
be an open subset of M such that U is contained
in the interior of $M_1$. Then the inclusion of
pairs $(M \setminus U, M_1 \setminus U) \hookrightarrow (M, M_1)$ induces an
isomorphism in K-theory, $K^\bullet(M, M_1, P) \cong
K^\bullet(M \setminus U, M_1 \setminus U, P|_{M \setminus U})$.

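7. Exactness Let $M_1$ be a closed subset of M and
$\iota: M_1 \to M$ be the inclusion. Let P be a principal
$PU(\mathcal{H})$ bundle over M. Then the short exact
sequence

$$0 \to I_{M_1} \to C(M, \mathcal{K}_P) \to C(M_1, \mathcal{K}_{P|_{M_1}}) \to 0$$

gives rise to the six-term exact sequence in K-theory,

$$\begin{array}{ccccc}
K^0(M, M_1, P) & \longrightarrow & K^0(M, P) & \longrightarrow & K^0(M_1, \iota^*(P)) \\
\uparrow & & & & \downarrow \\
K^1(M_1, \iota^*(P)) & \longleftarrow & K^1(M, P) & \longleftarrow & K^1(M, M_1, P)
\end{array}$$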


8. Cup product Let P be a principal $PU(\mathcal{H})$ bundle
over M and Q be a principal $PU(\mathcal{H})$ bundle over N.
An identification $\mathcal{H} \otimes \mathcal{H} \cong \mathcal{H}$ gives rise to a principal
$PU(\mathcal{H})$ bundle $P \otimes Q$ over $M \times N$ whose Dixmier-Douady
invariant is $DD(P \otimes Q) = p_1^*(DD(P)) +
p_2^*(DD(Q))$, where $p_j$ denotes the projection onto the
jth factor, j = 1, 2. Then there is a canonical map
given by external tensor product,

$$K^i(M, P) \otimes K^j(N, Q) \to K^{i+j}(M \times N, P \otimes Q)$$

called the cup product.

9. Bott periodicity Let P be a principal $PU(\mathcal{H})$
bundle over M. Bott periodicity says that there is
a canonical isomorphism

$$K^\bullet(M, P) \cong K^{\bullet+n}(M \times \mathbb{R}^n, \pi^*(P))$$

where $\pi: M \times \mathbb{R}^n \to M$ is the projection onto the
first factor. Let $b \in K^n(\mathbb{R}^n)$ be the Bott element.
Then the isomorphism above is given by $x \mapsto \pi^*(x) \cup
b \in K^{\bullet+n}(M \times \mathbb{R}^n, \pi^*(P))$ for all $x \in K^\bullet(M, P)$.


There is a natural homomorphism of rings called the
twisted Chern character, which depends both on a
choice of P and a de Rham representative H of DD(P),

$$Ch_P: K^\bullet(M, P) \to H^\bullet(M, H)$$

Here $H^\bullet(M, H)$ denotes the twisted cohomology,
which is by definition the cohomology of the
complex $(\Omega^\bullet(M), d - H\wedge)$. The twisted Chern character
is characterized by the following axioms:


1. Naturality If $f: N \to M$ is a smooth map, and if
$x \in K^\bullet(M, P)$, then $Ch_{f^*P}(f^*(x)) = f^*(Ch_P(x))$.

2. Additivity If $x, y \in K^\bullet(M, P)$, then $Ch_P(x \oplus y) =
Ch_P(x) + Ch_P(y)$.

3. $Ch_P$ respects the $K^0(M)$-module structure of
$K^\bullet(M, P)$.

4. Normalization If P is trivial, then $Ch_P$ reduces
to the ordinary Chern character Ch.


It turns out that the twisted Chern character
induces an isomorphism of the rings $K^\bullet(M, P) \otimes \mathbb{Q}$
and $H^\bullet(M, H)$. The Chern-Weil representative of the
twisted Chern character is derived in Bouwknegt
et al. (2002).


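Twisted K-Theory and Duality in Type II
String Theories


Let E be an oriented $S^1$-bundle over M,

$$\begin{array}{ccc}
S^1 & \longrightarrow & E \\
 & & \downarrow{\scriptstyle \pi} \\
 & & M
\end{array}$$

characterized by its first Chern class $c_1(E) \in
H^2(M, \mathbb{Z})$, in the presence of (possibly nontrivial)
H-flux $H \in H^3(E, \mathbb{Z})$. We will argue that the T-dual
of E is again an oriented $S^1$-bundle over M, denoted
by $\hat{E}$,

$$\begin{array}{ccc}
S^1 & \longrightarrow & \hat{E} \\
 & & \downarrow{\scriptstyle \hat{\pi}} \\
 & & M
\end{array}$$

supporting H-flux $\hat{H} \in H^3(\hat{E}, \mathbb{Z})$, such that

$$c_1(\hat{E}) = \pi_* H, \qquad c_1(E) = \hat{\pi}_* \hat{H}$$

where $\pi_*: H^k(E, \mathbb{Z}) \to H^{k-1}(M, \mathbb{Z})$ and, similarly, $\hat{\pi}_*$
denote the pushforward maps. Then we can form
the following commutative diagram:

$$\begin{array}{ccccc}
 & & E \times_M \hat{E} & & \\
 & {\scriptstyle p}\swarrow & & \searrow{\scriptstyle \hat{p}} & \\
E & & & & \hat{E} \\
 & {\scriptstyle \pi}\searrow & & \swarrow{\scriptstyle \hat{\pi}} & \\
 & & M & &
\end{array}$$

The correspondence space $E \times_M \hat{E}$ is a circle bundle
over E with first Chern class $\pi^*(c_1(\hat{E}))$, and it is also
a circle bundle over $\hat{E}$ with first Chern class
$\hat{\pi}^*(c_1(E))$, by the commutativity of the diagram
above. If $\hat{E} = E$ or if $\hat{E} = M \times S^1$, then the
correspondence space $E \times_M \hat{E}$ is diffeomorphic to $E \times S^1$.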

T-duality gives an isomorphism of the twisted
K-theories of E and $\hat{E}$ as well as an isomorphism
between the twisted cohomologies of E and $\hat{E}$, and
can be expressed in the following commutative
diagram:

$$\begin{array}{ccc}
K^\bullet(E, P) & \stackrel{\cong}{\longrightarrow} & K^{\bullet+1}(\hat{E}, \hat{P}) \\
{\scriptstyle Ch_P}\downarrow & & \downarrow{\scriptstyle Ch_{\hat{P}}} \\
H^\bullet(E, H) & \stackrel{\cong}{\longrightarrow} & H^{\bullet+1}(\hat{E}, \hat{H})
\end{array}$$

where the horizontal arrows are isomorphisms. Here
P is a principal $PU(\mathcal{H})$ bundle over E such that
DD(P) = H and $\hat{P}$ is a principal $PU(\mathcal{H})$ bundle over
$\hat{E}$ such that $DD(\hat{P}) = \hat{H}$. We refer to Bouwknegt
et al. (2004) for details. The T-duality isomorphism
above gives compelling evidence that a type IIA
string theory A on a circle bundle of radius R in the
presence of a background H-flux, and a type IIB
string theory B on a "T-dual" circle bundle of radius
1/R in the presence of a "T-dual" background H-flux,
are equivalent in the sense that the string states of
string theory A are in canonical one-to-one correspondence
with the string states of string theory B.

We briefly mention two other applications of 
twisted K-theory. Consider the adjoint action of a 
compact connected simple Lie group G on itself, 
and the corresponding twisted G-equivariant 
K-theory, twisted by a multiple of the generator
of $H^3(G, \mathbb{Z})$. The relevance of the equivariant case
to conformal field theory was highlighted by the 
result of Freed, Hopkins and Teleman (see Freed 
(2002)) that it is graded isomorphic to the 
Verlinde algebra of G, with a shift given by the 
dual Coxeter number. Here the Verlinde algebra 
consists of equivalence classes of positive-energy 
representations of the loop group of G which was 
originally shown to be a ring in a rather nontrivial 
way. On the other hand, the ring structure of the 
twisted G-equivariant K-theory of G is just 
induced by the product on G, which makes this 
result all the more remarkable. 

Fractional analytic index theory, developed in
Mathai et al., is a generalization of Atiyah-Singer
index theory, assigning a fractional-valued analytic
index to each projective elliptic operator on a compact
manifold; the index is a fraction that need not be an integer.
These projective elliptic operators act on projective
vector bundles, for which the usual compatibility
condition on triple overlaps, needed to give a global
vector bundle, may fail by a scalar factor. These are the
geometric objects in twisted K-theory when the twist is torsion.
In Mathai et al., a fractional index theorem is
proved, computing the fractional-valued analytic
index of projective elliptic operators essentially in
terms of topological data. The Dirac operator in
the absence of a spin structure is also defined there
for the first time, resolving a long-standing mystery,
and its index is computed.

Some topics not covered in this brief account of
K-theory include KK-theory, cf. Blackadar (1986)
and Kasparov (1988), which is the natural setting for
the Atiyah-Singer index theorem and its generalizations,
as well as higher algebraic K-theory.


See also: C*-Algebras and Their Classification; 
Characteristic Classes; Cohomology Theories; 
Equivariant Cohomology and the Cartan Model; Gerbes 
in Quantum Field Theory; Index Theorems; Intersection 
Theory; Mathai-Quillen Formalism; Spectral Sequences.


Introduction


To describe transport by a random flow, one needs 
to apply the statistical methods to the motion of 
fluid particles, that is, to the Lagrangian dynamics. 
We first present the propagators describing evolving 
probability distributions of different configurations 
of fluid particles. We then use those propagators to 
describe decay and steady states of a passive scalar 
field transported by random flows. 

Consider an evolution of a passive scalar tracer
$\theta(r, t)$ in a random flow. The mean value of the
scalar tracer at a given point is an average over
values brought by different trajectories:

$$\langle\theta(r, s)\rangle = \int P(r, s; R, 0)\,\theta(R, 0)\,dR \qquad [1]$$

Here, P(r, s; R, t) is the probability density function
(PDF) to find the particle at time t at position R
given its position r at time s. That PDF is called the
propagator or the Green function. Multipoint
correlation functions of the tracer

$$C_N(\underline{r}, s) = \langle\theta(r_1, s)\cdots\theta(r_N, s)\rangle
= \int P_N(\underline{r}, s; \underline{R}, 0)\,\theta(R_1, 0)\cdots\theta(R_N, 0)\,d\underline{R} \qquad [2]$$


are expressed via the multiparticle Green functions
$P_N$, which are the joint PDFs of the equal-time
positions $\underline{R} = (R_1, \ldots, R_N)$ of N fluid trajectories.
The trajectory of the fluid particle that passes at
time s through the point r is described by the vector
R(t; r, s), which satisfies R(t; r, t) = r and the stochastic
equation

$$\dot{R} = v(R, t) + u(t) \qquad [3]$$

Here, u(t) describes the molecular Brownian motion
with zero average and covariance $\langle u^i(t)\,u^j(t')\rangle
= 2\kappa\,\delta^{ij}\delta(t - t')$. We also consider the macroscopic
velocity v as random with various statistical properties
in space and time. There is a clear scale separation
between macroscopic velocity v and molecular
diffusion u that allows one to treat them separately.

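Using [3], one can write the Green's function as
an integral over paths that satisfy q(s) = r and
q(t) = R:

$$P(r, s; R, t) = \Big\langle \int \mathcal{D}p\,\mathcal{D}q\, \exp\Big( -i\int_s^t p(\tau)\cdot\big[\dot{q}(\tau) - v(q(\tau), \tau) - u(\tau)\big]\,d\tau \Big) \Big\rangle_{u,v} \qquad [4]$$

$$= \Big\langle \int \mathcal{D}p\,\mathcal{D}q\, \exp\Big( -\int_s^t \big[\kappa\,p^2(\tau) + i\,p(\tau)\cdot\big(\dot{q}(\tau) - v(q(\tau), \tau)\big)\big]\,d\tau \Big) \Big\rangle_{v} \qquad [5]$$

$$= \Big\langle \int \mathcal{D}q\, \exp\Big( -\frac{1}{4\kappa}\int_s^t \big|\dot{q}(\tau) - v(q(\tau), \tau)\big|^2\,d\tau \Big) \Big\rangle_{v} = \langle P(r, s; R, t|v)\rangle_v \qquad [6]$$

The integration over the auxiliary field p in [4]
enforces the delta function of [3]. One passes from
[4] to [5] by averaging over the Gaussian Brownian
noise, and from [5] to [6] by calculating the Gaussian
integral over p.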

Generally, exact calculations are only possible for
Gaussian random processes short-correlated in time,
as in [5]. The simplest case is the Brownian motion
when the advection is absent. One then obtains from
[6] the Gaussian PDF of the displacement:

$$P(R, t) = (4\pi\kappa t)^{-d/2}\, e^{-R^2/4\kappa t} \qquad [7]$$

which satisfies the heat equation $(\partial_t - \kappa\nabla^2)P(R, t) = 0$.
The short-correlated case is far from being an exotic
exception but rather presents a long-time limit of an
integral of any finite-correlated random function.
Indeed, such an integral can be presented as a sum of
many independent equally distributed random numbers;
the statistics of such sums is the subject of the central limit
theorem. One can move beyond the central limit
theorem by considering the correlation time finite (yet
small compared to the time of evolution). Such a
generalization is the subject of large deviation
theory. Consider some quantity X which is an integral
of some random function over a time t much larger than
the correlation time $\tau$. At $t \gg \tau$, X behaves as a sum of
many independent identically distributed random
numbers $y_i$: $X = \sum_{i=1}^{N} y_i$ with $N \propto t/\tau$. The generating
function $\langle e^{zX}\rangle$ of the moments of X is the product
$\langle e^{zX}\rangle = e^{NS(z)}$, where we have denoted $\langle e^{zy}\rangle \equiv e^{S(z)}$
(assuming that the generating function $\langle e^{zy}\rangle$ exists for
all complex z). The PDF P(X) is given by the inverse
Laplace transform $(2\pi i)^{-1}\int e^{-zX + NS(z)}\,dz$ with the
integral over any axis parallel to the imaginary one.
For $X \propto N$, the integral is dominated by the saddle point
$z_0$ such that $S'(z_0) = X/N$ and

$$P(X) \propto e^{-N H(X/N - \langle y\rangle)} \qquad [8]$$


Here $H = -S(z_0) + z_0 S'(z_0)$ is a function of the
variable $X/N - \langle y\rangle$; it is called the entropy function as it
appears also in the thermodynamic limit in statistical
physics. A few important properties of H (also
called the rate or Cramér function) may be established
independently of the distribution P(y). It is a convex
function which takes its minimum at zero, that is,
for X equal to the mean value $\langle X\rangle = NS'(0)$. The
minimal value of H vanishes since S(0) = 0. The
entropy is quadratic around its minimum with
$H''(0) = \Delta^{-1}$, where $\Delta = S''(0)$ is the variance of y.
We thus see that the mean value $\langle X\rangle = N\langle y\rangle$ grows
linearly with N. The fluctuations $X - \langle X\rangle$ on the
scale $O(N^{1/2})$ are governed by the central limit
theorem, which states that $(X - \langle X\rangle)/N^{1/2}$ becomes for
large N a Gaussian random variable with variance
$\langle y^2\rangle - \langle y\rangle^2 = \Delta$, as in [7]. Finally, its fluctuations on
the larger scale O(N) are governed by the large
deviation form [8]. The possible non-Gaussianity of
the y's leads to a nonquadratic behavior of H
for (large) deviations from the mean, starting from
$(X - \langle X\rangle)/N \simeq \Delta/S'''(0)$. Note that if y is Gaussian,
then X is Gaussian too for any t, but the universal
formula [8] with $NH = (X - N\langle y\rangle)^2/2N\Delta$ is valid
only for $t \gg \tau$.
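
The large deviation form [8] can be checked on a case where everything is computable. The sketch below is an illustration under an assumption not made in the text: the summands y take the values +1 and -1 with probability 1/2 each, so that S(z) = ln cosh z, the mean is zero, and the variance S''(0) equals 1. The function names are hypothetical. The saddle-point condition tanh(z0) = x then gives H(x) = [(1+x) ln(1+x) + (1-x) ln(1-x)]/2 for |x| < 1, which is compared with the exact binomial probability.

```python
import math


def cramer_coin(x):
    """Entropy (Cramer) function H(x) for y = +1 or -1 with equal probability.

    Obtained from H = -S(z0) + z0*S'(z0) with S(z) = ln cosh z and
    tanh(z0) = x; valid for |x| < 1.
    """
    return 0.5 * ((1 + x) * math.log(1 + x) + (1 - x) * math.log(1 - x))


def exact_rate(n, k):
    """-(1/n) ln P(X = 2k - n) for n fair +/-1 steps, k of which are +1."""
    log_p = (math.lgamma(n + 1) - math.lgamma(k + 1)
             - math.lgamma(n - k + 1) - n * math.log(2))
    return -log_p / n


n, k = 2000, 1500            # X/N = 0.5, far outside the central-limit scale
print(cramer_coin(0.5))      # large deviation prediction for the decay rate
print(exact_rate(n, k))      # exact rate; differs by a correction of order ln(N)/N
print(cramer_coin(0.01))     # close to x**2 / 2, the quadratic (Gaussian) regime
```

The two rates agree up to the subleading ln(N)/N correction that the proportionality sign in [8] leaves out, while for small x the function reduces to the quadratic form governed by the variance.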


Single-Particle Diffusion 


For the pure advection without noise, the displacement
of the single Lagrangian trajectory is
$R(t) - R(0) = \int_0^t V(s)\,ds$, with $V(t) = v(R(t), t)$ being
the Lagrangian velocity. One can show that V(t) is
statistically stationary in the frame of reference with
no mean flow and under statistical homogeneity and
stationarity of the incompressible Eulerian velocities.
For $\kappa = 0$, the mean square displacement satisfies the
equation

$$\frac{d}{dt}\langle [R(t) - R(0)]^2\rangle = 2\int_0^t \langle V(0)\cdot V(s)\rangle\,ds \qquad [9]$$

The behavior of the displacement is crucially
dependent on the Lagrangian correlation time $\tau$ of
V(t) defined by

$$\int_0^\infty \langle V(0)\cdot V(s)\rangle\,ds = \langle v^2\rangle\,\tau \qquad [10]$$
0 


No general relation between the Eulerian and
the Lagrangian correlation times has been established,
except for the case of short-correlated
velocities. For times $t \ll \tau$, the two-point function
in [9] is approximately equal to $\langle V(0)^2\rangle = \langle v^2\rangle$.
The fluid particle transport is then ballistic with
$\langle [R(t) - R(0)]^2\rangle \simeq \langle v^2\rangle t^2$, and the PDF P(R, t) is
determined by the whole single-time velocity PDF.
When the correlation time of V(t) is finite (a generic
situation in a turbulent flow where $\tau$ is of the order of a
large-scale turnover time), an effective diffusive regime
is expected to arise for $t \gg \tau$ with $\langle [R(t) - R(0)]^2\rangle \simeq
2\langle v^2\rangle\tau t$. Indeed, the particle displacements over time
segments much larger than $\tau$ are almost independent.
At long times, the displacement $\delta R(t)$ behaves then as a
sum of many independent variables and falls into the
class of stationary processes treated in the previous
section. In other words, $\delta R(t)$ for $t \gg \tau$ becomes
a Brownian motion in d dimensions, normally
distributed with $\langle \delta R^i(t)\,\delta R^j(t)\rangle \simeq 2D^{ij}t$, where the
so-called eddy diffusivity tensor is as follows:

$$D^{ij} = \frac{1}{2}\int_0^\infty \langle V^i(0)V^j(s) + V^j(0)V^i(s)\rangle\,ds \qquad [11]$$

The symmetric second-order tensor $D^{ij}$ is the only
characteristic of the velocity which matters in this
limit of $t \gg \tau$. The trace of the tensor is equal to
$\langle v^2\rangle\tau$, that is, equal to the large-time value of the
integral in [9], while its tensorial properties reflect
the rotational symmetries of the advecting velocity
field. If the latter is isotropic, the tensor reduces to a
diagonal form characterized by a single scalar value
D. The main problem of turbulent diffusion is to
obtain the effective diffusivity tensor given the
velocity field v and the value of the molecular
diffusivity $\kappa$.
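
The crossover from ballistic to diffusive transport can be made explicit in a solvable case. The sketch below assumes, purely for illustration, an exponentially decaying Lagrangian correlation function, equal to the velocity variance times exp(-s/tau); this form is not taken from the text, but it is consistent with the definition [10] of the correlation time. Integrating [9] twice then gives a closed formula for the mean square displacement. The function name and parameter values are hypothetical.

```python
import math


def msd_exponential(t, v2, tau):
    """Mean square displacement from [9] for the assumed correlation
    <V(0).V(s)> = v2 * exp(-s / tau).

    Integrating [9] gives 2 * v2 * tau * (t - tau * (1 - exp(-t / tau))).
    """
    # expm1 keeps precision when t is much smaller than tau
    return 2.0 * v2 * tau * (t + tau * math.expm1(-t / tau))


v2, tau = 1.0, 1.0
for t in (1e-3, 1.0, 1e3):
    ballistic = v2 * t ** 2           # expected for t much smaller than tau
    diffusive = 2.0 * v2 * tau * t    # expected for t much larger than tau
    print(t, msd_exponential(t, v2, tau), ballistic, diffusive)
```

For the smallest time the result follows the ballistic estimate, and for the largest time it follows the diffusive one, with the changeover at times of order tau.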


Two-Particle Dispersion in Smooth 
Flows 


Even when velocity v(R,t) is a smooth function of 
the coordinates, Lagrangian dynamics can be quite 


complicated. Indeed, d ordinary differential equations
$\dot{R} = v(R, t)$ generally produce chaotic dynamics (for
$d \ge 3$ already for steady flows and for d = 2 for
time-dependent flows). The tools for the description of what
is called chaotic advection are similar to those of the 
theory of dynamical chaos. The description consistently 
exploits two simple ideas: to single out the variables 
that can be represented by the sum of a large number of 
independent random quantities and to separate vari- 
ables that fluctuate on different timescales. 

The distance, $R_{12} = R_1 - R_2$, between two fluid
particles with trajectories $R_i(t) = R(t; r_i)$ passing at
t = 0 through points $r_i$ satisfies the equation

$$\dot{R}_{12} = v(R_1, t) - v(R_2, t) \qquad [12]$$

If the velocity field can be considered smooth on
the scale $R_{12}$, then one expands $v(R_1, t) - v(R_2, t) =
\sigma(t, R_1)R_{12}$, introducing the strain matrix $\sigma$, which
can be treated as independent of $R_{12}$. The distance
thus satisfies locally a linear system of ordinary
differential equations (we omit subscripts, replacing
$R_{12}$ by R)

$$\dot{R}(t) = \sigma(t)R(t) \qquad [13]$$

This equation, with the strain treated as given and
R(0) = r, may be explicitly solved for arbitrary $\sigma(t)$
only in the 1D case

$$\ln[R(t)/r] = \ln W(t) = \int_0^t \sigma(s)\,ds \equiv X \qquad [14]$$


When t is much larger than the correlation time $\tau$ of
the strain, the variable X is a sum of N independent
equally distributed random numbers with $N = t/\tau$
and one can apply [8]. In the multidimensional case,
to use the large deviation theory, one introduces the
evolution matrix W such that R(t) = W(t)R(0). The
modulus R is expressed via the positive symmetric
matrix $W^TW$. In almost every realization of the
strain, the matrix $t^{-1}\ln W^TW$ stabilizes as $t \to \infty$,
that is, its eigenvectors tend to d fixed orthonormal
eigenvectors $f_i$. To understand that intuitively,
consider some fluid volume, say a sphere, which
evolves into an elongated ellipsoid at later times. As
time increases, the ellipsoid is more and more
elongated and it is less and less likely that the
hierarchy of the ellipsoid axes will change. The
limiting eigenvalues

$$\lambda_i = \lim_{t\to\infty} t^{-1}\ln|Wf_i| \qquad [15]$$

are called Lyapunov exponents. The major property
of the Lyapunov exponents is that they are realization
independent if the flow is ergodic (i.e., spatial
and temporal averages coincide). The relation [15]
states that two fluid particles separated initially by r
pointing into the direction $f_i$ will separate (or
converge) asymptotically as $\exp(\lambda_i t)$. The
incompressibility constraints det(W) = 1 and $\sum_i \lambda_i = 0$
imply that a positive Lyapunov exponent will exist
whenever at least one of the exponents is nonzero.
Consider indeed


$$E(n) = \lim_{t\to\infty} t^{-1}\ln\langle [R(t)/r]^n\rangle \qquad [16]$$

whose derivative at the origin gives the largest
Lyapunov exponent $\lambda_1$. The function E(n) obviously
vanishes at the origin. Furthermore, E(-d) = 0, that
is, incompressibility and isotropy make $\langle R^{-d}\rangle$
time independent as $t \to \infty$. Apart from n = 0, -d,
the convex function E(n) cannot have other zeroes if
it does not vanish identically. It follows that dE/dn
at n = 0, and thus $\lambda_1$, is positive. A simple way to
appreciate intuitively the existence of a positive
Lyapunov exponent is to consider the saddle-point
2D flow $v_x = \lambda x$, $v_y = -\lambda y$ with the axes randomly
rotating after time interval T. A vector initially at
the angle $\phi$ with the x-axis will be stretched after
time T if $\cos\phi \ge [1 + \exp(2\lambda T)]^{-1/2}$, that is, the
measure of the stretching directions is larger
than 1/2.
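
The claim about the measure of stretching directions can be verified directly. The sketch below, with hypothetical function names and parameter values, evolves a unit vector in the saddle-point flow for a time T, counts the fraction of initial angles in the first quadrant (sufficient by symmetry) for which the vector gets longer, and compares it with the fraction implied by the threshold on the cosine of the initial angle.

```python
import math


def stretched(phi, lam, T):
    """True if a unit vector at angle phi to the x-axis is longer after time T
    in the flow v_x = lam * x, v_y = -lam * y."""
    x = math.exp(lam * T) * math.cos(phi)
    y = math.exp(-lam * T) * math.sin(phi)
    return math.hypot(x, y) > 1.0


def stretching_fraction(lam, T):
    """Fraction of directions with cos(phi) >= [1 + exp(2*lam*T)]**(-1/2)."""
    phi_c = math.acos((1.0 + math.exp(2.0 * lam * T)) ** -0.5)
    return phi_c / (math.pi / 2.0)


def scanned_fraction(lam, T, n=100000):
    """Same fraction estimated by scanning n angles between 0 and pi/2."""
    hits = sum(stretched((i + 0.5) * math.pi / (2 * n), lam, T) for i in range(n))
    return hits / n


print(stretching_fraction(1.0, 0.5), scanned_fraction(1.0, 0.5))
print(stretching_fraction(1e-6, 1.0))  # approaches 1/2 from above as lam*T -> 0
```

The fraction exceeds 1/2 for any positive value of the product of the strain rate and T, which is why random rotations of the axes do not cancel the net stretching.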

A major consequence of the existence of a positive
Lyapunov exponent for any random incompressible
flow is the exponential growth of the interparticle
distance R(t). In a smooth flow, it is also possible to
analyze the statistics of the set of vectors R(t) and to
establish a multidimensional analog of [8]. The idea is
to reduce the d-dimensional problem to a set of d
scalar problems for slowly fluctuating stretching
variables excluding the fast fluctuating angular degrees
of freedom. Consider the matrix $I(t) = W(t)W^T(t)$,
representing the tensor of inertia of a fluid element
such as the above-mentioned ellipsoid. The matrix is
obtained by averaging $R^i(t)R^j(t)\,d/\ell^2$ over the initial
vectors of length $\ell$, and I(0) = 1. Introducing the
variables that describe stretching as the lengths of the
ellipsoid axes $e^{\rho_1}, \ldots, e^{\rho_d}$, one can deduce similarly to
[8] the asymptotic PDF:

$$P(\rho_1, \ldots, \rho_d; t) \propto \exp\big[-t\,H(\rho_1/t - \lambda_1, \ldots, \rho_{d-1}/t - \lambda_{d-1})\big]
\times \theta(\rho_1 - \rho_2)\cdots\theta(\rho_{d-1} - \rho_d)\,\delta(\rho_1 + \cdots + \rho_d) \qquad [17]$$


The entropy function H depends on the statistics
of $\sigma$. In the $\delta$-correlated case, H is everywhere
quadratic:

$$H(x) \propto d^{-1}\sum_{i=1}^{d} x_i^2, \qquad \lambda_i \propto d(d - 2i + 1) \qquad [18]$$


Two-Particle Dispersion in Nonsmooth
Flows


To consider dispersion in the inertial interval of
turbulence, one should assume $|\delta v(r, t)| \propto r^\alpha$, where
generally $\alpha < 1$. Rewriting then eqn [12] for the
distance between two particles as $\dot{R} = \delta v(R, t)$, we
infer that $dR^2/dt = 2R\cdot\delta v(R, t) \propto R^{1+\alpha}$. It suggests

$$R(t)^{1-\alpha} - R(0)^{1-\alpha} \propto t \qquad [19]$$

For large t, $R(t) \propto t^{1/(1-\alpha)}$, with the dependence on
the initial separation quickly forgotten. Of course,
for the random process R(t), relation [19] is of the
mean-field type and should pertain (if true) to the
large-time behavior of the averages ($\langle R(t)^p\rangle \propto
t^{p/(1-\alpha)}$ for p > 0), implying their superdiffusive
growth, faster than the diffusive one $\propto t^{p/2}$. The
power-law scaling may be amplified to the scaling
behavior of the PDF of the interparticle distance,
$P(R, t) = \lambda P(\lambda R, \lambda^{1-\alpha}t)$. The power-law growth of
the second moment, $\langle R(t)^2\rangle \propto t^3$ for $\alpha = 1/3$, is the celebrated
Richardson dispersion relation, which was the first
quantitative phenomenological prediction in developed
turbulence. It seems to be confirmed by
experimental data and numerical simulations. It
is important to remark that, even assuming the
validity of the Richardson relation, it is impossible
to establish general large-time properties of the PDF
P(R; t) such as those for the single-particle PDF or for
the distance between two particles in a smooth flow. This is because
the correlation time of the Lagrangian velocity
difference, $R/\delta v(R) \propto R^{1-\alpha} \propto t$, is comparable
with the total time of the process.
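
The mean-field relation [19] can be illustrated by solving the corresponding deterministic equation. The sketch below assumes dR/dt = c R^alpha with a hypothetical constant prefactor c (the text only fixes the proportionality), for which the closed-form solution follows by separating variables. It shows that the initial separation is forgotten at large times and that the value alpha = 1/3 yields a squared separation growing as the cube of time. The function names are illustrative.

```python
import math


def separation(t, r0, alpha, c=1.0):
    """Solution of dR/dt = c * R**alpha with R(0) = r0 and alpha < 1.

    Separation of variables gives R**(1-alpha) = r0**(1-alpha) + (1-alpha)*c*t,
    which is relation [19] with an explicit (assumed) prefactor.
    """
    return (r0 ** (1.0 - alpha) + (1.0 - alpha) * c * t) ** (1.0 / (1.0 - alpha))


def local_exponent(t, r0, alpha):
    """Logarithmic slope of R**2 between times t and 2t."""
    ratio = separation(2 * t, r0, alpha) ** 2 / separation(t, r0, alpha) ** 2
    return math.log(ratio) / math.log(2.0)


alpha = 1.0 / 3.0
t = 1e6
print(separation(t, 1e-3, alpha), separation(t, 1.0, alpha))  # nearly equal
print(local_exponent(t, 1.0, alpha))                          # close to 3
```

Two pairs that start three orders of magnitude apart in separation end up at almost the same distance, in contrast with the exponential sensitivity to the initial separation in a smooth flow.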

It is instructive to contrast the exponential growth 
[16] of the distance between the trajectories with the 
power-law growth [19]. In a smooth flow, the closer 
two trajectories are initially, the more time is needed 
to effectively separate them. In a nonsmooth 
turbulent flow, the trajectories separate in a finite 
time independent of their initial distance R(0), 
provided that the latter is also in the inertial range. 
This explosive separation of trajectories results in a 
breakdown of the deterministic Lagrangian flow 
since the trajectories cannot be labeled by the 
initial conditions. That agrees with the fundamental 
theorem stating that the ordinary differential equation $\dot R = v(R,t)$ does not have a unique solution if $v(r,t)$ is non-Lipschitz. As shown by the example of the equation $\dot x = |x|^{\alpha}$ with two solutions $x = [(1-\alpha)t]^{1/(1-\alpha)}$ and $x = 0$ both starting at zero, one should expect multiple Lagrangian trajectories starting or ending at the same point for velocity fields with $\alpha < 1$. Even though the deterministic Lagrangian description breaks down, the statistical description is still possible and one can make sense of propagators like $P(r,s;R,t|v)$. They are expected to be weak solutions of the equation $[\partial_t - \nabla_R\cdot v(R,t)]P(r,s;R,t|v) = 0$ in the nonsmooth case. According to this assumption, the Lagrangian trajectories behave stochastically already in a given velocity field and for negligible molecular diffusivity, and not only due to a random noise or to random fluctuations of the velocities.
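
The non-uniqueness example can be checked numerically. The sketch below verifies by central differences that the nontrivial branch $x(t) = [(1-\alpha)t]^{1/(1-\alpha)}$ satisfies $\dot x = |x|^{\alpha}$ and starts at zero, just like the trivial solution $x \equiv 0$. The helper names and the step size `h` are assumptions made for this illustration.

```python
def branch_solution(t: float, alpha: float) -> float:
    """Nontrivial solution of dx/dt = |x|**alpha with x(0) = 0, for 0 <= alpha < 1."""
    return ((1.0 - alpha) * t) ** (1.0 / (1.0 - alpha))


def residual(t: float, alpha: float, h: float = 1e-6) -> float:
    """Central-difference estimate of dx/dt - |x|**alpha on the nontrivial branch."""
    dxdt = (branch_solution(t + h, alpha) - branch_solution(t - h, alpha)) / (2.0 * h)
    return dxdt - abs(branch_solution(t, alpha)) ** alpha


if __name__ == "__main__":
    alpha = 1.0 / 3.0
    for t in (0.5, 1.0, 2.0):
        print(t, residual(t, alpha))  # residuals should be at round-off level
```

Both `branch_solution(0.0, alpha)` and the zero function vanish at $t = 0$, so the initial condition does not select a unique trajectory when $\alpha < 1$.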

The general conjecture about the existence and diffusive nature of propagators is known to be true for the Gaussian ensemble of velocities decorrelated in time (Kraichnan 1968):

$$\langle v^i(r,t)\,v^j(r',t')\rangle = 2\delta(t-t')\,D^{ij}(r-r') \qquad [20]$$

Here the Lagrangian velocity $v(R,t)$ has the same white noise temporal statistics as the Eulerian velocity $v(r,t)$ for fixed $r$, and the displacement along a Lagrangian trajectory $R(t) - R(0)$ is a Brownian motion for all times. To model the nonsmooth velocity field of turbulence, we choose $D^{ij}(r) = D_0\delta^{ij} - \tfrac{1}{2}d^{ij}(r)$ and

$$d^{ij}(r) = D_1\left[(d-1+\xi)\,\delta^{ij}r^{\xi} - \xi\,r^i r^j r^{\xi-2}\right] \qquad [21]$$

Here $D_0$ gives the eddy diffusivity of a single fluid particle (discussed earlier), whereas $d^{ij}(r)$ describes the statistics of the velocity differences. For $0 < \xi < 2$, the Kraichnan ensemble is supported on the velocities that are Hölder continuous in space with a fixed exponent $\alpha$ arbitrarily close to $\xi/2$. It mimics this way the main property of turbulent velocities. The rough (distributional) behavior of Kraichnan velocities in time, although not very physical, is not expected to modify essentially the qualitative properties of propagators (it is the spatial regularity, not the temporal one, of a vector field that is crucial for the uniqueness of its trajectories).

In exactly the same way as one derives [6] and [7] from [4], one gets $\mathcal{P}(R,t) = |\bar\kappa|^{-1/2}(4\pi t)^{-d/2}\exp[-R^iR^j(\bar\kappa^{-1})_{ij}/4t]$, where $\bar\kappa^{ij} = D^{ij}(0) + \kappa\delta^{ij}$. In much the same way one can examine the two-particle PDF. The PDF $P_2(r,s;R,t)$ of the distance $R$ between two particles satisfies the equation


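$$(\partial_t + M_2)P_2(r,s;R,t) = \delta(t-s)\,\delta(r-R) \qquad [22]$$

where $M_2 = -D_1(d-1)\,r^{1-d}\partial_r r^{d-1+\xi}\partial_r$, and [22] can be readily solved:

$$\lim_{r\to0}P_2(r,s;R,t) \propto \frac{R^{d-1}}{|t-s|^{d/(2-\xi)}}\exp\left[-\mathrm{const.}\,\frac{R^{2-\xi}}{|t-s|}\right] \qquad [23]$$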


That confirms the diffusive character of the limiting process describing the Lagrangian trajectories in fixed non-Lipschitz velocities: the endpoints of the process stay at finite distance when the initial points converge. The PDF [23] changes from Gaussian to log-normal when $\xi$ changes from 0 to 2. The Richardson dispersion $\langle R^2(t)\rangle \propto t^3$ is reproduced for $\xi = 4/3$.
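
The last statement follows from a short scaling argument, sketched here under the assumption that [23] has the form $R^{d-1}|t-s|^{-d/(2-\xi)}\exp[-\mathrm{const.}\,R^{2-\xi}/|t-s|]$ (the constant $c$ and the variable $u$ below are introduced only for this check, with $s = 0$):

```latex
% Rescale R = t^{1/(2-\xi)} u in the radial PDF [23]:
\[
  P\,dR \;\propto\; \frac{R^{d-1}}{t^{d/(2-\xi)}}\,
  e^{-c\,R^{2-\xi}/t}\,dR
  \;=\; u^{d-1} e^{-c\,u^{2-\xi}}\,du ,
\]
% so the distribution of u is independent of t, and therefore
\[
  \langle R^2(t)\rangle \;=\; t^{2/(2-\xi)}\,\langle u^2\rangle
  \;\propto\; t^{2/(2-\xi)},
  \qquad
  \frac{2}{2-\xi}\Big|_{\xi=4/3} = 3 .
\]
```

The same rescaling shows that $\xi = 0$ gives ordinary diffusion, $\langle R^2\rangle \propto t$.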


Multiparticle Propagators 


In studying multiparticle statistics, an important 
question is what memory of the initial configuration 
remains when final distances far exceed initial 
ones. To answer this question, one must analyze 
the conservation laws of turbulent diffusion. 
Many-particle evolution in nonsmooth velocities 
exhibits nontrivial statistical integrals of motion 
(martingales) that are proportional to the positive 
powers of the distances. The integrals involve 
geometry in such a way that the distance growth is 
balanced by the decrease of the shape fluctuations. 
The existence of multiparticle conservation laws 
indicates the presence of a long-time memory and is 
a reflection of the coupling among the particles due 
to the simple fact that they are all in the same 
velocity field. The conserved quantities may be easily built for the limiting cases. Already for a smooth velocity, the $d$-volume $\epsilon_{i_1i_2\ldots i_d}R_{12}^{i_1}\cdots R_{1,d+1}^{i_d}$ is indeed preserved for $(d+1)$ Lagrangian trajectories. In the opposite case of a very irregular velocity, the fluid particles undergo a Brownian motion. The distances between the Brownian particles grow according to $\langle R_{nm}^2(t)\rangle = R_{nm}^2(0) + Dt$. The statistical integrals of motion are $\langle R_{nm}^2 - R_{pr}^2\rangle$, $\langle 2(d+2)R_{nm}^2R_{pr}^2 - d(R_{nm}^4 + R_{pr}^4)\rangle$ and an infinity of similarly built harmonic polynomials (zero modes of the Laplacian).

The statistics of the relative motion of $N$ particles is described by the joint PDF averaged over rigid translations: $P_N^{\rm rel}(r,s;R,t) = \int P_N(r,s;R+\rho,t)\,d\rho$. For smooth velocities,

$$P_N^{\rm rel}(r,0;R,t) = \int\left\langle\prod_{n=1}^{N}\delta\big(R_n+\rho-W(t)r_n\big)\right\rangle d\rho \qquad [24]$$

Such a PDF depends only on the statistics of the evolution matrix $W(t)$ discussed earlier. Under the evolution governed by $W(t)$, all distances between points grow exponentially for large times while their ratios $R_{nm}/R_{kl}$ tend to a constant. For whatever initial positions, asymptotically in time, the points tend to be situated on a line. Note that the existence of deterministic trajectories leads to the collapse property $\lim_{r_N\to r_{N-1}}P_N^{\rm rel}(r;R;t) = P_{N-1}^{\rm rel}(r';R';t)\,\delta(R_{N-1}-R_N)$, where $R' = (R_1,\ldots,R_{N-1})$.

The long-time asymptotics of the propagators in the nonsmooth case can be found explicitly for the Kraichnan ensemble of velocities:


$$(\partial_t + M_N)P_N^{\rm rel}(r,s;R,t) = \delta(t-s)\,\delta(R-r) \qquad [25]$$

$$M_N = \sum_{n<m} d^{ij}(r_{nm})\,\nabla_{r_n^i}\nabla_{r_m^j} \qquad [26]$$

When initial points get close or final points far apart and time gets large, the multiparticle PDF is factorized:

$$\lim_{\lambda\to0}P_N^{\rm rel}(\lambda r,0;R,t) = \sum_a \lambda^{\zeta_a}f_a(r)\,g_a(R,t) \qquad [27]$$

where $f_a$ must be taken as zero modes of $M_N$ and its powers while $\partial_t g_a = -M_N g_a$. A remarkable feature of the zero modes of $M_N$ is that they are conserved in mean by the Lagrangian evolution:

$$\partial_t\langle f(R(t))\rangle = \int f(R)\,M_N P_N^{\rm rel}(r,0;R,t)\,dR' = \int P_N^{\rm rel}(r,0;R,t)\,M_N f(R)\,dR' = 0$$

The scaling exponents of the zero modes depend, in a nontrivial way, on the number of particles $N$. For


€ «& 1 and d > 1, one finds 


GN 


"T 


(2,4) = S |28] 


Passive Scalar 


For practical applications, for example, in the diffusion of pollution, the most relevant quantity is the average $\langle\theta(r,t)\rangle$, which can be expressed via the single-particle propagator. As discussed earlier, for times longer than the Lagrangian correlation time, the particle diffuses and $\langle\theta\rangle$ obeys the effective heat equation

$$\partial_t\langle\theta(r,t)\rangle = (D_e^{ij}+\kappa\delta^{ij})\,\nabla_i\nabla_j\langle\theta(r,t)\rangle \qquad [29]$$

with the eddy diffusivity $D_e^{ij}$ given by [11]. The simplest decay problem is that of a uniform scalar spot of size $L$ released in the fluid. Its averaged spatial distribution at later times is given by the solution of [29] with the appropriate initial condition. On the other hand, the decay of the scalar in the spot is governed by the multipoint Lagrangian propagators. Taking the point of measurement inside the spot, consider the single-point moment $\langle\theta^N\rangle(t)$ described by [2]. If there is no molecular diffusion and the trajectories are unique (spatially smooth velocity), particles that end at the same point remained together throughout the evolution and all the moments are preserved. On the contrary, when the velocity is nonsmooth and the propagator is diffusive, we expect the decay even in the limit $\kappa\to0$. This is an example of the so-called dissipative anomaly: the symmetry $t\to-t$ remains broken even when the symmetry-breaking factor $\kappa$ goes to zero. Consider a spherical spot of $\theta$ released in a spatially smooth incompressible 3D flow with $\lambda_1 > \lambda_2 > 0 > \lambda_3$. During the time less than $t_d = |\lambda_3|^{-1}\ln(L/r_d)$, diffusion is unimportant and $\theta$ inside the spot does not change. At larger time, the dimensions of the spot with negative Lyapunov exponents are frozen at $r_d$, while the rest keep growing exponentially, resulting in an exponential growth of the total volume $\exp(\rho_1+\rho_2)$. That leads to an exponential decay of scalar moments averaged over velocity statistics: $\langle[\theta(t)]^N\rangle \propto \exp(-\gamma_N t)$. The decay rates $\gamma_N$ can be expressed via the PDF [17] of stretching variables $\rho_i$. Since $\theta$ decays as the inverse volume,


$$\langle\theta^N(t)\rangle \propto \int d\rho_1\,d\rho_2\,\exp\left[-tH(\rho_1/t-\lambda_1,\rho_2/t-\lambda_2) - N(\rho_1+\rho_2)\right] \qquad [30]$$

At large $t$, the integral is determined by the saddle point. At small $N$, the saddle point lies within the parabolic domain of $H$, so $\gamma_N$ increases with $N$ quadratically. At large $N$, the main contribution is due to the realization with the smallest possible spot of size $L$, so $\gamma_N$ saturates.

For the decay in incompressible nonsmooth flow, using the Kraichnan model one gets

$$\langle\theta^{2n}(t)\rangle = \int P_{2n}(0;R;-1)\,C_{2n}(t^{1/(2-\xi)}R,0)\,dR \qquad [31]$$

When $J_0 = \int C_2(r,t)\,dr \neq 0$, the function $t^{d/(2-\xi)}C_2(t^{1/(2-\xi)}r,0)$ tends to $J_0\delta(r)$ in the long-time limit and [31] is reduced to

$$\langle\theta^{2n}(t)\rangle = (2n-1)!!\,\frac{J_0^n}{t^{nd/(2-\xi)}}\int P_{2n}(0;R_1,R_1,\ldots,R_n,R_n;-1)\,dR \qquad [32]$$

The decay is self-similar: $P(t,\theta) = t^{d/2(2-\xi)}Q(t^{d/2(2-\xi)}\theta)$. That means that the PDF of $\theta/\sqrt{\epsilon}$ is asymptotically time independent, with $\epsilon(t) = \kappa\langle(\nabla\theta)^2\rangle$ being the time-dependent (decreasing) dissipation rate. This should be contrasted with the lack of self-similarity for the smooth case.


One can also consider the steady state of $\theta$ pumped by a source $\phi(r,t)$:

$$\partial_t\theta + (v\cdot\nabla)\theta - \kappa\Delta\theta = \phi \qquad [33]$$

Assuming that the pumping is white Gaussian with a zero mean and variance $\langle\phi(r_1,t_1)\phi(r_2,t_2)\rangle = \chi(r_{12})\,\delta(t_2-t_1)$, $r_{ij} = r_i - r_j$, one can express the correlation
functions via the multiparticle propagators. For 
example, assuming zero conditions at the distant 


past and space homogeneity, one gets 


Cor; £) = f ar | PUR, r,t\x(R)dR [34] 


The function $\chi(R)$ is nonzero within the correlation scale $L$ of the pumping, which restricts integration to $R(t) < L$. For smooth velocity, this gives $F_2(r) = |\lambda_3|^{-1}\chi(0)\ln(L/r)$ at $r < L$. For nonsmooth velocity, the statistics of scalar fluctuations at small scales is described by the set of structure functions $S_N(r) = \langle[\theta(r)-\theta(0)]^N\rangle \propto r^{\zeta_N}$ with the scaling exponents $\zeta_N$ determined by the zero modes (see Falkovich et al. (2001)). Therefore, the existence of Lagrangian statistical invariants explains the anomalous scaling of the passive scalar (here, anomaly means that scale invariance broken by pumping is not restored even when the pumping scale goes to infinity).


See also: Anomalies; Intermittency in Turbulence; Large 
Deviations in Equilibrium Statistical Mechanics; Lyapunov 
Exponents and Strange Attractors; Random Walks in 
Random Environments; Stochastic Differential Equations; 
Turbulence Theories.


Introduction


Large deviation theory (LDT) deals with the study of probabilities of extremely rare events. As an example, consider the case of independent identically distributed random variables $\sigma_1,\ldots,\sigma_N$ with the mean value $E(\sigma_i) = m$. Then the typical deviations of the sum $M_N = \sigma_1+\cdots+\sigma_N$ from its mean value $Nm$ are of the order of $\sqrt{N}$, while in LDT we study the probabilities of the deviations which are linear in $N$. In “good” cases we know that for $b > 0$

$$\Pr\{M_N - Nm > bN\} \sim \exp\{-I(b)N\} \quad\text{as } N\to\infty \qquad [1]$$

where $I(\cdot) > 0$ is the “rate” function.
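
The classical behavior [1] can be seen in the simplest case. The sketch below assumes fair $\pm1$ variables (so $m = 0$), for which the rate function is the standard Cramér expression $I(b) = \tfrac12[(1+b)\ln(1+b) + (1-b)\ln(1-b)]$; it computes the exact tail probability from binomial coefficients and compares $-N^{-1}\ln\Pr$ with $I(b)$. The function names are illustrative, and the event uses $\ge$ rather than the strict inequality of [1], which does not affect the exponential rate.

```python
import math


def rate_function(b: float) -> float:
    """Cramer rate function for a sum of fair +/-1 variables, valid for 0 <= b < 1."""
    return 0.5 * ((1.0 + b) * math.log(1.0 + b) + (1.0 - b) * math.log(1.0 - b))


def log_tail_probability(n: int, b: float) -> float:
    """Exact ln Pr{M_N >= b N} for M_N a sum of n independent fair +/-1 variables."""
    k_min = math.ceil(n * (1.0 + b) / 2.0)  # minimal number of +1 values needed
    count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return math.log(count) - n * math.log(2.0)


if __name__ == "__main__":
    b = 0.5
    for n in (100, 500, 2000):
        # The empirical rate approaches I(b) with corrections of order (ln N)/N.
        print(n, -log_tail_probability(n, b) / n, rate_function(b))
```

The subexponential prefactor is invisible on the scale of [1], which is why the comparison is made for the logarithm divided by $N$.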

Questions of LDT are very natural in statistical 
mechanics, and they have deep physical meaning, 
notwithstanding the fact that the corresponding 
events are rare. One reason is that (some) rare 
events in the grand canonical ensemble become 
typical events in the canonical ensemble. 

An interesting feature of LDT in statistical mechanics is that the behavior [1] of LD is not universal, and sometimes is replaced by a nonclassical one:

$$\Pr\{M_N - Nm > bN\} \sim \exp\{-I(b)N^{\nu}\} \qquad [2]$$

with $\nu < 1$. That usually happens in the “phase transition” regime, and then the quantity $I(b)$, as well as the exponent $\nu$, have very much to do with the geometry of a droplet of one phase formed inside the other.

Below, we will illustrate all these features on the example of the Ising model.


The Ising Model in the Finite Box 


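Our random variables $\sigma_x$ will take values $\pm1$, with $x\in\mathbb{Z}^d$. They are called spins. For every finite box $\Lambda\subset\mathbb{Z}^d$, we will define Gibbs states in $\Lambda$. To do this we need the Hamiltonians

$$H_{\Lambda,\xi}(\sigma) = -\sum_{\{x,y\}\subset\Lambda,\ |x-y|=1}\sigma_x\sigma_y \;-\; \sum_{x\in\Lambda,\ y\notin\Lambda,\ |x-y|=1}\sigma_x\xi_y$$

Here, $\xi$ is some spin configuration on $\mathbb{Z}^d$, which is called “boundary condition,” while $\sigma\in\Omega_\Lambda$ is any spin configuration in $\Lambda$.

The “grand canonical Gibbs measure” $\mu_{\Lambda,\xi,T}$ in $\Lambda$ with boundary condition $\xi$ at inverse temperature $\beta = T^{-1}$ is given by

$$\mu_{\Lambda,\xi,T}(\sigma) = Z_{\Lambda,\xi,T}^{-1}\exp(-\beta H_{\Lambda,\xi}(\sigma)) \qquad [3]$$

where

$$Z_{\Lambda,\xi,T} = \sum_{\sigma\in\Omega_\Lambda}\exp(-\beta H_{\Lambda,\xi}(\sigma))$$

is called the “partition function”; it makes the measure [3] a probability distribution.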
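
For a very small box the Gibbs measure [3] can be enumerated directly. The following sketch assumes the nearest-neighbour Hamiltonian with unit coupling, a square box in $\mathbb{Z}^2$, and a constant boundary condition $\xi\equiv\pm1$; the function names and the chosen values of the box side and $\beta$ are illustrative only.

```python
import itertools
import math


def hamiltonian(sigma: dict, xi: int) -> float:
    """Nearest-neighbour Ising energy of `sigma` (site -> +/-1) in a box of Z^2.

    Every site outside the box carries the constant boundary spin `xi`.
    """
    energy = 0.0
    for (i, j), s in sigma.items():
        for di, dj in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            nb = (i + di, j + dj)
            if nb in sigma:
                energy -= 0.5 * s * sigma[nb]  # each inner bond is visited twice
            else:
                energy -= s * xi  # bond between the box and the boundary condition
    return energy


def gibbs_measure(side: int, xi: int, beta: float) -> dict:
    """Return {configuration: probability} for a side x side box by brute force."""
    sites = [(i, j) for i in range(side) for j in range(side)]
    weights = {}
    for spins in itertools.product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, spins))
        weights[spins] = math.exp(-beta * hamiltonian(sigma, xi))
    z = sum(weights.values())  # partition function
    return {spins: w / z for spins, w in weights.items()}


def mean_spin(measure: dict) -> float:
    """Expected magnetization per site under the given measure."""
    return sum(p * sum(spins) / len(spins) for spins, p in measure.items())


if __name__ == "__main__":
    print(mean_spin(gibbs_measure(3, +1, 0.5)), mean_spin(gibbs_measure(3, -1, 0.5)))
```

Dividing by the partition function normalizes the weights, and the two boundary conditions give opposite mean magnetizations by the spin-flip symmetry of the Hamiltonian. Brute-force enumeration is feasible only for a handful of sites, since the number of configurations is $2^{|\Lambda|}$.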

The boundary condition $\xi\equiv+1$ ($-1$) will be denoted by $+$ ($-$). For every value of $T$, the Gibbs measures $\mu_{\Lambda,\pm,T}$ with $(\pm)$-boundary condition in the cubic box $\Lambda(l)$ of size $l$ converge, as $l\to\infty$, to the probability measures that we will denote by $\mu_{\pm,T}$. If the two happen to be different, then $\mu_{+,T}$ is called the $(+)$-phase, and $\mu_{-,T}$ the $(-)$-phase. That happens to be the case iff the temperature $T$ is lower than the critical temperature $T_c = T_c(d)$. The critical temperature depends on dimension; $T_c(1) = 0$, while $T_c(d) > 0$ for $d\ge2$. The expectation

$$E_{\mu_{+,T}}(\sigma_0) = m(\beta)$$

is called spontaneous magnetization; $m(\beta) > 0$ iff $T < T_c$.


LD Properties of the Gibbs States $\mu_{\Lambda(l),-,T}$


In what follows, we will discuss the LD properties of the sum $M_\Lambda = \sum_{x\in\Lambda}\sigma_x$, where the spins $\sigma_x$, $x\in\Lambda$, are distributed according to the Gibbs state $\mu_{\Lambda,-,T}$. Note that $E_{\mu_{-,T}}(\sigma_0) = -m(\beta)$.


Classical Case 


If we look at the LDs of the sum $M_\Lambda$ when the temperature $T$ is high enough (in which case the limiting states $\mu_{+,T}$ and $\mu_{-,T}$ coincide), or else if the temperature is low and the deviations are negative, that is, we consider the events $M_\Lambda + |\Lambda|m(T^{-1}) < b|\Lambda|$ with $b < 0$, then their probabilities behave classically:

There exists a (high) temperature $T_0$ such that if $T \ge T_0$ then

$$\lim_{\Lambda\to\mathbb{Z}^d}\frac{1}{|\Lambda|}\ln\Pr\{M_\Lambda + |\Lambda|m(T^{-1}) \le b|\Lambda|\} = -I_T(b) \quad\text{for } b<0 \qquad [4]$$

$$\lim_{\Lambda\to\mathbb{Z}^d}\frac{1}{|\Lambda|}\ln\Pr\{M_\Lambda + |\Lambda|m(T^{-1}) \ge b|\Lambda|\} = -I_T(b) \quad\text{for } b>0 \qquad [5]$$

where the function $I_T(b)\ge0$ is strictly convex on the segment $(m(T^{-1})-1,\ m(T^{-1})+1)$. It vanishes at only one point $b = 0$.

There exists a (low) temperature $T_1$ such that if $T < T_1$, then the relation [4] holds with the function $I_T(b)\ge0$ strictly convex on the segment $(m(T^{-1})-1,\ 0)$. The limit [5] also does exist, but it can vanish once we are in the phase transition region. In order to see some nontrivial behavior, we have to change the normalization $1/|\Lambda|$ in [5].


Nonclassical Case 


The proper normalization happens to be the surface term, $1/|\Lambda|^{(d-1)/d}$.

There exists a temperature $T_1$ such that if $T < T_1$, then

$$\lim_{\Lambda\to\mathbb{Z}^d}\frac{1}{|\Lambda|^{(d-1)/d}}\ln\Pr\{M_\Lambda + |\Lambda|m(T^{-1}) \ge b|\Lambda|\} = -W_T(b) \quad\text{for } b>0 \qquad [6]$$

The function $W_T(b)$ obeys $W_T(b) = b^{(d-1)/d}w_T$ with $w_T > 0$, provided the value $b > 0$ is not too large: $b < b(d)$, where $b(d)$ is some constant, depending on the dimension and temperature; one can show that $b(d) > 1/2^d$. For larger $b$'s the dependence is more complex.

The key object here is the constant $w_T$. To obtain it, one has to solve the following variational problem. Let $\tau_T(n)$, $n\in S^{d-1}$, be the surface tension between the $(+)$-phase and the $(-)$-phase of the Ising model at the temperature $T$. Then, for every closed compact (hyper)surface $M^{d-1}\subset\mathbb{R}^d$, we define its surface energy as

$$\mathcal{W}_T(M) = \int_M \tau_T(n_s)\,ds$$

where $n_s$ is the normal vector to $M$ at $s\in M$. The functional $\mathcal{W}_T(M)$ has the meaning of the energy of the $M$-shaped droplet of the $(+)$-phase floating in the $(-)$-phase. It is called the “Wulff functional.” Let $\mathbf{W}_T$ be the surface which minimizes $\mathcal{W}_T(\cdot)$ over all the surfaces enclosing the unit volume. Such a minimizer does exist and is unique up to translation. It is called the “Wulff shape.” The value $w_T$ is just the surface energy of the Wulff shape:

$$w_T = \mathcal{W}_T(\mathbf{W}_T)$$

The value $b(d)$ is defined as the maximal value of $b$'s for which the dilatation $b^{1/d}\mathbf{W}_T$ can fit into the unit cube. For higher values of $b$, the shape of the $(+)$-phase droplet in the cube with $(-)$-boundary condition is deformed by its walls, so its surface energy is given by a more complicated variational problem.


Moderate Deviations and the Droplet 
Condensation 


The reason behind the different order of the probabilities of the events $M_\Lambda + |\Lambda|m(T^{-1}) < b|\Lambda|$, $b<0$, and $M_\Lambda + |\Lambda|m(T^{-1}) > b|\Lambda|$, $b>0$, at low temperatures is the following. A typical configuration contributing to the first event contains many small droplets of $(+)$-spins, of size $\lesssim\ln|\Lambda|$, floating in the sea of $(-)$-spins. On the contrary, in the case of the second event a typical configuration contains, in addition to small droplets, one large droplet of the size of $\Lambda$. It has a random shape, but in the limit $\Lambda\to\mathbb{Z}^d$ that shape converges to a nonrandom one, which happens to be the Wulff shape $\mathbf{W}_T$. (The precise meaning of that statement depends on dimension; in case $d=2$ the convergence holds in the Hausdorff metric, while in higher dimensions it is known only in the $L^1$ sense.)

That statement makes the following question natural: consider the event

$$M_\Lambda - E(M_\Lambda) \ge |\Lambda|^{\alpha}, \qquad 0<\alpha<1$$

For which $\alpha$ should we expect, in addition to microscopic $(+)$-droplets of size $\lesssim\ln|\Lambda|$, the formation of a large droplet, of volume $\sim|\Lambda|^{\alpha}$, in a corresponding typical configuration? In other words, how many extra $(+)$-spins should we pump into our system in order for the microscopic droplets to condense into a macroscopic one? (In the formulation of this question, we have to use the expectation $E(M_\Lambda)$ instead of the asymptotically equivalent quantity $-|\Lambda|m(T^{-1})$. The difference, $E(M_\Lambda) + |\Lambda|m(T^{-1}) \sim O(|\partial\Lambda|)$, being irrelevant in the LD case, becomes significant here.)

The answer is the following:

• if $\alpha < d/(d+1)$, then a typical configuration contains only microscopic droplets;

• if $\alpha > d/(d+1)$, then any typical configuration contains, in addition to microscopic droplets, one large droplet of volume $\sim|\Lambda|^{\alpha}$.


Therefore, the condensation happens at the value $\alpha = d/(d+1)$. This picture has its counterpart in the behavior of the probabilities of “moderate deviations” (MD), that is, events when $M_\Lambda + |\Lambda|m(T^{-1}) \ge |\Lambda|^{\alpha}$:

• if $\alpha < d/(d+1)$, then the deviation is due to independent fluctuations of sizes of many small droplets, and the usual Gaussian behavior holds:

$$\Pr\{M_\Lambda - E(M_\Lambda) \ge |\Lambda|^{\alpha}\} \sim \exp\left\{-\frac{(|\Lambda|^{\alpha})^2}{2\,\mathrm{Var}(M_\Lambda)}\right\} \sim \exp\{-c|\Lambda|^{2\alpha-1}\}$$

• if $\alpha > d/(d+1)$, then the deviation is due to the formation of a large droplet, and so

$$\Pr\{M_\Lambda - E(M_\Lambda) \ge |\Lambda|^{\alpha}\} \sim \exp\{-c|\Lambda|^{\alpha(d-1)/d}\}$$

Note that the two estimates match at $\alpha = d/(d+1)$.
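
The matching of the two regimes is a one-line computation, checked here with exact rational arithmetic. The sketch assumes the Gaussian estimate scales as $\exp\{-c|\Lambda|^{2\alpha-1}\}$ and the droplet estimate as $\exp\{-c|\Lambda|^{\alpha(d-1)/d}\}$; the function names are illustrative.

```python
from fractions import Fraction


def gaussian_exponent(alpha: Fraction) -> Fraction:
    """Power of |Lambda| in the Gaussian (many small droplets) estimate."""
    return 2 * alpha - 1


def droplet_exponent(alpha: Fraction, d: int) -> Fraction:
    """Power of |Lambda| in the surface-energy (one large droplet) estimate."""
    return alpha * Fraction(d - 1, d)


def threshold(d: int) -> Fraction:
    """Condensation threshold alpha = d / (d + 1)."""
    return Fraction(d, d + 1)


if __name__ == "__main__":
    for d in (2, 3, 4):
        a = threshold(d)
        print(d, a, gaussian_exponent(a), droplet_exponent(a, d))
```

Both exponents equal $(d-1)/(d+1)$ at the threshold. Below it the Gaussian exponent is the smaller one and above it the droplet exponent is, so in each regime the mechanism with the larger probability is the one realized.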


Other Questions 


There are many related questions; some are partially 
solved, others are widely open, if considered on a 
rigorous mathematical level. 

One can ask about the asymptotic behavior of probabilities of the events like

$$M_\Lambda - E(M_\Lambda) = b_\Lambda$$

where the values $b_\Lambda$ lie in the LD or MD region. The difference between such questions and those treated above is of the same nature as the difference between the integral and the local limit theorems. Partial answers to them are given in Dobrushin and Shlosman (1994).

Many results about the Wulff shape and its relation to the Ising model are known, starting with Dobrushin et al. (1992). Some are still challenging. One such question concerns the so-called roughening phase transition. It is known rigorously that the Wulff shape $\mathbf{W}_T$ in the $d\ge3$ Ising model has flat facets at low temperatures $T$. It is believed that such a feature holds true only for $T < T_R$, where the roughening temperature $T_R$ is strictly less than the critical temperature $T_c(d)$ for $d = 3$. At the temperatures $T\in(T_R, T_c(3))$, the Wulff shape $\mathbf{W}_T$ does not have facets. This conjecture seems to be very difficult.

The question about the typical behavior of the MD of the Ising model at the threshold value $M_\Lambda - E(M_\Lambda) \sim |\Lambda|^{d/(d+1)}$ was recently answered in Biskup et al. (2003).


Further Reading


Biskup M, Chayes L, and Kotecký R (2003) Critical region for droplet formation in the two-dimensional Ising model. Communications in Mathematical Physics 242: 137-183.

Dobrushin RL and Shlosman SB (1994) Large and moderate deviations in the Ising model. In: Dobrushin RL (ed.) Probability Contributions to Statistical Mechanics, Advances in Soviet Mathematics, vol. 18, pp. 91-220. Providence, RI: American Mathematical Society.

Dobrushin RL, Kotecký R, and Shlosman SB (1992) Wulff Construction: A Global Shape from Local Interaction. AMS Translations Series. Providence, RI: American Mathematical Society.


Introduction


Topological strings have been well studied since they were introduced in the early 1990s. Essentially, they are simplified string theories that capture the information about a sector of the full (or “physical”) string theory. Thus, while sharing many of the structural features of usual string theory, they hold out the possibility of being amenable to explicit calculations. This is especially true with regard to stringy quantum corrections (the higher genus contributions from the point of view of the string world sheet), which are normally rather intractable in the full physical string theory. This has allowed them to play a useful role in enhancing the understanding of string theory and many of its mysterious quantum properties, such as the various dualities.

In particular, in the last several years, topological strings have served as an important laboratory for testing and understanding the connection between the large-N expansion of gauge theories and closed-string theories. In this article we will sketch how this connection is illustrated in a duality between large-N Chern-Simons gauge theory and closed topological string theories. We will survey the origin and current status of these developments and indicate some of their remarkable mathematical ramifications.


Background 


In order to appreciate the conjecture relating the 
Chern-Simons theory and topological string the- 
ories, we need to go back to the seminal work of 
't Hooft, who pointed to the connection between the 
large-N expansion of gauge field theories and string 
theories. 

The starting point is a gauge field theory (with, say, gauge group $U(N)$), where we take the limit of the rank $N$ of the gauge group to infinity (see Brezin and Wadia (1993) for a collection of papers on the topic). The idea is then to make an expansion in inverse powers of $N$ for various observables such as the free energy and correlation functions. For definiteness, let us take a gauge theory containing only gauge fields $A$ in the adjoint representation of $U(N)$. The quantum theory is (schematically) defined by the path integral

$$Z = \int [DA]\,e^{-\frac{1}{\kappa}S(A)} \qquad [1]$$

For now, the action $S(A)$ for the gauge fields is left unspecified. It could be either the usual Yang-Mills functional or of the Chern-Simons form which we describe below. $S(A)$ is normalized in such a way that the gauge coupling constant, denoted by $\kappa$, only appears via an overall multiplicative factor of $1/\kappa$.

Then the expression, for instance, for the free energy $F = \ln Z$ has an expansion in a power series in $\kappa$, whose individual terms are given by the usual Feynman diagrammatic rules. Namely, we have a sum over connected vacuum diagrams (those without any external legs) formed from the vertices determined by the action $S(A)$. Even without going into the details of the action, we can write down the dependence on $N$ and $\kappa$ coming from a diagram with $h$ faces, $V$ vertices, and $E$ edges. Every edge is associated with a propagator (arising from the inverse of the quadratic term in $S(A)$) and thus comes with a weight of $\kappa$. Every vertex, coming from the cubic and higher-order terms in $S(A)$, comes with a factor of $\kappa^{-1}$. There is a factor of $N$ coming from summing over the color indices that circulate in every loop (face). We thus get a weight of $N^h\kappa^{E-V}$ and so the total contribution to the free energy can be organized as

$$F = \sum_{g=0,\,h=1}^{\infty}C_{g,h}\,N^h\kappa^{2g-2+h} = \sum_{g=0,\,h=1}^{\infty}C_{g,h}\,N^{2-2g}\lambda^{2g-2+h} \qquad [2]$$

Here we have defined $\lambda = \kappa N$, the 't Hooft coupling, as the combination that will be kept fixed when taking the limit of large $N$. We have also used the fact that $V - E + h = 2 - 2g$, where $g$ is the number of handles of the closed two-dimensional surface one can associate with the Feynman diagram. (It is best to visualize the Feynman diagram as a “fatgraph” which forms the skeleton of a closed Riemann surface.) The coefficients $C_{g,h}$ represent the sum of the contributions from all genus $g$ diagrams with $h$ boundaries and depend on the details of the theory.
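
The power counting behind eqn [2] can be sketched in a few lines. The function below takes the numbers of vertices, edges, and faces of a connected vacuum fatgraph, determines the genus from $V - E + h = 2 - 2g$, and returns the powers of $N$ and of $\lambda = \kappa N$ obtained by rewriting the weight $N^h\kappa^{E-V}$. The function name and return convention are assumptions made for this illustration.

```python
def thooft_weight(vertices: int, edges: int, faces: int) -> tuple:
    """Return (genus, power of N, power of lambda) for a connected vacuum fatgraph.

    Uses the weight N**h * kappa**(E - V) with lambda = kappa * N, so that
    N**h * kappa**(E - V) = N**(h + V - E) * lambda**(E - V).
    """
    euler = vertices - edges + faces
    if euler > 2 or euler % 2 != 0:
        raise ValueError("V - E + h must equal 2 - 2g for some genus g >= 0")
    genus = (2 - euler) // 2
    power_of_n = faces + vertices - edges  # equals 2 - 2g
    power_of_lambda = edges - vertices     # equals 2g - 2 + h
    return genus, power_of_n, power_of_lambda


if __name__ == "__main__":
    print(thooft_weight(2, 3, 3))  # planar "theta" diagram with two cubic vertices
    print(thooft_weight(2, 3, 1))  # the same vertices and edges drawn on a torus
```

The planar diagram contributes at order $N^2\lambda$, while the nonplanar one, with a single face, is suppressed to order $N^0\lambda$. This is the sense in which $1/N$ counts handles.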

We note that the reorganization of the contributions to the free energy is reminiscent of the genus expansion in a string theory. In fact, eqn [2] as it stands looks like an open-string expansion on world sheets with $g$ handles and $h$ boundaries. Indeed, in many cases the gauge theory arises as a limit of an open-string theory. (Recall that a massless nonabelian gauge boson is one of the low-lying excitations of an open-string theory.) So the double expansion in terms of $g$ and $h$ is not too surprising.

However, the interesting conjecture of 't Hooft is in the relation to closed-string theory. Note that the expansion in inverse powers of $N$ depends only on the number of handles $g$. In fact, $1/N$ seems to play the role of closed-string coupling in that it suppresses higher genus diagrams. The total contribution to a given genus $g$ comes from summing over all the holes $h$ in eqn [2], for example,

$$F = \sum_{g=0}^{\infty}N^{2-2g}F_g(\lambda) \qquad [3]$$

The conjecture is to identify this with a closed-string expansion in which $F_g(\lambda)$ is a closed-string amplitude on a genus $g$ Riemann surface. (In carrying out the sum over the holes, we have assumed the existence of a radius of convergence. This is plausible since the number of planar diagrams ($g = 0$), for instance, grows only exponentially with the number of holes.) The question, since 't Hooft, has been: what is this closed-string theory? In other words, what is the background on which the closed string propagates?

A breakthrough came from Maldacena's identification of the background for the particular case of $U(N)$ $\mathcal{N}=4$ supersymmetric Yang-Mills theory. His conjecture was that this theory is dual to type IIB closed-string theory on $\mathrm{AdS}_5\times S^5$ with a curvature scale set by $\lambda$ and with closed-string coupling $\propto\lambda/N$. This proposal passed a number of nontrivial checks and is widely held to be true. It also stimulated the search for closed-string duals to other large-N gauge theories.

In what follows, we explain how the conjecture of 't Hooft has a nice realization in the case of three-dimensional $U(N)$ Chern-Simons gauge theory on $S^3$. The dual closed-string theory, obtained by summing over the holes, turns out to be the A-model topological string on the (six-dimensional) resolved conifold background. The parameter $\lambda$ maps into a Kähler parameter in the closed-string geometry and once again the closed-string coupling is $\propto\lambda/N$.


The Large-N Expansion of Chern-Simons 
Theory 


Nonabelian Chern-Simons theory is based on the 
following action functional for the U(N) gauge 
connection A: 


$$S_{\rm CS}(A) = \frac{k}{4\pi}\int_M \mathrm{tr}\left(A\wedge dA + \tfrac{2}{3}A\wedge A\wedge A\right) \qquad [4]$$

Here $M$ is a three-dimensional manifold; $k$ is called the level and is integer quantized for the path-integral equation [1] to be single valued. Note that, classically, $\kappa$ as defined earlier is proportional to $1/k$. One of the nice properties of $S_{\rm CS}(A)$ is that it is independent of the metric on $M$, unlike the Yang-Mills functional. Thus, it is a prototype of a topological field theory. In fact, the observables in this theory capture topological information about the 3-manifold $M$.

Witten succeeded in quantizing the Chern-Simons theory by relating its Hilbert space to the space of conformal blocks in the two-dimensional $U(N)$ WZW theory (for more details on the quantization, see Chern-Simons Models: Rigorous Results). Here, merely the answers for various observables in the theory will be quoted. In particular, the free energy for the theory on $S^3$ can be written in a completely explicit form:

$$Z(S^3,N,k) = \exp F(S^3,N,k) = \frac{1}{(N+k)^{N/2}}\prod_{j=1}^{N-1}\left(2\sin\frac{j\pi}{N+k}\right)^{N-j} \qquad [5]$$

One of the features one observes in the quantization is the shift (“finite renormalization”) of the effective level from $k$ to $k+N$. This can also be seen in perturbation theory. Consequently, while taking the large-N limit, the natural quantity to be held fixed as the 't Hooft coupling is $\lambda = 2\pi N/(k+N)$.
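
For small $N$ and $k$ the closed form can be evaluated directly. The sketch below assumes the partition function has the form $Z = (N+k)^{-N/2}\prod_{j=1}^{N-1}[2\sin(j\pi/(N+k))]^{N-j}$ and the 't Hooft coupling is $\lambda = 2\pi N/(k+N)$; the function names are illustrative.

```python
import math


def log_partition_function(n: int, k: int) -> float:
    """F = ln Z for U(N) Chern-Simons theory on the 3-sphere, from the closed form."""
    total = -0.5 * n * math.log(n + k)
    for j in range(1, n):
        total += (n - j) * math.log(2.0 * math.sin(j * math.pi / (n + k)))
    return total


def thooft_coupling(n: int, k: int) -> float:
    """'t Hooft coupling lambda = 2 pi N / (k + N), built from the shifted level."""
    return 2.0 * math.pi * n / (k + n)


if __name__ == "__main__":
    for n, k in ((1, 3), (2, 1), (10, 90)):
        print(n, k, thooft_coupling(n, k), log_partition_function(n, k))
```

Working with $\ln Z$ rather than $Z$ avoids underflow when $N$ is large, since the product contains of order $N^2$ factors smaller than one at weak coupling.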

We can then carry out the 't Hooft expansion in powers of $\lambda$ and $1/N$ of expressions such as the free energy in eqn [5]:

$$F = \frac{N^2}{2}\left(\ln\lambda - \frac{3}{2}\right) - \frac{1}{12}\ln N + \zeta'(-1) + \sum_{g=2}^{\infty}\frac{B_{2g}}{2g(2g-2)}N^{2-2g} + \sum_{g\ge0}\sum_{h}F_{g,h}\,N^{2-2g}\lambda^{2g-2+h} \qquad [6]$$

The coefficients $F_{g,h}$ are nonzero only for even $h$ and are given by

$$F_{0,h} = -\frac{2\zeta(h-2)}{(2\pi)^{h-2}(h-2)h(h-1)}$$

$$F_{1,h} = \frac{1}{6}\,\frac{\zeta(h)}{(2\pi)^{h}\,h}$$

$$F_{g,h} = 2\,\frac{\zeta(2g-2+h)}{(2\pi)^{2g-2+h}}\binom{2g-3+h}{h}\frac{B_{2g}}{2g(2g-2)} \qquad [7]$$

where the last line is for $g > 1$. $B_{2g}$ are the Bernoulli numbers. The first few terms in eqn [6] are nonperturbative contributions which do not have a Feynman-diagram interpretation. The power series in $\lambda$ is, on the other hand, of the same form as eqn [2]. In fact, there is an open-string interpretation for these terms which will be considered later.

Given the explicit form of the answer, we can 
carry out the summation over the holes b. Using 
some resummation techniques, we find 


F=} (-i xj) * ^ E. [8] 


with $t = i\lambda$ and


—1)5|B54 B», 7| 
2g(2g —2)(2g — 2)! 


=, 2g—3 e" 
gQg — mu yl d 9) 


(This expression is for $g > 1$. There are very similar expressions for genus 0 and 1 as well.) With the identification of the string coupling $g_s = -it/N$, the $F_g(t)$ actually turn out to be the genus $g$ amplitudes of a closed topological string, in line with the general expectation of the previous section. This is explained in the following.


F,(t) = 


Topological Strings 


Physical strings are defined in terms of a two-dimensional sigma model (the theory on the world sheet) made reparametrization invariant by coupling to two-dimensional gravity. Topological strings are simpler versions of this, where the world-sheet theory is a two-dimensional topological sigma model. The latter is defined in terms of a sigma model (usually with $\mathcal{N}=2$ superconformal symmetry) with an additional twist which drastically cuts down the physical states to a subset of the low-lying modes. There are actually two inequivalent twists denoted by A and B, respectively, but we will restrict to the A twist in this article. One of the simplifications of the A twisted sigma model is that the path integral localizes to contributions from only holomorphic maps from the world sheet to the target space (which will be taken to be a Calabi-Yau 3-fold). Also, all the observables in the theory depend only on the Kähler parameters of the target space and not the complex structure parameters (see Topological Sigma Models as well as the book by Hori et al. (2003) for more details).

The topological string theory is defined by an 
appropriate integration of the observables of the 
topological sigma model over the moduli space of 
the world-sheet Riemann surface. For instance, the 
free energy of the string theory at genus g is given by 


$$F_g(t) = \int_{\mathcal{M}_g} \Big\langle \prod_{i=1}^{6g-6} (b, \mu_i) \Big\rangle_X \qquad [10]$$


Here b is one of the reparametrization ghost fields
on the world sheet and $\mu_i$ are Beltrami differentials.
The averaging is with respect to the world-sheet
sigma model for the Calabi-Yau target X, as the
subscript indicates. We have also shown the dependence
of $F_g$ on the Kähler parameters of X,
collectively denoted by t. The localization to the
holomorphic maps in the path integral implies that
$F_g^{\mathrm{top}}(t)$ takes the generic form

$$F_g^{\mathrm{top}}(t) = \sum_{\beta} N_{g,\beta} \prod_i q_i^{s_i} \qquad [11]$$

Here $q_i = e^{-t_i}$ and $s_i$ are the integer coefficients
labeling the element $\beta \in H_2(X)$. This is in the same
basis of 2-cycles of $H_2(X)$ in terms of which the
complex Kähler parameters $t_i$ are expressed. (Recall
that in string theory the Kähler parameters are
complexified because of the presence of an additional
2-form field.) The $N_{g,\beta}$ are the Gromov-Witten
invariants for X and are in general rational
numbers. For nonzero $\beta$, the corresponding terms
are often called world-sheet instanton contributions
since they correspond to topologically nontrivial
maps from the world sheet to 2-cycles in the target
space. The all-genus free energy of the topological
string is defined to be

$$F^{\mathrm{top}}(t, g_s) = \sum_{g=0}^{\infty} g_s^{2g-2} F_g^{\mathrm{top}}(t) \qquad [12]$$

with $g_s$ being the string coupling.

Since topological strings are related to physical
strings by a twist on the world sheet, it is natural
that topological string computations are related to
computations in the physical string theory. In fact,
as shown by Antoniadis, Gava, Narain, and Taylor
as well as Bershadsky, Cecotti, Ooguri, and Vafa,
observables such as $F_g^{\mathrm{top}}(t)$ are related to special
superpotential terms in the type II string compactification
on the Calabi-Yau X. Using duality to
M-theory, these answers were reinterpreted by
Gopakumar and Vafa in terms of contributions
coming from BPS states of wrapped D-branes. This
gives a completely different perspective on topological
strings. For instance, the all-genus free energy
can naturally be reorganized as

$$F^{\mathrm{top}}(t, g_s) = \sum_{g=0}^{\infty} \sum_{\beta} \sum_{d=1}^{\infty} n_\beta^g\, \frac{1}{d} \left(2 \sin\frac{d g_s}{2}\right)^{2g-2} q^{d\beta} \qquad [13]$$

where $q^{\beta} = \prod_i q_i^{s_i}$ and the $n_\beta^g$ are integer invariants
(Gopakumar-Vafa) since they count the number of BPS states.
This will prove to be useful in extracting all-genus
answers for topological string amplitudes, which is
normally quite difficult using the perturbative
definition given earlier.


The Large-N Dual to Chern-Simons 
Theory 


We are now in a position to state the duality
(Gopakumar and Vafa 1999) between large-N
Chern-Simons theory and topological strings in a
precise way. The conjecture is that the closed
topological string theory on the $S^2$ resolved conifold
geometry is exactly dual to the U(N) Chern-Simons
theory on $S^3$. The resolved conifold geometry is a
noncompact Calabi-Yau 3-fold described by the
equation

$$xy - zw = 0 \qquad [14]$$

where the singularity is resolved by a 2-sphere
$x = \rho z$, $w = \rho y$. The resulting space can thus be
characterized as an $O(-1) \oplus O(-1)$ bundle over $P^1$.
It has a single Kähler parameter t for the nontrivial
2-cycle of the $S^2$. In addition, the string theory is
characterized by the string coupling $g_s$. These
parameters map on the gauge theory side to the
't Hooft parameter $\lambda$ and N via the dictionary

$$t = i\lambda, \qquad g_s = \frac{\lambda}{N} \qquad [15]$$

This conjecture can be checked by comparing
various exact calculations in the Chern-Simons
theory with corresponding calculations in the topological
string on this conifold background. The use
of the duality to M-theory enables us to make exact
computations on this side as well. One of the
nontrivial checks of this duality comes from a
comparison of the free energies. In eqns [8] and
[9], we already have carried out the sum over the
holes in the Chern-Simons theory and organized it
as a closed-string genus expansion. Note that these
expressions are already of the form [11] expected of
a closed topological string. One simply has to check
that it is indeed that on the $S^2$ resolved conifold.

In the language of the integer invariants $n_\beta^g$, the $S^2$
resolved conifold is particularly simple. The only
nonzero invariant is $n_1^0 = 1$. Physically, this corresponds
to a single brane wrapped on the genus-zero
$S^2$. Putting this into eqn [13], and making the
expansion in powers of $g_s$, we find exactly eqn [9]
for the genus-g contribution to the free energy. This
is quite a remarkable agreement and represents a
triumph for the ideas of large-N duality.
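
The expansion step described above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original article: it assumes that the single-brane contribution to eqn [13] is $\sum_d \frac{1}{d}\,(2\sin(d g_s/2))^{-2} e^{-dt}$, and it verifies only the t-dependent (instanton) part of eqn [9], namely that the coefficient of $x^{2g-2}$ in $(2\sin(x/2))^{-2}$ equals $|B_{2g}|/(2g(2g-2)!)$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-acc / (m + 1))
    return B


def inverse_sine_squared_coeffs(g_max):
    """Coefficients b_g with 1/(2 sin(x/2))^2 = sum_g b_g x^(2g-2), g = 0..g_max."""
    # (2 sin(x/2))^2 = 2 - 2 cos x = x^2 * sum_k a_k x^(2k)
    a = [Fraction(2 * (-1) ** k, factorial(2 * k + 2)) for k in range(g_max + 1)]
    # Invert the power series in x^2 term by term.
    b = [1 / a[0]]
    for n in range(1, g_max + 1):
        b.append(-sum(a[k] * b[n - k] for k in range(1, n + 1)) / a[0])
    return b


def genus_prefactor(g, B):
    """|B_2g| / (2g (2g-2)!), the prefactor of the instanton sum in eqn [9]."""
    return abs(B[2 * g]) / (2 * g * factorial(2 * g - 2))


if __name__ == "__main__":
    g_max = 6
    B = bernoulli_numbers(2 * g_max)
    b = inverse_sine_squared_coeffs(g_max)
    for g in range(1, g_max + 1):
        print(g, b[g], genus_prefactor(g, B))
```

Since each term of the sum over d carries $\frac{1}{d}\,b_g\,(d g_s)^{2g-2} e^{-dt}$, agreement of the two columns reproduces the factor $n^{2g-3} e^{-nt}$ and its prefactor in eqn [9]. The constant term of eqn [9] is not addressed by this sketch.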


Geometric Transitions and Large-N 
Duality 


To understand the reason for this duality a bit
better, we utilize an old observation of Witten that
Chern-Simons theory is an open topological string
theory. As mentioned earlier, the expansion [2] (or
[6]) is suggestive of an open-string expansion in
terms of handles and holes. Witten observed that
the theory of open topological strings on the noncompact
3-fold $T^*M$ (with Dirichlet boundary conditions on M for
the end points of the string) is Chern-Simons theory
on M. In fact, in the modern language of D-branes,
we would say that U(N) Chern-Simons theory is the
world-volume theory of N D-branes wrapped on M,
for the topological A-model on $T^*M$.

In particular, Chern-Simons theory on $S^3$ is the
theory of branes wrapped on $S^3$ inside $T^*S^3$. The
latter is the conifold geometry but now deformed by
a nonzero size $S^3$. It is described by the equation

$$xy - zw = \mu \qquad [16]$$

where $\mu$ is the deformation which parametrizes the
size of the $S^3$.

The above large-N duality can be considered as an
open-closed string duality. Namely, the theory
of open A-model topological strings on the $S^3$
deformed conifold (with N D-branes) is dual to closed
A-model topological strings on the $S^2$ resolved
conifold. Cast in this way, we see that the duality
involves a transition in the background geometry in
going from the open-string to the closed-string
description. The sum over the holes changes the
background. The $S^3$, as it were, shrinks to zero size
and a transverse $S^2$ opens up. This geometric
transition makes the connection between the
Chern-Simons theory and the closed topological string
somewhat less mysterious. Maldacena's conjecture for
super Yang-Mills involves a similar passage from
D-branes in flat space to a closed-string theory on
anti-de Sitter space. In fact, it appears as if the best way
to understand 't Hooft's idea in generality is to think of
it as an open-closed string duality.


Further Checks and Consequences 


The free energy is not the only gauge-invariant
observable in Chern-Simons theory. One important
class of observables, which played an important role
in the connection with knot invariants, are the
Wilson loop expectation values. Given a knot K in
$S^3$, we can define, in terms of an arbitrary
representation R of U(N), the trace of the holonomy
around the knot averaged with respect to the Chern-Simons
path-integral measure:

$$W_R(K) = \Big\langle \mathrm{tr}_R \Big( P \exp i \oint_K A \Big) \Big\rangle \qquad [17]$$

P denotes path ordering. Similarly, we can also
define the expectation values of links: products of
traces of holonomies around various interlinked
paths. The nonperturbative solution of Chern-Simons
theory gives exact answers for the expectation
values of these Wilson loops. The discussion
below is, however, confined to knots.

Since the trace of holonomies is being considered
in different representations, it makes sense to study
the generating functional

$$Z(U, V) = \sum_R \mathrm{tr}_R(U)\, \mathrm{tr}_R(V) = \exp\left[ \sum_{n=1}^{\infty} \frac{1}{n}\, \mathrm{tr}\, U^n\, \mathrm{tr}\, V^n \right] \qquad [18]$$

The source V here is a U(M) matrix, unrelated to the
U(N) holonomy U around K. The second equality in
[18] follows from use of the Frobenius formula. It
was shown by Ooguri and Vafa that this generating
functional is the natural object from the point of
view of the open-closed string duality.
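
As a quick sanity check of the second equality in [18], consider the simplest case. This is an illustration added here, not taken from the original: it assumes N = M = 1, so that U = u and V = v are single numbers, and that the sum over R runs over Young diagrams. Only one-row diagrams then contribute, with $\mathrm{tr}_R(U) = u^k$ for the row of length k, and both sides reduce to a geometric series (as formal power series in uv):

```latex
\sum_{k=0}^{\infty} u^k v^k
  = \frac{1}{1-uv}
  = \exp\bigl(-\log(1-uv)\bigr)
  = \exp\left(\sum_{n=1}^{\infty} \frac{1}{n}\, u^n v^n\right).
```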

We have already mentioned that the U(N) Chern-Simons
theory can be thought of as the theory of N
topological D-branes wrapped on the Lagrangian $S^3$
cycle inside $T^*S^3$. For a knot K in the $S^3$, we consider
another Lagrangian 3-cycle $C_K$ in $T^*S^3$ which
intersects the $S^3$ exactly in K. A canonical construction
for $C_K$ is

$$C_K = \Big\{ (q(s), p) \in T^*S^3 \;\Big|\; \sum_i p_i \dot{q}_i = 0 \Big\} \qquad [19]$$

where the knot K is parametrized by the closed curve
q(s). By construction, $C_K$ intersects the $S^3$ in K.
Now consider M D-branes wrapped on $C_K$. One now
has to consider the fields coming from the strings
stretching between the two sets of branes. One can
show that integrating out these fields (which are in the
bifundamental of the product group U(N) x U(M))
modifies the original Chern-Simons action to

$$S_{\mathrm{eff}}(A) = S_{CS}(A) + \sum_{n=1}^{\infty} \frac{1}{n}\, \mathrm{tr}\, U^n\, \mathrm{tr}\, V^n \qquad [20]$$

Here V is the holonomy around K of the U(M)
gauge field A. Thus, this configuration of M probe
branes gives rise exactly to the generating function
eqn [18] for Wilson loops of K.

The geometric transition which relates the Chern-Simons
theory to the closed-string theory now
suggests what one needs to do to compute this
generating function on the closed-string side. We
have to follow the configuration of the M probe
branes on $C_K$ through the conifold transition in
which the $S^3$ shrinks and one blows up the $S^2$. It is
not easy in general to figure out the Lagrangian
cycle $\tilde{C}_K$ which results from following $C_K$ through
the transition. It has only been done in a class of
knots including the simple unknot. But assuming we
know $\tilde{C}_K$, the generating function for Wilson loops
is given by the free energy on the $S^2$ resolved
conifold in the presence of M probe branes on $\tilde{C}_K$.
This requires one to know more than the closed-string
partition function computed earlier. We now
also need to compute amplitudes for world sheets
with boundary on $\tilde{C}_K$. These are called open-string
Gromov-Witten invariants and the study of this
subject is in its infancy. For simple knots such as the
unknot, for which $\tilde{C}_K$ is known, these can be
computed. One finds again a remarkable agreement
with the nonperturbative answers of Chern-Simons
theory. Thus, the computation of knot invariants
gets related to open-string Gromov-Witten invariants.
There have been a number of other tests
involving more general knots and links. One also
has to be careful of subtleties such as in the choice
of framing. The reader is referred to the articles
by Marino (2002, 2004) for these topics.


Conclusions 


The large-N duality of 't Hooft is realized in Chern- 
Simons theory in a very explicit way. Thanks to the 
analytic control we have over both Chern-Simons 
theory as well as closed topological strings, the 
conjecture passes very nontrivial checks that extend
to the all-genus case. This is more than we can do in the
AdS/CFT conjecture, where most computations are
at tree level in the supergravity limit. In contrast,
here we see the essential stringiness of the closed- 
string dual to Chern-Simons theory. 

Also, by viewing it as an open-closed string 
duality, many aspects of the correspondence were 
clarified. It, therefore, provides a useful toy model 
for a general understanding of open-closed string 
duality. Indeed, a proof of this duality using world 
sheet techniques has been proposed by Ooguri and 
Vafa. One would like to carry over some of the 
intuition that operates in this duality to the case of 
other physically interesting gauge theories. 

From the mathematical point of view, as already 
indicated, this duality leads to previously unsuspected 
relations between Gromov-Witten invariants and 
invariants of 3-manifolds, including those of knots. 
In fact, by considering more general geometric 
transitions and using this duality locally, one can 
learn about all-genus topological string amplitudes 
for a wide class of noncompact toric geometries. This 
line of development culminated in the formulation of 
the topological vertex by Aganagic, Klemm, Marino, 
and Vafa, which captures the essence of the 
topological closed-string amplitudes for noncompact 
toric geometries. As in the case of the general 
correspondence between the gauge theory and grav- 
ity, this duality sheds new light on both sides of the 
equation. We learn to see new integrality properties 
in knot and 3-manifold invariants which have an 
interpretation in terms of enumerative problems in 
3-folds. The surprises that such a deep connection 
presages have not yet been exhausted. 
Models: Rigorous Results; Duality in Topological
Introduction


Gopakumar and Vafa (1999) conjectured that U(N)
Chern-Simons gauge theory on $S^3$ is dual, for large
values of N, to a closed topological string theory
on a suitable Calabi-Yau 3-fold X. They suggested
that this duality is realized by a geometric “transition,”
a topological surgery which can be realized by
birational contractions followed by the complex 
deformations of Calabi-Yau varieties. Here we will 
give some general comments on the history of this 
conjecture and then present some of its mathema- 
tical implications; we will focus on the geometric 
transition and the novel mathematics that it has 
generated. 

A duality relating gauge theories and string 
theories (with gravity) was first conjectured by 
't Hooft (1974). In 1998 Maldacena conjectured a 
duality between Yang-Mills gauge theory with 
N=4 SUSY on a four-dimensional manifold M 
and IIB type closed string on the anti-de Sitter space 
$AdS_5 \times S^5$. Chern-Simons theory is a three-dimensional
theory and purely topological, hence it
is in principle simpler than four-dimensional Yang- 
Mills theory, which also involves a metric. 

In this survey, we discuss the IIA open/closed 
dualities: we will mostly be concerned with the partition
function, that is, we will be working in the context of
“topological strings.” The duality has been extended to
a duality of strings, adding fluxes on the closed sector 
and branes on the open sector. There is much 
mathematical evidence supporting the conjecture. 


Overview 


The conjecture says that U(N) Chern-Simons gauge
theory on $S^3$ is dual, for large values of N, to type
IIA closed topological string theory on a suitable
Calabi-Yau manifold X. A starting point for the
geometry, and its mathematical implications, is that
$S^3$ can be thought of as a vanishing cycle in a local
Calabi-Yau manifold $Y = T^*S^3$, which deforms to a
singular Calabi-Yau $Y_0$; X is a Calabi-Yau birational
resolution of $Y_0$. X and Y are related by a
geometric transition. In fact, Witten showed that
quantum Chern-Simons theory on $S^3$ can be thought
of as open IIA (with U(N) branes) on $Y = T^*S^3$; thus,
a more general conjecture says, loosely speaking,
that open IIA theory on a Calabi-Yau manifold Y is
dual, for large N, to closed IIA on a Calabi-Yau X
which is related to Y via a geometric transition. A
consequence of a physics “duality” is a matching of
the free energies of the dual theories. In this
particular case, if the conjecture is true, the Chern-Simons
free energy $Z(S^3, U(N))$ should determine,
and be determined by, the closed prepotential
$F_{cl}(X, t)$. Note that $Z(S^3, U(N))$ is purely topological,
and that $F_{cl}(X, t)$ includes all genera, as we will
discuss later. A mathematical application is computing
Gromov-Witten invariants for higher genus via
large-N dualities (Marino 2004). Another consequence
involves the matching of the observables in $S^3$
and X.

This conjecture is now supported by a vast 
amount of evidence. Vafa, Gopakumar and Ooguri
noted, via a string-theory analysis, that topological
and knot invariants of $S^3$ (computed through U(N)
Chern-Simons theory on $S^3$) determine, and are
determined by, for large N, the Gromov-Witten
invariants of X in a neighborhood of the exceptional
locus of the birational contraction $X \to Y_0$.

The extension to the full string theory would say 
that open string of type IIA compactified on a
Calabi-Yau manifold Y with branes is conjectured 
to be dual to closed string of type IIA compactified 
on a Calabi-Yau manifold X with fluxes, if X and Y 
are related by a geometric transition. 

A mathematical consequence of this statement 
is that the closed Gromov-Witten invariants of X
agree, with a suitable identification of the parameters,
with combinations of open Gromov-Witten
invariants and knot invariants of Y. This has been
shown to hold for some classes of examples. 

This circle of ideas has stimulated much work in 
physics and mathematics on the nature of the 
mathematical correspondence behind this duality, as 
well as the properties of the enumerative and topological
invariants involved. The “mirrors” of the above
transitions have been studied in a series of papers,
starting with the work of Dijkgraaf and Vafa (2002). 

The mathematics behind the open/closed dualities 
is still not understood: it is reasonable to speculate 
that the natural setup is a framework of symplectic 
field theory. 

We shall start by discussing the principal topics 
of this large-N duality: Chern-Simons quantum field 
theory, IIA closed prepotential (and Gromov-Witten
invariants), and Chern-Simons as open string (and 
IIA open prepotential). Next we shall study the 
geometric transitions and conclude with some 
mathematical predictions of the duality. 
We shall not discuss some other interesting
implications of this duality. For example, we shall
not discuss its mirror IIB duality: it is known that
the part of the closed prepotential in IIA corresponding
to rational curves can be expressed as its IIB
mirror dual with periods over certain suitable cycles;
the IIA open contribution corresponding to open
discs is expressed in terms of integrals over chains
and the Abel-Jacobi map. We only remark that this
large-N duality has also been interpreted as a duality
between seven-dimensional manifolds with $G_2$
holonomy.


History 


The chronology of various important contributions 
in the field of large-N duality is as follows: 


• 1974: 't Hooft's conjecture

• 1988: Clemens introduces transitions

• 1988: Witten introduces quantum Chern-Simons
theory on 3-manifolds

• 1992: Witten discusses Chern-Simons theory as
open string

• 1998: Gopakumar-Vafa-Ooguri

• 2001: Verification for unknot, Katz-Liu, Li-Song

• 2001: Lift to manifolds with $G_2$ holonomy

• 2002: The conjecture verified for many examples
of conifold transitions, including compact case;
the topological vertex is introduced

• 2003: Relations with Donaldson-Thomas invariants


Background 


The varieties of interest in the physical theory
must satisfy certain “supersymmetry” conditions; in
particular, a complex algebraic manifold is required
to be Calabi-Yau, and a real seven-dimensional Riemannian
manifold is required to have $G_2$ holonomy
group. Also of particular interest are the Lagrangian
real submanifolds of the Calabi-Yau 3-folds. By a
Calabi-Yau manifold X we mean a manifold with
$c_1(X) = 0$, $h^0(\Omega_X^k) = 0$, where $\Omega^k$ is the sheaf of
holomorphic k-forms, and $0 < k < \dim(X)$. If
$\dim X \geq 2$, we also assume that X is simply
connected, but not necessarily compact. For example,
if $\dim(X) = 1$, X is a torus; if $\dim(X) = 2$, X is
a K3 surface; if $\dim(X) \geq 3$, X is simply called a
Calabi-Yau manifold. A compact Kähler manifold
(M, g, J) of complex dimension $m \geq 3$ is a Calabi-Yau
variety if and only if its holonomy is SU(m). A
subvariety L of a symplectic manifold $(X, \omega)$ is
Lagrangian if $\omega|_L = 0$ and $\dim L = (1/2) \dim X$.
Sometimes we consider noncompact manifolds,
thought of as neighborhoods of a compact projective
Calabi-Yau manifold. Typically, our symplectic
manifold is a Calabi-Yau 3-fold $(X, \omega)$ together
with its Kähler form $\omega$. If there exists an antiholomorphic
involution, then the fixed locus is a
Lagrangian submanifold.


The Dualities 


We will take the point of view that dualities in 
physics imply relations between geometric invari- 
ants, without dwelling on the physics of the dualities 
themselves. A consequence of a physics “duality” is 
the matching of the prepotential of two dual string 
theories. 


A Few Comments on Chern-Simons Theory: Free 
Energy (Partition Function) 


Let L be a closed oriented manifold together with a
principal G-bundle. The classical Chern-Simons
action is defined as $S(L, A) = \int_L \alpha(A)$, where $\alpha$ is a
3-form on L which depends on a connection A and a
suitable bilinear invariant form on the Lie algebra $\mathfrak{g}$.
It is well defined under gauge transformations
modulo the integers; $e^{2\pi i S(L, A)}$ is well defined. In
the large-N dualities considered here, the groups of
interest are SU(N) and U(N). The first check of the
duality was found with G = SU(N) and $M = S^3$; later
it was discovered that the correct group for the
matching of the observables must be U(N), while
both can be used for the free energies. We shall
consider G = SU(N) and $M = S^3$. Without loss of
generality, the bundle can be taken to be the product
$SU(N) \times S^3$; any bilinear invariant form on the Lie
algebra $\mathfrak{su}(N)$ is necessarily an integer multiple k of
the Cartan-Killing form on the Lie algebra. Then
S = S(k, A) and

$$S(k, A) = \frac{k}{8\pi^2} \int_{S^3} \mathrm{tr}\left( A \wedge dA + \tfrac{2}{3}\, A \wedge A \wedge A \right)$$

where k is the “level” of the theory. Witten defines
the quantum Chern-Simons theory by taking the
integral of the exponentiated Chern-Simons action over all
possible connections $\mathcal{A}$ modulo gauge equivalence $\mathcal{G}$:

$$Z(S^3, SU(N)) = \int_{\mathcal{A}/\mathcal{G}} (DA)\, e^{2\pi i S(A)} = \int_{\mathcal{A}/\mathcal{G}} (DA)\, \exp\left( \frac{ik}{4\pi} \int_{S^3} \mathrm{tr}\left( A \wedge dA + \tfrac{2}{3}\, A \wedge A \wedge A \right) \right)$$

Witten shows how to calculate the free energy
$Z(S^3, SU(N))$ through topological surgery, assuming
$Z(S^2 \times S^1) = 1$. Witten also defines the partition
function of knots and links in L (the “expectation
values”), which are knot and link invariants. The
expectation values are computed by evaluating the
trace of the holonomy transformation of a U(N)
connection around the knot, and then taking a
suitable average over the U(N) connections. These
invariants depend on a choice of the framing of the
knot (or link).

The explicit computations involve physics, representation
theory, and topology. If $L = S^3$, then:

$$Z(S^3, SU(N)) = (k+N)^{-N/2} \sqrt{\frac{k+N}{N}}\; \prod_{j=1}^{N-1} \left\{ 2 \sin\left( \frac{j\pi}{k+N} \right) \right\}^{N-j}$$

Reshetikhin and Turaev, among others, described
mathematically the Chern-Simons free energy and
the expectation values.


A Few Comments on Closed-String Theory: Free 
Energy (Prepotential) 


In IIA closed-string theory on X, a Calabi-Yau
manifold, one considers holomorphic stable maps
of closed Riemann surfaces of genus g, $\phi: \Sigma_g \to X$,
with $\phi_*[\Sigma_g] = \beta \in H_2(X, Z)$, for all genera g and
homology classes $\beta \in H_2(X, Z)$.

Then one forms the closed prepotential $F_{cl}(X, t)$,
which encodes the enumerative invariants of
such maps to X, and which depends on the
Kähler parameters t of X. Sometimes the prepotential
is also called “free energy” in the physics
literature or Gromov-Witten prepotential, as it
contains the Gromov-Witten invariants of X.
Setting $F_g(q) = \sum_{\beta \in H_2(X, Z)} C_{g,\beta}\, q^{\beta}$, the closed
prepotential is defined as

$$F_{cl}(X, q) = \sum_{g \geq 0} g_s^{2g-2} F_g(q)$$

Here q is a formal variable such that $q^{\beta_1 + \beta_2} = q^{\beta_1} \cdot q^{\beta_2}$
(for $\beta_1, \beta_2 \in H_2(X, Z)$) and $g_s$ is the string coupling
constant. $C_{g,\beta}$ are the genus g Gromov-Witten
invariants of X, corresponding to the class $\beta$, and
they have been defined as

$$C_{g,\beta} = \int_{[\overline{M}_g(X, \beta)]^{\mathrm{vir}}} 1$$

It is difficult to explicitly compute the invariants
$C_{g,\beta}$; in particular, there is no known general
method for calculating these invariants. They are
computed mostly via “localization” methods, in the
presence of a suitable torus action. In the case of
g = 0 the invariants are often computed via IIA-IIB
duality, calculating certain periods in the mirror
manifold W.


Example (Faber-Pandharipande). Let $X = O_{P^1}(-1) \oplus O_{P^1}(-1)$;
X is a neighborhood of a rigid
rational curve, which can be thought of as a local
Calabi-Yau manifold; then all the effective curves
$\beta \in H_2(X, Z)$ must be of the form $\beta = d[P^1]$, $\forall d \in N$.
Faber and Pandharipande showed that

$$F_{cl}(X, q) = \sum_{d=1}^{\infty} \frac{1}{d}\, \frac{q^d}{\bigl( 2 \sin(d g_s / 2) \bigr)^2} \qquad [1]$$

This formula was proved with localization methods
after it was conjectured by Gopakumar and Vafa using
large-N dualities. In fact, a consequence of a duality
between two theories is the matching of the free energies
of two dual string theories. In this particular case, the
conjectures imply that Chern-Simons free energy
determines, and is determined by, the all-genus closed
prepotential of a suitable Calabi-Yau manifold X:

$$Z(S^3, U(N)) = F_{cl}(X, t)$$

Note that the left-hand side is purely topological,
as we saw in the previous section, while the right-hand
side is holomorphic.
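
To make formula [1] concrete, its first few terms in powers of $g_s$ can be written out. This is a worked illustration added here, not part of the original text; it uses only the Laurent series $(2\sin(x/2))^{-2} = x^{-2} + \tfrac{1}{12} + \tfrac{x^2}{240} + O(x^4)$ and reads [1] as $\sum_{d \geq 1} \frac{1}{d}\, q^d\, (2\sin(d g_s/2))^{-2}$:

```latex
\frac{1}{d}\,\frac{q^d}{\bigl(2\sin(d g_s/2)\bigr)^2}
  = q^d \left( \frac{g_s^{-2}}{d^3} + \frac{1}{12\,d} + \frac{d}{240}\, g_s^{2} + O(g_s^4) \right),
\qquad\text{so}\qquad
C_{0,\,d[P^1]} = \frac{1}{d^3}, \quad
C_{1,\,d[P^1]} = \frac{1}{12\,d}, \quad
C_{2,\,d[P^1]} = \frac{d}{240}.
```

The genus-zero value $1/d^3$ is the familiar multiple-cover contribution of a rigid rational curve.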

The trait d'union between the two prepotentials is 
given by the interpretation of Chern-Simons theory 
on $S^3$ as open-string theory on $T^*S^3$ and the
geometric transition. 


A Few Comments on Open-String Theory 
with Branes: Open Prepotential 


Let Y be a Calabi-Yau manifold together with $\cup L_i$,
Lagrangian submanifolds; to each submanifold
$L_i$ is assigned a gauge group $G_i$: $L_i$ is wrapped
with $G_i$-branes. Here we shall focus on the case
$G_i = U(N_i)$ and we will write $(Y; L_i, U(N_i))$.

Witten shows that the open prepotential
$F_{op}(Y, \lambda_i, t_{op}, g_s)$ depends on 't Hooft's coupling
constants $\lambda_i$ associated to Chern-Simons theory on the
Lagrangian submanifolds $(L_i, U(N_i))$, together with
the open Kähler parameters $t_{op} \in H_2(Y, \cup L_i; Z)$, and
the string coupling constant $g_s$. To describe the open
prepotential, Witten argues, we consider all maps
of Riemann surfaces with boundary to Y, with
the condition that the boundaries are mapped to the
Lagrangian submanifolds $L_i$; one should also include
all the “highly degenerate holomorphic maps,” in
particular those which contract the Riemann surface to a
“ribbon graph” on the Lagrangian $\cup L_i$. The contribution of
these highly degenerate maps is captured by the
quantum Chern-Simons theory of the Lagrangians
$(L_i, U(N_i))$.
Application 1 (Chern-Simons free energy as open
prepotential). Let us consider open IIA on
$Y = T^*S^3$ with U(N)-branes wrapped on $L = S^3$:
L is a Lagrangian submanifold with the standard
symplectic structure; note that in $T^*L$ there are no
nontrivial homology curves. Then, according to
Witten, the corresponding open prepotential
$F_{op}(Y, \cup L_i)$ must only depend on the “highly
degenerate” maps and must consist of the Chern-Simons
term $F_{CS}$ on $L = S^3$. In particular,

$$F_{CS} = \log Z(S^3) = F_{op}(Y, \lambda, g_s)$$

where $\lambda = 2\pi N/(k+N)$ is the 't Hooft coupling
constant. Periwal (1993) showed that, for large N,
$\log Z(S^3)$ could be expanded as a closed-string
expansion:

$$F_{CS}(\lambda) = \sum_{g \geq 0} F_g(\lambda)\, g_s^{2g-2}$$

where $g_s = 2\pi/(k+N)$ is the Chern-Simons coupling
constant. In 1998 Gopakumar and Vafa, using
physics arguments, deduced that the expansion
would have the closed form [1], which was later
proved by Faber and Pandharipande.


The explicit description of the open prepotential 
in the presence of homology classes is not known; 
one would need to combine the enumerative 
invariants of open maps together with the quantum 
Chern-Simons factor. We shall discuss an approach 
at the end of this note, but consider first the 
geometric transition. 


The Transition 


The conjecture says that U(N) Chern-Simons gauge
theory on $S^3$ is dual, for large values of N, to IIA
closed topological string theory on a suitable
Calabi-Yau manifold X. A starting point to find
such X is that $S^3$ is a Lagrangian 3-cycle in the
manifold $Y = T^*S^3$; performing a topological surgery
by replacing $S^3$ with $S^2$ one obtains a (local) Calabi-Yau
manifold X, on which the dual IIA theory is
compactified. The key observation is that Y can be
identified with the algebraic variety of equation
$\{xy - zw = t\} \subset C^4$ and that this is a complex
smoothing (in fact the Milnor fiber) of $Y_0$ with
equation $\{xy - zw = 0\} \subset C^4$. On the other hand, X
is a small resolution of this singularity, where $P^1$ is
the exceptional locus of the birational contraction.
The origin is an “ordinary double point” singularity
and the nontrivial sphere $S^3 \subset Y$ is the vanishing
cycle of the degeneration. The manifolds involved
are noncompact: the exceptional curve $[P^1] = t$ is
the only nontrivial homology class in X, and the
enumerative invariants in X can be thought of as the
contribution of the exceptional curve in a neighborhood
of a Calabi-Yau manifold. We shall present
the steps leading to this construction and the
evidence for the conjecture.


The Local Construction of X 
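Let $Y_\mu = \{ (w_1, \ldots, w_4) \in C^4$ such that $\sum_j w_j^2 = \mu \}$.


Proposition 1 Let $\mu$ be a nonzero real positive
parameter; then:


• $L = S^3 \subset T^*S^3$ is a Lagrangian submanifold of
$T^*S^3$ with its standard symplectic structure;
• $T^*S^3 \cong Y_\mu$ and $L \cong L_\mu \overset{\mathrm{def}}{=} \{ \sum_j \mathrm{Re}(w_j)^2 = \mu,\ \mathrm{Im}(w_j) = 0 \}$.


In fact, we can embed $T^*S^3$ in $R^8$ as

$$\sum_{j=1}^{4} q_j^2 = 1, \qquad \sum_{j=1}^{4} q_j p_j = 0$$

where $S^3 = \{ p_j = 0 \}$; consider then the morphism
$C^4 \to R^8$ defined by setting

$$q_j = \frac{\mathrm{Re}(w_j)}{\sqrt{\mu + \sum_k \mathrm{Im}(w_k)^2}}, \qquad p_j = \mathrm{Im}(w_j)$$

which induces the diffeomorphism $Y_\mu \cong T^*S^3$ of the
statement.


Remark 1 Let $Y_0 = \{ \sum_j w_j^2 = 0 \} \subset C^4$; then:


• $Y_0$ is singular at the origin,
• $Y_\mu$ is a complex deformation of $Y_0$, and
• $L_\mu$ is called a “vanishing cycle.”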
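
The identification of $Y_\mu$ with $T^*S^3$ can be spot-checked numerically. The sketch below is an illustration added here, not part of the original: it assumes the map is $q_j = \mathrm{Re}(w_j)/\sqrt{\mu + \sum_k \mathrm{Im}(w_k)^2}$, $p_j = \mathrm{Im}(w_j)$, and confirms on sample points of $Y_\mu$ that the image satisfies $\sum_j q_j^2 = 1$ and $\sum_j q_j p_j = 0$. The function names and the sample points are hypothetical.

```python
import math


def to_cotangent_bundle(w, mu):
    """Send w in C^4 with sum_j w_j^2 = mu (mu > 0) to (q, p) in R^4 x R^4."""
    if mu <= 0:
        raise ValueError("mu must be a positive real number")
    if abs(sum(wj * wj for wj in w) - mu) > 1e-9:
        raise ValueError("w does not lie on Y_mu")
    # Writing w = a + i b, the equation sum w_j^2 = mu says |a|^2 - |b|^2 = mu
    # and a.b = 0, so |a|^2 = mu + |b|^2 is the squared normalization below.
    norm = math.sqrt(mu + sum(wj.imag ** 2 for wj in w))
    q = [wj.real / norm for wj in w]
    p = [wj.imag for wj in w]
    return q, p


def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    # Sample point: a = (3, 4, 0, 0), b = (-3.2, 2.4, 0, 0), so |a|^2 - |b|^2 = 9.
    w = [complex(3, -3.2), complex(4, 2.4), 0j, 0j]
    q, p = to_cotangent_bundle(w, mu=9.0)
    print("q.q =", dot(q, q), " q.p =", dot(q, p))
```

Points with vanishing imaginary part land on the zero section $\{p_j = 0\}$, which is the vanishing cycle $L_\mu$.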


With a change of coordinates we can write the
equation of $Y_0$ as $\{xy - zw = 0\}$; the singularity is
still at the origin. This singularity is an ordinary
double point, which is often referred to in the physics
literature as “the conifold singularity.” Let
$X \subset C^4 \times P^1$ be defined by:

$$\lambda z + \nu y = 0, \qquad \lambda x + \nu w = 0, \qquad [\lambda, \nu] \in P^1.$$

Remark 2 X is smooth and the morphism

$$\phi: X \to Y_0, \qquad ((x, y, z, w), [\lambda, \nu]) \mapsto (x, y, z, w)$$

restricts to an isomorphism $\phi: (X \setminus P^1) \cong (Y_0 \setminus \{0\})$ and
maps $P^1$ to the origin $(0,0,0,0) \in C^4$. $\phi$ is a small (nondivisorial)
birational resolution of the singularity at the origin.
$Y_\mu$ is a deformation (smoothing) of $Y_0$. Note that
topologically $S^3 = L_\mu \subset Y_\mu$ has been replaced by
$P^1 \simeq S^2 \subset X$. The algebraic properties of the topological
surgery between $Y_\mu$ and X were first studied
by Clemens in 1988.
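
The behavior of the resolution map over the singular point can be made explicit with a small computation. The sketch below is an illustration added here, under the assumption that X is cut out by $\lambda z + \nu y = 0$, $\lambda x + \nu w = 0$ as above: for a point of $Y_0$ with rational coordinates it returns the unique $[\lambda : \nu]$ lying over it, or reports that the whole $P^1$ lies over the origin. The function name and the return convention are hypothetical.

```python
from fractions import Fraction


def resolution_fiber(x, y, z, w):
    """Fiber of the projection X -> Y_0 over the point (x, y, z, w).

    Returns a normalized representative (lam, nu) of the unique point
    [lam : nu] in P^1, or the string "P1" when every [lam : nu] solves
    the equations (this happens only over the origin).
    """
    x, y, z, w = (Fraction(v) for v in (x, y, z, w))
    if x * y - z * w != 0:
        raise ValueError("point does not lie on the conifold xy - zw = 0")
    # The defining equations read a*lam + b*nu = 0 for each row (a, b) below.
    rows = [(z, y), (x, w)]
    nonzero = [r for r in rows if r != (0, 0)]
    if not nonzero:
        return "P1"
    a, b = nonzero[0]
    lam, nu = -b, a  # spans the kernel of the row (a, b)
    # On Y_0 the 2x2 determinant zw - xy vanishes, so the other row holds too.
    assert all(p * lam + q * nu == 0 for p, q in rows)
    if lam != 0:
        return (Fraction(1), nu / lam)
    return (Fraction(0), Fraction(1))


if __name__ == "__main__":
    print(resolution_fiber(2, 3, 1, 6))   # a smooth point of Y_0
    print(resolution_fiber(0, 0, 0, 0))   # the singular point
```

Away from the origin the two linear equations have rank one, which is why the fiber is a single point; at the origin both equations are vacuous, and the exceptional $P^1$ appears.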


Transitions in Geometry 


A transition between X and Y is a birational
contraction from a smooth Calabi-Yau X to a
singular variety $Y_0$ followed by a complex deformation
to another smooth Calabi-Yau manifold Y:

$$\begin{array}{ccc} & & X \\ & & \downarrow \\ Y & \rightsquigarrow & Y_0 \end{array}$$

The vanishing cycles of the complex deformation
$\cup L_i$ are always Lagrangian submanifolds of Y. The
transition makes sense if $\dim(X) = \dim(Y) \geq 2$ and
it is nontrivial if $\dim(X) = \dim(Y) \geq 3$, when the
topology of X is different from the topology of Y.
The possible transitions among Calabi-Yau 3-folds
have been classified.


Conjecture 1 Let X and Y be Calabi-Yau mani- 
folds related by a geometric transition: then IIA 
open theory with $U(N_i)$ branes compactified on
$(Y, \cup L_i)$ is dual to IIA closed theory compactified
on X (with fluxes). 


As a consequence: 


Conjecture 2 Let X and Y be Calabi-Yau mani- 
folds related by a geometric transition: then 
$F_{op}(Y, \lambda_i, g_s, t_{op}) = F_{cl}(X, q, g_s)$ for a suitable identification
of the parameters.


The results stated in the previous section can be
summarized in the following statement, which is
the proof of the above conjecture for the special case
of a local conifold transition:


Theorem 1 Let $X = O_{P^1}(-1) \oplus O_{P^1}(-1)$ and
$Y = T^*S^3$ with U(N) branes wrapped on $L = S^3$.
Then X and Y are related by a conifold transition
and $F_{CS}(S^3) = F_{op}(Y, \lambda) = F_{cl}(X, \lambda)$, with the
identification

$$\lambda = \frac{2\pi N}{k+N}, \qquad g_s = \frac{2\pi}{k+N}$$

This matching of the free energies is supporting 
evidence for the large-N conjecture. At this moment, 
we still do not know if Conjectures 1 and 2 hold for 
more general transitions. 


A Few Comments on Knots and Links 


Later, Ooguri and Vafa extended the conjecture to
the observables, that is, by adding knots and links in
$S^3$; the guiding principle is that a knot (or link) $C \subset S^3$
should determine a noncompact Lagrangian submanifold
$L_C \subset X$; it is conjectured that the knot
(and link) invariants, expressed as expectation
values, should determine and be determined by the
enumerative invariants of morphisms of bounded
Riemann surfaces, with boundaries mapped onto
$L_C$. We refer to these invariants as open Gromov-Witten
invariants. While both statements have been
verified with mathematical techniques only when C
is the unknot, there is much supporting evidence for
the conjecture in general. We will not describe these 
aspects here but only make a few remarks. 

The expectation values of a knot C are computed 
by taking first the trace of a holonomy matrix of a 
U(N) connection A along C and then integrating over 
all connections (modulo gauge equivalence). As for 
the case of the Chern-Simons free energy, the 
definition of expectation values has been worked 
out both in the realm of physics and of mathematics. 
The expectation values are knot and link invariants, and depend on a choice of the framing of the knot (or link). The open Gromov-Witten invariants have not yet been constructed, as we shall discuss in the following section; however, starting with the work of Katz and Liu and of Li and Song, open invariants have been successfully calculated in the presence of a torus action. The resulting invariants do depend on the choice of the torus action, which has been shown to match the choice of the framing of the knot (or link).


More on the Open Prepotential 


The open Gromov-Witten invariants, in analogy with the closed case, should “count” in an appropriate sense open morphisms; at this point, it is not known how to define this quantity. To proceed in analogy with the
closed case, one would need to define the appropriate 
moduli space of open maps and its virtual fundamental 
class. On the other hand, open invariants have been 
successfully calculated in the presence of a torus 
action, assuming the existence of the moduli and 
virtual fundamental class and that the Atiyah-Bott 
localization theorems can be applied. We shall follow 
this approach in sketching how the IIA prepotential 
has been computed in many examples. 


Open Invariants 


Let $[\beta] \in H_2(Y, \cup L_i; \mathbb{Z})$ be the relative homology class of Riemann surfaces in Y with boundary on the union of the Lagrangian 3-cycles $\cup_i L_i$, and let $[\gamma] \in H_1(\cup L_i)$ be a class.

If $\Sigma_{g,h}$ is a Riemann surface of genus g with h boundary components, let $\phi: \Sigma_{g,h} \to Y$ be a morphism with


\[ \phi_*[\Sigma_{g,h}] = [\beta] \in H_2(Y, \cup L_i; \mathbb{Z}) \]
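The open generating function is


\[ F(Y, \cup L_i, t_{op}, g_s) = \sum_{g,h \geq 0} g_s^{2g-2+h}\, F_{g,h}(t_{op}) \]


with

\[ F_{g,h}(t_{op}) = \sum_{\beta,\gamma} C_{g,h,\beta,\gamma}\, q^\beta y^\gamma \]


Here q and y are formal variables such that $q^{\beta_1+\beta_2} = q^{\beta_1} \cdot q^{\beta_2}$ and $y^{\gamma_1+\gamma_2} = y^{\gamma_1} \cdot y^{\gamma_2}$, for $\beta_1, \beta_2 \in H_2(Y, \cup L_i; \mathbb{Z})$ and $\gamma_1, \gamma_2 \in H_1(\cup L_i, \mathbb{Z})$; $t_{op}$ is the open Kähler parameter, $g_s$ is the string coupling constant and $C_{g,h,\beta,\gamma}$ should “count” in an appropriate sense the maps $\phi$.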


Example (Ooguri-Vafa; Katz-Liu; Li-Song). If $Y = \mathcal{O}_{\mathbb{P}^1}(-1) \oplus \mathcal{O}_{\mathbb{P}^1}(-1)$, then t is the class of the $\mathbb{P}^1 = S^2$, and t/2 represents the class of the lower hemisphere in $S^2$. The Lagrangian L is the Lagrangian $\mathcal{L}$ in the previous sections, which corresponds to the unknot in $S^3 \subset Y$; it is the fixed locus of an antiholomorphic involution on X and it intersects $S^2$ in an equator. Then, for a suitable choice of the torus action:


F(Y, UL;, top, Es) -dt/2 


23 a EE 
7 2dsin(dA/2) 
There is a complete form for more general torus actions. The above formula was first computed by Ooguri and Vafa, using string-theory arguments, and then computed by the mathematicians Katz and Liu, and Li and Song.


More on the Open IIA Prepotential 


If there is only one rigid open curve in Y, say a disk C, with boundary on $L \subset Y$, then, as Witten showed, the open prepotential is a combination of the open enumerative invariants as described above with $\beta = d[C]$ and $\gamma = \partial C$ and the expectation values of the unknot $\partial C$. The variable y is changed into the trace of the holonomy of a connection.

In the presence of a torus action, one can treat the 
fixed locus as if it were rigid and proceed accordingly. 


With these techniques, Conjecture 2 has been verified for many cases of conifold transitions, with $t_{op}$ nontrivial, for a suitable identification of the parameters, including when both X and Y are compact manifolds (Diaconescu-Florea 2003).


See also: AdS/CFT Correspondence; Chern-Simons 
Models: Rigorous Results; Large-N and Topological 
Strings; Mirror Symmetry: A Geometric Survey; String 
Field Theory. 
Introduction


As a prototype of lattice gauge theory, quantum 
chromodynamics (QCD) will be considered in this 
article. All statements about QCD can easily be 
extended to other theories, with different gauge 
group and different content of particles. 

QCD is a gauge theory with gauge group SU(3) 
(color group), coupled to spin-1/2 particles (quarks) 
belonging to the fundamental representation of the 
color group. There exist in Nature six different species (flavors) of quarks, with masses ranging from $m_{\mathrm{up}} \approx 5$ MeV to $m_{\mathrm{top}} \approx 180$ GeV: the values of these masses are determined by other interactions and can be treated as input parameters of the theory, as can the number of quark flavors. In standard notation, the Lagrangian reads


\[ \mathcal{L} = -\frac{1}{2}\,\mathrm{tr}(G_{\mu\nu}G_{\mu\nu}) + \sum_f \bar\psi_f\,(i\not{\!\!D} - m_f)\,\psi_f \qquad [1] \]


The sum runs over the six quark flavors f. $G_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + ig[A_\mu, A_\nu]$ is the field strength tensor, $A_\mu = \sum_a T^a A^a_\mu$ the (gluon) gauge field, $T^a$ $(a = 1, \ldots, 8)$ are the eight generators of the gauge group in the fundamental representation, normalized as $\mathrm{tr}(T^aT^b) = (1/2)\delta^{ab}$. $\psi_f$ is a color triplet of fields. Under a gauge transformation U(x),


\[ \psi_f(x) \to \psi'_f(x) = U(x)\psi_f(x) \qquad [2] \]


\[ A_\mu(x) \to A'_\mu(x) = U(x)A_\mu(x)U^\dagger(x) + iU(x)\partial_\mu U^\dagger(x) \qquad [3] \]


$D_\mu$ is the covariant derivative of $\psi$


and transforms like $\psi$ by construction.

$\mathcal{L}$ is invariant under the gauge transformations of eqns [2] and [3]. As a consequence of gauge
invariance, the theory has one single coupling 
constant g. 

To make connection with the observations, one 
has to solve the theory, that is, one has to construct 
a Hilbert space on which the fields act as operators 
obeying the equations of motion and the canonical 
commutation relations. In textbook field theory, 
this is done by splitting the Lagrangian $\mathcal{L}$ into two parts:


\[ \mathcal{L} = \mathcal{L}_0 + \mathcal{L}_I \qquad [5] \]


with $\mathcal{L}_0$ the part of $\mathcal{L}$ which is bilinear in the fields and $\mathcal{L}_I$ the rest. $\mathcal{L}_0$ can be solved exactly since it describes free particles and the corresponding equations of motion are linear. The resulting Hilbert space is the Fock space of free particles. $\mathcal{L}_I$ is treated as a perturbation producing scattering between the fundamental particles. This approach works well
in quantum electrodynamics, where the observed 
particles (electrons and photons) coincide with the 
excitations of the fundamental fields of the 
Lagrangian. 

In QCD, the fundamental excitations (the quarks 
and the gluons) are observed as particles neither in 
Nature nor as a product of high-energy collisions 
between elementary particles. This feature is known 
as confinement of color. The conjecture is that 
excitations with nontrivial color are forbidden to 
propagate as free particles. However, if hadrons are 
probed at short distances by photons or by leptons, 
everything works as if they were composite states of 
quarks. The accepted explanation relies on asymp- 
totic freedom: the effective coupling constant 
becomes small at short distances (high momentum 
transfers) and the constituents behave as free 
particles. 

At large distances, the fundamental excitations are 
not observed, the interaction is strong and the 
perturbative picture describing scattering between 
quarks and gluons is not adequate for the real 
world. 

An alternative quantization procedure is needed 
which does not rely on perturbation theory. A 
formally exact quantization procedure is the Feynman 
path integral. The solution of the theory is given in 
terms of a functional integral Z[J], which generates the correlators of the fields in the ground state (vacuum). Indicating symbolically the Lagrangian coordinates, namely the fields, by a single symbol $\Phi$, one has


\[ Z[J] = \int \prod_x d\Phi(x)\, \exp\left(-S[\Phi] + \int d^4x\, J(x)\Phi(x)\right) \qquad [6] \]


The connected Euclidean vacuum correlators are given in terms of functional derivatives of Z[J]


\[ \langle 0|T(\Phi(x_1)\Phi(x_2)\cdots\Phi(x_n))|0\rangle_{\mathrm{conn}} = \frac{1}{Z[0]}\,\frac{\delta^n Z[J]}{\delta J(x_1)\,\delta J(x_2)\cdots\delta J(x_n)}\bigg|_{J(x)=0} \qquad [7] \]
"Euclidean" means that they are analytic continuations to imaginary times. Going to Euclidean
system is necessary to isolate the vacuum state. The 
amplitudes can be analytically continued back to 
Minkowski space. The Hilbert space and all the 
physical observables can be constructed in terms of 
the correlators, a property known as reconstruction 
theorem. Formally (i.e., assuming that everything 
makes sense only if the functional integral exists), 


\[ \langle 0|T(\Phi(x_1)\Phi(x_2)\cdots\Phi(x_n))|0\rangle_{\mathrm{conn}} = \frac{1}{Z}\int \prod_x d\Phi(x)\, \exp(-S[\Phi])\,\Phi(x_1)\Phi(x_2)\cdots\Phi(x_n) \qquad [8] \]

The continuation to imaginary time changes the sign of the kinetic energy, and Z formally becomes the partition function of a four-dimensional statistical model with Hamiltonian S (eqn [6]), a general fact in Feynman integrals.

By definition of functional integral, Z is defined 
by discretizing a finite volume V of spacetime to a 
finite set of points and then sending their number to 
infinity, making a set dense in V. If the limit exists, a partition function $Z_V$ is obtained. The volume V is then sent to infinity, to cover the whole spacetime (thermodynamical limit) and $Z_V$ eventually converges to Z. A rigorous
proof of the existence of these limits does not exist 
for QCD, but there are qualitative arguments that 
this is the case, which will be presented below. 

In the lattice formulation of field theory, a regular 
lattice, usually cubic, is taken as a discretization of 
spacetime. 

From the very definition of Feynman integral, it 
follows that the formulation of field theory on the 
lattice is nothing but an approximation to the limit 
which defines Z. It will provide a good approxima- 
tion if the lattice spacing is small enough with 
respect to the physical lengths involved and if the 
lattice is large compared to them. 

Perturbation theory amounts to splitting the action into a bilinear term $S_0$ and an interaction term $S_I$ containing the higher powers of the fields. The Z integral is then computed by expanding the weight in a power series of $S_I$:


\[ \int \prod_x d\Phi(x)\, \exp(-S_0 - S_I) = \int \prod_x d\Phi(x)\, \exp(-S_0) \sum_n \frac{(-S_I)^n}{n!} \qquad [9] \]


The Feynman integral thus becomes Gaussian, can be computed, and gives the usual perturbative expansion. The two limits (integral and series expansion) do not commute in general. For QCD, there are indeed arguments that the renormalized perturbative expansion does not converge and is plagued by singularities known as renormalons.


Wilson's Formulation 


For field theories of scalar particles, the lattice 
discretization is performed by assigning a value of 
the field to each site of the lattice. The Wilson 
formulation for gauge theories is not made in terms of 
the fields $A_\mu$, which are defined in the Lie algebra of the gauge group, but in terms of parallel transports, which are elements of the group itself. The building blocks are parallel transports along links parallel to spacetime axes connecting neighboring sites


\[ U_\mu(x) = P\exp\left(ig\int_x^{x+\hat\mu} A_\mu(x')\,dx'_\mu\right) \approx \exp(igaA_\mu(x)) \qquad [10] \]

where $\hat\mu$ indicates the vector of length a in the $\mu$ direction and P the ordered product. The last approximate equality is valid in the limit of small lattice spacing a. g is the coupling constant. Under a gauge transformation V(x),


\[ U_\mu(x) \to V(x)\,U_\mu(x)\,V^\dagger(x+\hat\mu) \qquad [11] \]


It follows from eqn [11] that the trace of the parallel transport along a closed path is gauge invariant. The density of action can be written in terms of the parallel transport along the elementary square of links in the $\mu\nu$ hyperplane, known as a plaquette:


\[ \Pi_{\mu\nu}(x) = \mathrm{tr}\left[U_\mu(x)\,U_\nu(x+\hat\mu)\,U^\dagger_\mu(x+\hat\nu)\,U^\dagger_\nu(x)\right] \qquad [12] \]


By expanding in powers of a, one easily finds


\[ \Pi_{\mu\nu} = N_c - \frac{g^2a^4}{2}\,\mathrm{tr}[G_{\mu\nu}G_{\mu\nu}] + O(a^6) \qquad [13] \]


with $N_c$ the number of colors, 3 for QCD. The lattice action can be defined as


\[ S = \beta \sum_{x,\,\mu<\nu}\left(1 - \frac{1}{N_c}\,\mathrm{Re}\,\Pi_{\mu\nu}(x)\right) \qquad [14] \]


with $\beta = 2N_c/g^2$, and tends to the continuum action as $a \to 0$, up to corrections $O(a^2)$. An infinite number of higher-order terms in a exist, which come from the expansion of the links, but they are expected to be irrelevant in the continuum limit $a \to 0$.
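
The gauge invariance of the plaquette trace under eqn [11] can be checked numerically. The following is a minimal sketch, not taken from the original article: it assumes a small two-dimensional periodic lattice and the group SU(2) rather than the SU(3) of QCD, and the function names (`random_su2`, `plaquette_trace`, `gauge_transform`, `wilson_action`) are hypothetical helpers introduced only for illustration.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix a0 + i(a1*s1 + a2*s2 + a3*s3) from a unit quaternion."""
    a = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in a))
    a0, a1, a2, a3 = (c / norm for c in a)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def plaquette_trace(links, x, y, size):
    """tr[U_0(x) U_1(x+0) U_0^dagger(x+1) U_1^dagger(x)] with periodic boundaries."""
    xp, yp = (x + 1) % size, (y + 1) % size
    loop = matmul(links[0][x][y], links[1][xp][y])
    loop = matmul(loop, dagger(links[0][x][yp]))
    loop = matmul(loop, dagger(links[1][x][y]))
    return loop[0][0] + loop[1][1]


def gauge_transform(links, gauge, size):
    """Apply U_mu(x) -> V(x) U_mu(x) V^dagger(x + mu) to every link."""
    new = [[[None] * size for _ in range(size)] for _ in range(2)]
    for x in range(size):
        for y in range(size):
            xp, yp = (x + 1) % size, (y + 1) % size
            v = gauge[x][y]
            new[0][x][y] = matmul(matmul(v, links[0][x][y]), dagger(gauge[xp][y]))
            new[1][x][y] = matmul(matmul(v, links[1][x][y]), dagger(gauge[x][yp]))
    return new


def wilson_action(links, beta, size, n_colors=2):
    """beta * sum over plaquettes of (1 - Re tr(plaquette) / N_c)."""
    return beta * sum(1.0 - plaquette_trace(links, x, y, size).real / n_colors
                      for x in range(size) for y in range(size))
```

Because each plaquette transforms as $V(x)\,\Pi\,V^\dagger(x)$, its trace, and hence the action, should agree before and after the transformation up to rounding errors.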

The measure of the Feynman integral is assumed 
to be the Haar measure of the gauge group for each 
link, which again can be shown to tend to the 
continuum measure in the continuum limit. 


Everything is gauge invariant, contrary to the 
perturbative formulation, where a gauge fixing is 
required to define the vector meson propagator. 

By Weierstrass theorem, the integral is finite for any 
finite number of links, the gauge group being compact. 

Any other choice of the lattice action differing from 
the Wilson action of eqn [14] by terms of higher order in 
a will have the same continuum limit: there is significant 
freedom in the choice of the action. 

In the language of statistical mechanics, the 
Euclidean lattice formulation is a spin model. 
Different choices of the action correspond to different 
spin models. In the vicinity of a second-order phase 
transition, however, the correlation length becomes 
large with respect to the lattice spacing and all the 
irrelevant terms become negligible. All the spin 
models at the critical point belong to the same 
universality class and define the same field theory. 

This is what happens for QCD because of asymptotic freedom. By renormalization group arguments, the lattice spacing behaves as


\[ a(\beta) \simeq \frac{1}{\Lambda}\exp\left(-\frac{\beta}{4N_cb_0}\right) \qquad [15] \]


at sufficiently large $\beta$, where $-b_0$ is the coefficient of the lowest-order term of the $\beta$-function, $b_0$ is positive and $\Lambda$ is a physical scale. As $\beta \to \infty$, a tends exponentially to zero in physical units and the coarse structure of the lattice becomes unimportant, indicating that the short-distance limit in the definition of the Feynman integral exists. The theory also develops a mass scale $\Lambda$ which ensures the existence of a finite correlation length and hence of the thermodynamical limit. In practice, when $\beta$ is increased, the lattice spacing becomes exponentially small in physical units. As a consequence, however, the physical scale becomes exponentially large in lattice units, and an exponentially large lattice is needed to ensure the large-distance convergence. This makes life difficult if the Feynman integral has to be computed numerically.
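
To get a feeling for how quickly the required lattice grows, one can evaluate the lowest-order scaling law numerically. The sketch below is an illustration under stated assumptions: it takes the one-loop relation $a\Lambda = \exp(-1/(2b_0g^2))$ that follows from a $\beta$-function $-b_0g^3$, uses the pure-gauge value $b_0 = (11/3)N_c/16\pi^2$ and $g^2 = 2N_c/\beta$, and ignores higher-order corrections. The function names are hypothetical.

```python
import math


def b0_pure_gauge(n_colors):
    """Lowest-order beta-function coefficient for pure SU(N_c) gauge theory."""
    return (11.0 / 3.0) * n_colors / (16.0 * math.pi ** 2)


def lattice_spacing_times_lambda(beta, n_colors=3):
    """a(beta) * Lambda at lowest order, with g^2 = 2 N_c / beta."""
    g_squared = 2.0 * n_colors / beta
    return math.exp(-1.0 / (2.0 * b0_pure_gauge(n_colors) * g_squared))


def sites_per_direction(beta, length_times_lambda, n_colors=3):
    """Number of lattice sites needed to span a fixed physical length."""
    return length_times_lambda / lattice_spacing_times_lambda(beta, n_colors)


if __name__ == "__main__":
    for beta in (5.7, 6.0, 6.3, 6.6):
        print(beta, lattice_spacing_times_lambda(beta))
```

In this approximation every fixed increment of $\beta$ shrinks the lattice spacing by the same factor, so the number of sites per direction needed to cover a fixed physical length grows exponentially with $\beta$.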


Quarks 


Fermion fields are defined on lattice sites. The naive lattice transcription of the fermion term in eqn [1] consists in replacing the covariant derivatives by finite differences with parallel transports to make the result gauge covariant. In principle, $D_\mu\psi(x) = U_\mu(x)\psi(x+\hat\mu) - \psi(x)$ is a correct definition. In practice, a more symmetric difference is used which is correct to $O(a^2)$, namely


\[ D_\mu\psi(x) = \frac{1}{2}\left[U_\mu(x)\psi(x+\hat\mu) - U^\dagger_\mu(x-\hat\mu)\psi(x-\hat\mu)\right] \qquad [16] \]
The fermionic Lagrangian then reads

\[ L_F = \sum_x \bar\psi(x)[i\not{\!\!D} - m]\psi(x) = \sum_{x,x',\alpha,\alpha'} \bar\psi_\alpha(x)\,M_{\alpha x,\alpha'x'}\,\psi_{\alpha'}(x') \qquad [17] \]


It is convenient to indicate this expression in the form $S_F = \bar\psi M\psi$, where $\psi$ is a large column whose elements are labeled by the site x and by the component $\alpha$. The functional integral over $\psi$ can explicitly be done by using the standard rules of integration on Grassmann variables, since the action is bilinear,


\[ Z = \int \prod_{x,\mu} dU_\mu(x)\, d\bar\psi(x)\, d\psi(x)\, \exp(-S_g[U] - \bar\psi M\psi) \qquad [18] \]


The result is


\[ Z = \int \prod_{x,\mu} dU_\mu(x)\, \exp(-S_g[U]) \det M \qquad [19] \]


The effect of fermions is to multiply the weight by a functional determinant which depends on the gauge field configuration.

A problem exists, however, in this procedure already at the level of free fermions, that is, putting U = 1 in the action and in the determinant of eqn [18]. The equation of motion reads, in Fourier transform,


\[ \left[\sum_\mu \gamma_\mu \sin\left(\frac{2\pi k_\mu}{L}\right) - m\right] u_s(k) = 0 \qquad [20] \]


With respect to the continuum, the momentum $p_\mu = 2\pi k_\mu/L$ has been replaced by its sine. At small values of $p_\mu$, eqn [20] coincides with the Dirac equation. However, an alternative solution exists at $p_\mu \approx \pi$, for each $\mu$ independently. The new equation differs from the other by a change of sign of $\gamma_\mu$. Changing the sign of one of the gammas means changing the sign of $\gamma_5 = \gamma_1\gamma_2\gamma_3\gamma_4$, which is the chirality of the fermion. Instead of one fermion, we then have $2^4 = 16$ fermion species, organized in pairs with opposite chiralities. It is impossible to have a single fermion with a given chirality. A number of recipes have been proposed to circumvent this artifact of the lattice regularization, for example, introducing by hand a term in the action which removes the spurious particles in the limit of zero lattice spacing (Wilson's fermions), or doubling the lattice spacing by constructing two sublattices on even and odd sites, respectively, which propagate fermions of opposite chirality (staggered fermions), so that the argument of the sine in the derivative is doubled. More recently, an idea which goes back to Ginsparg and Wilson has been implemented, which consists in replacing a strictly local equation of motion like eqn [20] by an equation with the same continuum limit which is nonlocal, but with a nonlocality falling off exponentially at large distances, a recipe which makes propagation of chiral fermions possible. This is an important improvement, even if very demanding in computer power.
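
The counting of the spurious species can be made explicit with a few lines of code. This is only an illustrative sketch under simple assumptions: it enumerates the corners of the Brillouin zone where every $\sin p_\mu$ vanishes (massless free case), and assigns to each corner a chirality sign that flips once per momentum component sitting at $\pi$, as described above. The function names are hypothetical.

```python
import itertools
import math


def doubler_corners(dim=4):
    """Return the momenta p in {0, pi}^dim at which all sin(p_mu) vanish."""
    corners = itertools.product((0.0, math.pi), repeat=dim)
    return [p for p in corners if all(abs(math.sin(c)) < 1e-12 for c in p)]


def relative_chirality(corner):
    """Chirality sign relative to the p = 0 species: one flip per component at pi."""
    flips = sum(1 for c in corner if c > 0.0)
    return -1 if flips % 2 else 1


if __name__ == "__main__":
    corners = doubler_corners()
    print(len(corners), sum(relative_chirality(p) for p in corners))
```

In four dimensions the enumeration yields sixteen corners, and the chirality signs cancel in the sum, which is the statement that the species come in pairs of opposite chirality.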


Numerical Simulations 


Solving analytically the lattice version of QCD would allow one to follow constructively all the steps which lead to the definition of Z, that is, the ultraviolet and the infrared limit, as explained earlier. Presently that is out of reach. An attempt by Wilson to solve the lattice renormalization group equations by techniques of decimation was also not conclusive.

The problem can be attacked numerically. One 
way would be to compute the integral numerically. 
That is, however, prohibitive: it would be like 
solving exactly the equations of motion for the 
molecules of a gas. The lattice theory is in fact a four-dimensional statistical mechanics with the Boltzmann factor $\beta = 2N_c/g^2$ and Hamiltonian equal to the Euclidean action. As in statistical mechanics, the way out is to create a significant sample of configurations with weight $\exp(-S_g)$ and to determine the field correlators which describe physics by an average on this ensemble. This is done by Monte Carlo techniques.

The basic principle is to start from an arbitrary field configuration and make a sequence of random changes, normally on a single link at a time, with uniform probability in the group measure so as to converge toward the equilibrium distribution $\exp(-\beta S)$. For that purpose, the probability $P_{C \to C'}$ to change from a configuration C to another C' is constrained to obey the detailed balance relation

\[ P_{C\to C'}\exp(-\beta S[C]) = P_{C'\to C}\exp(-\beta S[C']) \qquad [21] \]

A common algorithm is known as Metropolis. The way to implement the condition (eqn [21]) is to accept the new trial configuration C' if $S[C'] \leq S[C]$, and to accept it with probability $\exp(-\beta[S(C') - S(C)])$ if $S[C'] > S[C]$. An alternative method is known as “heat-bath”. If the probability of the configuration for one link at a fixed value of the other variables is explicitly known, the change can be accepted with that probability.
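
The Metropolis rule can be sketched in a few lines for a toy model. The example below is an assumption-laden illustration, not the production algorithm for QCD: it uses a compact U(1) gauge theory on a two-dimensional periodic lattice, with link angles `theta[mu][x][y]`, action $S = \sum_P (1 - \cos\theta_P)$ and weight $\exp(-\beta S)$, and proposes symmetric random shifts of one link angle at a time. All names are hypothetical.

```python
import math
import random


def acceptance_probability(delta_s, beta):
    """Metropolis rule: always accept if S decreases, else with exp(-beta * dS)."""
    return 1.0 if delta_s <= 0.0 else math.exp(-beta * delta_s)


def plaquette_angle(theta, x, y, size):
    """Oriented sum of link angles around the plaquette with corner (x, y)."""
    xp, yp = (x + 1) % size, (y + 1) % size
    return theta[0][x][y] + theta[1][xp][y] - theta[0][x][yp] - theta[1][x][y]


def local_action(theta, mu, x, y, size):
    """Sum of (1 - cos) over the two plaquettes that contain link (mu, x, y)."""
    if mu == 0:
        corners = [(x, y), (x, (y - 1) % size)]
    else:
        corners = [(x, y), ((x - 1) % size, y)]
    return sum(1.0 - math.cos(plaquette_angle(theta, cx, cy, size))
               for cx, cy in corners)


def metropolis_sweep(theta, beta, step, rng):
    """Update every link once; return the fraction of accepted proposals."""
    size = len(theta[0])
    accepted = 0
    for mu in range(2):
        for x in range(size):
            for y in range(size):
                old = theta[mu][x][y]
                s_old = local_action(theta, mu, x, y, size)
                theta[mu][x][y] = old + rng.uniform(-step, step)
                s_new = local_action(theta, mu, x, y, size)
                if rng.random() < acceptance_probability(s_new - s_old, beta):
                    accepted += 1
                else:
                    theta[mu][x][y] = old  # reject: restore the old link
    return accepted / (2 * size * size)


def average_plaquette(theta):
    size = len(theta[0])
    total = sum(math.cos(plaquette_angle(theta, x, y, size))
                for x in range(size) for y in range(size))
    return total / (size * size)
```

Since the proposal is symmetric, the acceptance rule alone enforces the detailed balance relation of eqn [21]; only the change of the action in the plaquettes touching the updated link has to be computed.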

In the presence of dynamical quarks, the integral eqn [18] is converted into an integral on bosonic variables by inverting the matrix M:


\[ Z = \int \prod_{x,\mu} dU_\mu(x)\, d\phi(x)\, d\phi^\dagger(x)\, \exp\left(-S_g[U] - \phi^\dagger[M^\dagger M]^{-1}\phi\right) \qquad [22] \]

The property has been used that $\int \prod_x d\phi(x)\,d\phi^\dagger(x)\,\exp(-\phi^\dagger[M^\dagger M]^{-1}\phi) = |\det M|^2$. A Metropolis updating is then performed on the combined $U_\mu$ and $\phi$ variables. To have a choice of the trial uniform in the measure, an algorithm is commonly used which is based on ergodicity, known as hybrid molecular dynamics. A fictitious conjugate momentum is associated with all variables, and a fictitious Hamiltonian is defined by adding to the action, considered as a potential energy, the sum of the squares of the conjugate momenta. A classical evolution is then performed in time by small steps which should displace the state in phase space ergodically: the evolution is called a trajectory. After a number of steps, a Metropolis test is made as explained above.
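
The structure of such a trajectory followed by a Metropolis test can be sketched on a toy problem. The example below is not the QCD algorithm: it assumes a single real variable q with a hypothetical action $S(q) = q^2/2 + q^4/4$, a Gaussian fictitious momentum p, the fictitious Hamiltonian $H = p^2/2 + S(q)$, and a leapfrog integrator for the classical evolution. The function names are illustrative.

```python
import math
import random


def action(q):
    """Toy action playing the role of the potential energy."""
    return 0.5 * q * q + 0.25 * q ** 4


def force(q):
    """Minus the derivative of the action with respect to q."""
    return -(q + q ** 3)


def hamiltonian(q, p):
    """Fictitious Hamiltonian: kinetic term plus the action."""
    return 0.5 * p * p + action(q)


def leapfrog(q, p, step, n_steps):
    """Classical evolution in fictitious time (one trajectory)."""
    p += 0.5 * step * force(q)
    for i in range(n_steps):
        q += step * p
        if i < n_steps - 1:
            p += step * force(q)
    p += 0.5 * step * force(q)
    return q, p


def hybrid_update(q, step, n_steps, rng):
    """Draw a momentum, evolve, then accept or reject with a Metropolis test."""
    p = rng.gauss(0.0, 1.0)
    h_old = hamiltonian(q, p)
    q_new, p_new = leapfrog(q, p, step, n_steps)
    delta_h = hamiltonian(q_new, p_new) - h_old
    if delta_h <= 0.0 or rng.random() < math.exp(-delta_h):
        return q_new, True
    return q, False
```

The leapfrog scheme is reversible and conserves H up to errors of order the step size squared, so the final Metropolis test rejects only rarely when the step is small, while correcting for the discretization error of the trajectory.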

Typically, the computer time needed to produce a significant configuration is proportional to the volume V of the lattice for pure gauge systems, and to $V^{5/4}$ in the hybrid algorithm for full QCD.

As explained before, in order to have a good 
approximation to the Feynman integral the lattice 
spacing has to be small compared to the physical 
scales, for example, with respect to the Compton 
wavelength of the heaviest quark. On the other 
hand, to control volume effects it has to be large 
compared to the biggest physical length, for 
example, with respect to the Compton wavelength 
of the lightest quark. Since there is a factor $m_{\mathrm{top}}/m_{\mathrm{up}} \approx 3 \times 10^4$ between these two lengths, the lattice size needed would be prohibitive from a numerical point of view. In practice, lattices of size $L^4$ are affordable with $L \leq 64$ to $128$. For this reason, only the light quarks u, d, s are kept, which have mass smaller than the typical scale of the theory, which can be identified as the square root of the string tension. In the limit in which light quark masses are small compared to the QCD scale,
the Lagrangian is invariant under any unitary 
mixing of them. A global SU(3) invariance exists, 
which is known as flavor symmetry, and is broken 
by the difference of quark masses. Heavier 
quarks can be described by an effective theory, 
since they have negligible dynamical effects at low 
energies. 


A Selection of Physics Results


String Tension


A big excitement followed the first numerical 
calculations by M Creutz at the beginning of the 
1980s in which the static potential V(r) between a 
quark and an antiquark was computed in pure- 
gauge theory on the lattice. One way to measure it is 
to measure the correlator of two Polyakov lines at a 
distance r on a significant ensemble of field config- 
urations. The Polyakov line is the parallel transport 
in the fundamental representation along the time 
axis across the lattice: with periodic boundary 
conditions it is a closed loop, and hence it is gauge 
invariant. It can be proved that the log of this correlator is equal to $-V(r)aL_t$, with $L_t$ the extension of the lattice in the time direction. It was found that


\[ V(r) = \sigma r \qquad [23] \]


The parameter $\sigma$ is known as string tension. A potential of the form eqn [23] means confinement: an infinite amount of energy is required to pull apart the particles at infinite distance. The parameter $\sigma$ can be determined phenomenologically from the
mass spectrum of the mesons and o2; ~ 1 GeV. 
What is measured on the lattice is 


\[ \sigma\, a(\beta)^2\, n \qquad [24] \]


where n is the distance of the two Polyakov lines in lattice spacings and $a(\beta)$ the lattice spacing in physical units. In fact, the computer only produces pure numbers. If the lattice QCD belongs to the same universality class of QCD at the critical point, that is, if the lattice really defines QCD, the dependence of $a(\beta)$ on $\beta$ is dictated by the $\beta$-function of the renormalization group. At sufficiently large $\beta = 6/g^2$,


\[ a(\beta) \simeq \frac{1}{\Lambda_{\mathrm{latt}}}\exp\left(-\frac{\beta}{12b_0}\right) \qquad [25] \]


with $b_0 = (11/3)N_c/16\pi^2$. $\Lambda_{\mathrm{latt}}$ is the energy scale of the theory. The measurement of the potential gives indeed a dependence of the lattice spacing on $\beta$ consistent with eqn [25] and allows one to determine $\sigma/\Lambda^2_{\mathrm{latt}}$. The absolute value of the lattice spacing can be determined by comparison with the physical value of the string tension. The theory is able to produce a physical scale. The correlation length is finite and as a consequence the infrared limit of the Feynman integral exists.


Mass Spectrum 


Any operator with the quantum numbers of a 
particle can be used as interpolating field for it. 
The correlator of the operator at large distances behaves like a sum of exponentials $\exp(-mr)$ with m the masses of the particles with the same quantum numbers. At large distances the lightest particle dominates, especially if the operator has a good overlap, that is, if its matrix element between vacuum and the state of the particle is the biggest. From the correlators mr can be determined. On the lattice $r = na(\beta)$ so that, by eqn [25], what is really determined is the ratio $m/\Lambda_{\mathrm{latt}}$. If $\Lambda_{\mathrm{latt}}$ has been determined, for example, from the string tension, the mass of the particle results in physical units. Alternatively, the ratios of any two masses can be determined and the scale fixed by the value of one of them. A good agreement is obtained already in pure gauge (quenched approximation) indicating that the quark loops are relevant at the level of 10% typically. This fact supports the idea that the large-$N_c$ limit is a good approximation to reality, quark loops being nonleading in that limit. The light
particle masses are more difficult to compute, 
being sensitive to the masses of light quarks which 
cannot be taken at realistic values due to computa- 
tional difficulties: large lattices are required and big 
fluctuations are present near the chiral point. The 
spectrum of particles made of heavy quarks can be 
computed using effective theories, and nicely fits 
experiment. A byproduct is a precise determination 
of the gauge coupling constant, competitive with 
phenomenological determinations from short dis- 
tance perturbative QCD. 
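
The extraction of a mass from the exponential decay of a correlator can be illustrated as follows. This is a hedged sketch on synthetic data, not an analysis of real lattice measurements: it assumes a correlator sampled at integer separations n, uses the common "effective mass" estimate $\ln[C(n)/C(n+1)]$ in lattice units, and converts to MeV with $\hbar c \approx 197.327$ MeV fm for an assumed lattice spacing. All function names are hypothetical.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar * c in MeV fm


def synthetic_correlator(amplitudes, masses, n_max):
    """Sum of exponentials A_i * exp(-m_i * n) for n = 0 .. n_max - 1."""
    return [sum(a * math.exp(-m * n) for a, m in zip(amplitudes, masses))
            for n in range(n_max)]


def effective_masses(correlator):
    """m_eff(n) = ln[C(n) / C(n + 1)], in lattice units."""
    return [math.log(correlator[n] / correlator[n + 1])
            for n in range(len(correlator) - 1)]


def mass_in_mev(mass_lattice_units, spacing_fm):
    """Convert a mass in lattice units to MeV for a given lattice spacing in fm."""
    return mass_lattice_units * HBARC_MEV_FM / spacing_fm
```

At short separations the estimate is contaminated by heavier states with the same quantum numbers; it settles on the lightest mass at large separations, which is why a good overlap of the operator with the lightest state helps.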


Weak Interaction Matrix Elements 


There exist matrix elements of currents (or products 
thereof) entering in weak amplitudes which involve 
large distances and are not computable in perturba- 
tion theory. Lattice can be used to evaluate them. 
Renormalization problems can appear in this 
approach when the cutoff is removed, which, 
however, are not difficulties of principle but only 
of technical nature. This activity is of fundamental 
importance to have precise predictions in order to 
understand the limits of the standard model. 


Finite-Temperature QCD and the Deconfinement 
Transition 


The static thermodynamics of a system of fields is 
described by the partition function 


\[ Z_T = \mathrm{tr}[\exp(-H/T)] \qquad [26] \]


It is easy to show that $Z_T$ is equal to the Euclidean Feynman integral on the imaginary time interval (0, 1/T) with boundary conditions in time periodic for bosons and antiperiodic for fermions. Indeed, the Boltzmann factor is formally an imaginary time evolution by 1/T. A lattice of extension $L_tL_s^3$ with $L_s \gg L_t$ provides the partition function at a temperature $T = 1/aL_t$, if a is the lattice spacing in physical units.
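
As a small worked example of the relation $T = 1/aL_t$, the sketch below converts a lattice spacing in fm and a temporal extent in lattice units into a temperature in MeV. It assumes natural units with the conversion constant $\hbar c \approx 197.327$ MeV fm; the function names and the numerical inputs are illustrative only.

```python
HBARC_MEV_FM = 197.327  # conversion constant hbar * c in MeV fm


def temperature_mev(spacing_fm, n_t):
    """Temperature T = 1 / (a * L_t) in MeV for spacing a in fm and L_t sites."""
    return HBARC_MEV_FM / (spacing_fm * n_t)


def spacing_for_temperature(temperature, n_t):
    """Lattice spacing in fm that realizes a given temperature (MeV) at fixed L_t."""
    return HBARC_MEV_FM / (temperature * n_t)


if __name__ == "__main__":
    print(temperature_mev(0.1, 8))
```

At fixed temporal extent the temperature is raised by decreasing the lattice spacing, that is, by increasing $\beta$.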

Finite-temperature simulations are important to 
investigate the transition from the phase in which 
color is confined to a phase in which quarks and 
gluons can propagate as free particles. This phase is 
called deconfined phase or quark gluon plasma. 

Big experiments at Brookhaven and at CERN are 
looking for this phase transition in high-energy 
collisions between heavy nuclei, but no definite 
evidence has yet been produced for it. Lattice 
simulations instead definitely prove that such a 
transition exists. For pure SU(3) gauge theory 
(quenched) at $T \approx 270$ MeV, a first-order phase transition is observed, at which the string tension vanishes. In a more realistic theory with dynamical quarks, a transition is also observed at $T \approx 160$ MeV, where chiral symmetry, which is spontaneously broken at zero temperature, is restored. This transition is also associated with deconfinement even if, in the presence of light quarks, the string tension does not exist. Indeed, when pulling apart a quark and an antiquark, an instability for production of quark-antiquark pairs sets in when the potential energy becomes large enough, which physically manifests itself as a production of light mesons. An alternative order parameter is needed. The possibility of defining alternative order parameters is discussed in the next section.

The equation of state can also be studied relating 
internal energy to pressure, which is useful to 
understand heavy ion collisions. 

From the features of the deconfinement transition, 
information can be extracted on the mechanisms by 
which QCD confines color. 

A connected issue is the behavior of QCD at 
nonzero baryon density or chemical potential. The 
corresponding thermodynamics is described by a 
grand canonical ensemble 


\[ Z_\mu = \mathrm{tr}\{\exp[-(H + \mu N)/T]\} \qquad [27] \]


where $N = \int d^3x\, \psi^\dagger\psi$ is the baryon number operator and $\mu$ the chemical potential. In the process of converting the partition function $Z_\mu$ into a Feynman integral, the term H at the exponent of eqn [27] generates the Euclidean action, which is real. The term proportional to N becomes imaginary. The integral is well defined, but the analogy with a four-dimensional statistical mechanics is broken, the effective Hamiltonian being non-Hermitian, and no sampling can be made. Approximate methods have been developed, but the problem is open. Exploring numerically the region of phase space with $\mu \neq 0$ would be interesting, since a rich structure is expected, which could be relevant to dense systems such as neutron stars.


Mechanisms of Color Confinement 


Understanding how QCD manages to confine color 
is one of the most fascinating problems in field 
theory. 

To prove confinement, one should, in principle, 
prove that, at zero temperature, no gauge-invariant 
quasilocal operator exists, carrying nontrivial color 
and obeying cluster property at large distances. This 
proof is not known. There exists evidence from
lattice simulations that a string tension exists, as
discussed before. In any case, a guess can be made of
the physical mechanism of confinement. If confine- 
ment is an absolute property reflecting a symmetry 
property of the vacuum, an order parameter should 
exist which discriminates between confined and 
deconfined phase, and the transition between the 
two phases has to be a true transition. Observing a 
crossover in some part of the boundary between the 
two phases would disprove this view. A lattice 
determination of the order of the deconfining 
transition is therefore of fundamental importance. 

A possible mechanism of confinement proposed by
G 't Hooft is dual superconductivity of the vacuum:
dual means interchange of electric with magnetic 
with respect to ordinary superconductors. In the same 
way as the magnetic field is constrained into 
Abrikosov flux tubes in an ordinary superconductor, 
the chromoelectric field acting between a quark and 
an antiquark would be constrained into flux tubes by 
a dual Meissner effect producing an energy propor- 
tional to the distance, or a string tension. 

This mechanism can be investigated by lattice 
simulations, by checking if any magnetically charged 
operator exists whose vacuum expectation value is 
nonzero in the confined phase signaling condensation 
of magnetic charges and zero in the deconfined phase. 
Progress has been made in this direction, which, however, is not yet conclusive. Chromoelectric flux tubes between $q$-$\bar q$ pairs are observed in lattice field configurations.


Topology 


Euclidean QCD admits classical solutions with finite 
action and with a nontrivial topology which makes 
them stable. These solutions, known as instantons 
or multi-instantons, realize a mapping of the three- 
dimensional sphere at infinity on the gauge group, and 
the topological charge is the winding number of this 
mapping. The Jacobian of this mapping is the Chern current $K_\mu$, and its divergence $\partial_\mu K_\mu(x) = Q(x)$ is the density of topological charge. $Q = \int d^4x\, Q(x)$ is the topological charge which has integer values. Explicitly,


\[ Q(x) = \frac{g^2}{16\pi^2}\,\mathrm{tr}[G_{\mu\nu}\tilde G_{\mu\nu}] \qquad [28] \]


with G7, = (1/2)e,,,, Gpo the dual field strength tensor. 
O(x) plays an important role in hadron physics, 
being related to the anomaly of the flavor singlet 
axial current J^ = 3^, v^», vy. J} is conserved at the 
classical level in the chiral limit 7:7 —0, but this 
symmetry does not survive quantization. In fact, 


aJa = 2N;Q(x) [29] 


A consequence of eqn [29] is the high mass $m_{\eta'} \sim 1$ GeV
of the flavor singlet partner $\eta'$ of the
pseudoscalar flavor octet. An $N_c \to \infty$ argument by
Witten and Veneziano relates $m_{\eta'}$ to the response of
the quenched (no quark) vacuum to topological
excitation, the topological susceptibility
$\chi = \int d^4x\, \langle 0|T\,Q(x)Q(0)|0\rangle$. The relation is

$$\frac{2N_f}{f_\pi^2}\,\chi = m_\eta^2 + m_{\eta'}^2 - 2m_K^2 + O(1/N_c) \qquad [30]$$

This approximate relation has been checked on the
lattice. $\chi$ has been determined by different methods
which agree in confirming it. This is an important
verification of QCD.

Instantons are stable solutions in the continuum,
approximately stable in the lattice discretized version.
A cooling procedure which locally freezes
short-distance quantum fluctuations would leave
the instantons untouched if they were stable. On
the lattice the instanton is stable anyhow if the
distance in correlation reached by the local cooling
procedure is small compared to the size of the
instanton: cooling is indeed a diffusion process and
the distance involved grows as the square root of the
number of cooling iterations. Instanton configurations
can nicely be exposed by cooling.


See also: Anomalies; Quantum Chromodynamics; 
Renormalization: General Theory; Spin Foams; 
Symmetry Breaking in Field Theory.


Introduction


The Leray-Schauder theory gives a powerful and 
versatile continuation method for proving the 
existence, multiplicity, and bifurcation of solutions 
of nonlinear operator, differential and integral 
equations. Let $X$ and $Y$ be topological spaces, $A \subset X$,
$f : X \to Y$ a continuous mapping, and $y \in Y$. The
fundamental idea of a continuation method to solve
the equation $f(x) = y$ in $A$ consists in embedding it into
a one-parameter family of equations

$$F(x, \lambda) = z(\lambda) \qquad [1]$$

where the continuous functions $F : X \times [0,1] \to Y$,
$z : [0,1] \to Y$ are chosen in such a way that
$F(\cdot, 1) = f$, $z(1) = y$ and

1. equation $F(x, 0) = z(0)$ has a nonempty set of
solutions in $A$;

2. one of those solutions at least can be continued
into a solution in $A$ of [1] for each $\lambda \in [0,1]$.


Simple examples show that Assertion 2 can be
violated when all solutions of [1] leave $A$ after some
$\lambda^* \in\, ]0,1[$. A way to avoid such a situation consists
in "closing the boundary," through the "boundary
condition":

$$F(x, \lambda) \neq z(\lambda) \quad \text{for each } (x, \lambda) \in \partial A \times [0,1]$$

When this condition is satisfied, Assertion 2 can
still fail when two existing solutions for $\lambda$ small
disappear after coalescing at some $\lambda_0 < 1$. Losing all
solutions through this process can be eliminated by
reinforcing Assumption 1 into


1'. Equation $F(x, 0) = z(0)$ has a "robust" nonempty
set of solutions in $A$.


This statement can be made precise through the 
concept of topological degree of a mapping, an 
“algebraic” count of the number of its zeros. In a 
finite-dimensional setting, this concept was intro- 
duced by Kronecker for smooth mappings and 
by Brouwer for continuous mappings. Its extension 
by Leray and Schauder to some classes of mappings 
in Banach spaces made much wider applications 
to nonlinear differential and integral equations 
possible. 


Topological Degree of a Mapping 


If $U \subset \mathbb{R}^n$ is a bounded open set, $z \in \mathbb{R}^n$ and
$F : \overline{U} \to \mathbb{R}^n$ is a $C^1$ mapping such that $z \notin F(\partial U)$
and $\det F'(x) \neq 0$ on $F^{-1}(z)$, the Brouwer degree
$\deg_B[F, U, z]$ is defined (analytically) by

$$\deg_B[F, U, z] := \sum_{x \in F^{-1}(z)} \operatorname{sign} \det F'(x) = \sum_{x \in F^{-1}(z)} (-1)^{\sigma(x)}$$

where $\sigma(x)$ is the sum of the multiplicities of the
negative eigenvalues of $F'(x)$. The case of a
continuous $F$ such that $z \notin F(\partial U)$ is treated by
approximating $F$ through mappings of the above
type, and showing that the corresponding degrees
stabilize to a unique value, defining $\deg_B[F, U, z]$ in
the general case. This number remains the same
under sufficiently small perturbations of $F$ and/or $z$,
which expresses the "robustness" mentioned above.
When $n = 2$ and $U$ is bounded by a closed Jordan
curve, then $\deg_B[F, U, 0]$ is nothing but the winding
number of $F/\|F\|$ along $\partial U$.
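
The two descriptions of the planar Brouwer degree given above can be compared numerically. The following is only an illustrative sketch under stated assumptions: the map $F(x, y) = (x^2 - y^2, 2xy)$, the unit disc as $U$, the regular value $z = (0.25, 0)$, and all function names are choices made for this example and do not come from the text. For $z \neq 0$ the winding number is taken for $F - z$ along $\partial U$.

```python
import cmath
import math

def winding_number(F, z, n_samples=4096):
    """Winding number of F - z around the origin along the unit circle."""
    total = 0.0
    prev = None
    for k in range(n_samples + 1):
        t = 2.0 * math.pi * k / n_samples
        u, v = F(math.cos(t), math.sin(t))
        w = complex(u - z[0], v - z[1])
        if prev is not None:
            # accumulate the small change of argument between samples
            total += cmath.phase(w / prev)
        prev = w
    return round(total / (2.0 * math.pi))

def square_map(x, y):
    """Illustrative map F(x, y) = (x^2 - y^2, 2xy), i.e. w -> w^2."""
    return x * x - y * y, 2.0 * x * y

def jacobian_det(x, y):
    # det F'(x, y) for square_map
    return 4.0 * (x * x + y * y)

def sign_det_sum(preimages):
    """Analytic formula: sum of sign det F'(x) over the preimages of z."""
    return sum(1 if jacobian_det(x, y) > 0 else -1 for x, y in preimages)

z = (0.25, 0.0)                          # a regular value of square_map
preimages = [(0.5, 0.0), (-0.5, 0.0)]    # the two solutions of F = z
degree_analytic = sign_det_sum(preimages)
degree_winding = winding_number(square_map, z)
```

Both counts agree for this example, and an orientation-reversing map such as $(x, y) \mapsto (x, -y)$ gives the opposite sign of winding.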

Leray and Schauder have extended Brouwer
degree to the important class of compact perturbations
of identity in a normed space. A compact
mapping $f : A \to B$ between metric spaces is a
continuous mapping on $A$ such that $f(A)$ is relatively
compact. If $f : A \to B$ is continuous and compact on
each bounded $B \subset A$, $f$ is called "completely
continuous" on $A$.

If $X$ is a real normed space, $U \subset X$ an open bounded
set, $f : \overline{U} \to X$ compact, and $z \notin (I - f)(\partial U)$, the
Leray-Schauder degree $\deg_{LS}[I - f, U, z]$ of $I - f$ in
$U$ over $z$ is constructed from Brouwer degree by
approximating the compact mapping $f$ over $\overline{U}$ by
mappings $f_\varepsilon$ with range in a finite-dimensional
subspace $X_\varepsilon$ of $X$ containing $z$. One shows that the values
of the Brouwer degrees $\deg_B[(I - f_\varepsilon)|_{X_\varepsilon}, U \cap X_\varepsilon, z]$
stabilize for sufficiently small positive $\varepsilon$ to a common
value which defines $\deg_{LS}[I - f, U, z]$.

Again, this topological degree is an algebraic
count of the number of elements of $(I - f)^{-1}(z)$,
equal to 0 when $z \notin (I - f)(U)$. When $f$ is of class
$C^1$, and $I - f'(x)$ invertible at each fixed point
$x \in (I - f)^{-1}(z)$, $(I - f)^{-1}(z)$ is finite and the
Leray-Schauder formula holds:

$$\deg_{LS}[I - f, U, z] = \sum_{x \in (I - f)^{-1}(z)} (-1)^{\sigma(x)} \qquad [2]$$

where $\sigma(x)$ is the sum of the algebraic multiplicities
of the eigenvalues of $f'(x)$ contained in $]1, +\infty[$.
Let $I = [0,1]$. For $A \subset X \times I$, and $\lambda \in I$, we write
$A_\lambda = \{x \in X : (x, \lambda) \in A\}$. The Leray-Schauder degree
inherits the basic properties of Brouwer degree:


1. Additivity. If $U = U_1 \cup U_2$, where $U_1$ and $U_2$
are open and disjoint, and if $z \notin (I - f)(\partial U_1) \cup (I - f)(\partial U_2)$, then

$$\deg_{LS}[I - f, U, z] = \deg_{LS}[I - f, U_1, z] + \deg_{LS}[I - f, U_2, z]$$

2. Existence. If $\deg_{LS}[I - f, U, z] \neq 0$, then
$z \in (I - f)(U)$.

3. Homotopy invariance. Let $\Omega \subset X \times I$ be a
bounded open set, and let $F : \overline{\Omega} \to X$ be compact.
If $x - F(x, \lambda) \neq z$ for each $(x, \lambda) \in \partial\Omega$, then
$\deg_{LS}[I - F(\cdot, \lambda), \Omega_\lambda, z]$ is independent of $\lambda$.
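
In one dimension these properties can be checked by hand, since the degree of a continuous $g$ on an interval $]a, b[$ over 0 only depends on the signs of $g(a)$ and $g(b)$. The sketch below is an illustration, not part of the original text: the two families `fold` and `cubic` and all names are assumptions chosen for the example. The family with degree 0 loses its zeros along the homotopy, while the family with degree 1 keeps at least one, as the existence and homotopy invariance properties predict.

```python
def sign(v):
    return (v > 0) - (v < 0)

def degree_1d(g, a, b):
    """Brouwer degree of g on ]a, b[ over 0, assuming g(a) != 0 != g(b)."""
    return (sign(g(b)) - sign(g(a))) // 2

def fold(lam):
    # g(x) = x^2 - 1/4 + lam: two zeros for lam < 1/4, none for lam > 1/4
    return lambda x: x * x - 0.25 + lam

def cubic(lam):
    # g(x) = x^3 - (1 - lam) x: three zeros at lam = 0, one zero at lam = 1
    return lambda x: x ** 3 - (1.0 - lam) * x

# Neither family vanishes on the boundary, so the degree cannot change
# along the homotopy parameter lam in [0, 1].
fold_degrees = [degree_1d(fold(k / 10), -1.0, 1.0) for k in range(11)]
cubic_degrees = [degree_1d(cubic(k / 10), -2.0, 2.0) for k in range(11)]
```

Here `fold_degrees` is identically 0 (the two zeros at $\lambda = 0$ carry opposite signs and may coalesce), whereas `cubic_degrees` is identically 1.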


In particular, if $a$ is an isolated fixed point of $f$,
and $B(a, r)$ denotes the open ball of center $a$ and
radius $r$, $\deg_{LS}[I - f, B(a, r), 0]$ is defined and
independent of $r$ for sufficiently small $r > 0$. Its value is
called the "Leray-Schauder index" of $I - f$ at $a$, and
denoted by $\operatorname{ind}_{LS}[I - f, a]$.


Fixed-Point Theorems for Compact 
Perturbations of Identity in a Normed 
Space 


An important application of Leray-Schauder degree
is the derivation of general fixed-point theorems for
compact mappings in normed spaces based on
continuation along a parameter. If $F : A \subset X \times I \to X$,
we denote by $\Sigma$ the (possibly empty)
solution set defined by

$$\Sigma = \{(x, \lambda) \in A : x = F(x, \lambda)\}$$

Let $\Omega \subset X \times I$ be a bounded open set and
$F : \overline{\Omega} \to X$ be a compact mapping. The general
Leray-Schauder fixed-point theorem goes as follows:


Theorem If the following conditions hold:


(i) $\Sigma \cap \partial\Omega = \emptyset$ (a priori estimate)

(ii) $\deg_{LS}[I - F(\cdot, 0), \Omega_0, 0] \neq 0$ (degree condition),

then $\Sigma$ contains a continuum $C$ along which
$\lambda$ takes all values in $I$. In other words, $\Sigma$
contains a compact connected subset $C$ connecting
$\Sigma_0$ to $\Sigma_1$. If one refines Assumption (ii) into

(iii) $\Sigma_0$ is a finite nonempty set $\{a_1, \ldots, a_\mu\}$ and
$\operatorname{ind}_{LS}[I - F(\cdot, 0), a_1] \neq 0$,

the conclusion takes the form of an "alternative": if assumptions
(i) and (iii) hold, then $(a_1, 0)$ belongs either to
a continuum in $\Sigma$ containing one of the points
$(a_2, 0), \ldots, (a_\mu, 0)$, or to a continuum in $\Sigma$
along which $\lambda$ takes all the values in $I$.


Condition (iii) automatically holds in the following
important special case: If $\Sigma \cap \partial\Omega = \emptyset$, $F(\cdot, 0) = 0$,
and $0 \in \Omega_0$, then $\Sigma$ contains a continuum $C \ni (0, 0)$
along which $\lambda$ takes all values in $I$. When dealing with
the fixed-point problem $x = f(x)$ with $f : \overline{U} \subset X \to X$
compact, $U$ open and bounded, a natural choice is
$F(x, \lambda) = \lambda f(x)$, $\Omega = U \times I$, giving the statement: If
$0 \in U$ and if $x \neq \lambda f(x)$ for each $(x, \lambda) \in \partial U \times I$, then
$\{(x, \lambda) \in \overline{U} \times I : x = \lambda f(x)\}$ contains a continuum
$C \ni (0, 0)$ along which $\lambda$ takes all values in $I$.

Condition (i) requires the a priori knowledge of
the localization of the solution set $\Sigma$ and is in
general very difficult to check. An important special
case occurs when $\Sigma$ is a priori bounded: if $F$ is
completely continuous on $X \times I$, $F(\cdot, 0) = 0$, and
$\Sigma \subset B(r) \times I$ for some $r > 0$, then $\Sigma$ contains a
continuum $C \ni (0, 0)$ along which $\lambda$ takes all values
in $I$. Its special case with $F(x, \lambda) = \lambda f(x)$ can be
stated as Schaefer's alternative: Let $f : X \to X$ be
completely continuous. Then either there exists, for
each $\lambda \in [0,1]$, at least one $x \in X$ such that
$x = \lambda f(x)$, or the fixed point set
$\{x \in X : x = \lambda f(x),\ 0 < \lambda < 1\}$ is unbounded in $X$. Schaefer's
alternative is equivalent to the following Schauder
fixed-point theorem:


Theorem Any compact mapping $f : \overline{B}(r) \to \overline{B}(r)$ has
a fixed point.
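
The continuation $x = \lambda f(x)$ behind Schaefer's alternative can be followed numerically in the simplest setting $X = \mathbb{R}$. The sketch below is illustrative only and rests on assumptions that are not in the text: $f = \cos$ is chosen as a bounded map (so that $|x| \le \lambda$ is an a priori bound for every solution), and plain fixed-point iteration is used to track the branch, which works here because $\lambda \cos$ happens to be a contraction near the solution. The degree argument itself needs no such contraction property.

```python
import math

def continue_fixed_point(f, steps=20, inner_iterations=200):
    """Follow solutions of x = lam * f(x) from lam = 0 (x = 0) to lam = 1."""
    x = 0.0
    branch = []
    for k in range(steps + 1):
        lam = k / steps
        # start from the solution found for the previous value of lam
        for _ in range(inner_iterations):
            x = lam * f(x)
        branch.append((lam, x))
    return branch

branch = continue_fixed_point(math.cos)
lam_end, x_end = branch[-1]   # x_end approximates the fixed point of cos
```

The computed branch starts at $(0, 0)$, stays inside the a priori bound, and reaches $\lambda = 1$, where it gives a fixed point of $f$.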


A simple consequence of Schauder's theorem is
that, for any continuous and bounded $g : \mathbb{R} \to \mathbb{R}$,
any open bounded $D \subset \mathbb{R}^n$, any $\lambda$ different from an
eigenvalue of $-\Delta$ on $D$ with Dirichlet boundary
conditions, the nonlinear Dirichlet problem

$$\Delta u + \lambda u + g(u) = h(x) \ \text{in } D, \qquad u = 0 \ \text{on } \partial D$$

has a weak solution for each $h \in L^2(D)$.

An interesting consequence of the Leray-Schauder
theorem with $\Sigma$ a priori bounded is that, for any
bounded domain $D \subset \mathbb{R}^n$ with $\partial D$ of class $C^2$, the
Dirichlet problem for the equation of surfaces with
constant mean curvature $\Lambda$

$$(1 + \|\nabla u\|^2)\Delta u - \sum_{i,j=1}^{n} \partial_i u\, \partial_j u\, \partial_i \partial_j u = n\Lambda (1 + \|\nabla u\|^2)^{3/2}$$

has a unique solution for arbitrary smooth boundary
data if and only if the mean curvature of the boundary
$\partial D$ is everywhere greater than $[n/(n - 1)]|\Lambda|$.

The use of auxiliary continuous functionals gives 
a fixed-point theorem in the absence of a priori 
bounds: 


Theorem (Capietto-Mawhin-Zanolin). Let $\Omega \subset X \times I$
be an open set and $F : \overline{\Omega} \to X$ be completely
continuous. If $\Sigma_0$ is bounded,
$\deg_{LS}[I - F(\cdot, 0), U_0, 0] \neq 0$ for some open bounded neighborhood
$U_0$ of $\Sigma_0$, and if there exists a continuous
mapping $\varphi : X \times I \to \mathbb{R}_+$, proper on $\Sigma$, and
$c_- < \min_{\Sigma_0} \varphi(\cdot, 0) \le \max_{\Sigma_0} \varphi(\cdot, 0) < c_+$ such that Y? g
(c_,c,} and X! g[c ,c,], then $\Sigma$ contains a
continuum $C$ along which $\lambda$ takes all values in $I$.


This result implies, for example, that for $g : \mathbb{R} \to \mathbb{R}$
continuous, odd and superlinear ($\lim_{|u| \to \infty} g(u)/u = +\infty$),
and $p : [0,1] \times \mathbb{R}^2 \to \mathbb{R}$ with at most linear
growth in $u$ and $u'$ at infinity, the two-point
boundary-value problem

$$u'' + g(u) = p(t, u, u'), \qquad u(0) = u(1) = 0$$

has, for all sufficiently large $j$, at least one solution
$u_j$ having exactly $j + 1$ zeros on $[0,1]$, and
$\|u_j\|_\infty \to \infty$ if $j \to \infty$.


Extensions of Leray-Schauder degree 


Fixed-point theorems for operators between suitable 
nonlinear spaces can also be proved using topologi- 
cal continuation arguments. For example, if C C X 
is a nonempty convex set, one has the following 
extension of a result of the previous section to 
mappings in $C$: if $U \subset C$ is open and bounded,
$F : \mathrm{cl}_C U \times I \to C$ compact and such that $x \neq F(x, \lambda)$
for each $(x, \lambda) \in \partial_C U \times I$, $F(\cdot, 0) = x_0 \in U$, then
$F(\cdot, \lambda)$ has a fixed point in $U$ for each $\lambda \in I$. The
special case where $C$ is a wedge is useful in finding
positive solutions of nonlinear differential or integral
equations. For nonlinear spaces, the degree has
to be replaced by the fixed-point index,
which generalizes both the "Hopf-Lefschetz number"
and Leray-Schauder degree.

The Leray-Schauder degree also has been 
extended to other classes of operators. Compact 
operators can be replaced by k-set-contractive or 
condensing mappings f, with respect to various 
measures of noncompactness, and fixed-point problems
can be replaced by problems of the form
$x \in F(x)$ for multivalued mappings $F$. Equivariant degree
theories have been developed when $U$ is invariant
and $f$ equivariant with respect to the action of some
compact Lie group $G$ on $X$. The special case of
$G = S^1$ is of special importance in the study of
periodic solutions of autonomous differential sys- 
tems. Degree theories have also been constructed for 
various classes of mappings between two different 
Banach spaces or manifolds, which include mono- 
tone-like and nonlinear Fredholm operators. We just 
describe a simple but useful situation in this 
direction. 

Many differential equations, when expressed as
equations in an abstract space, do not have the
fixed-point form but can be written as $Lx = Nx$ with
$L : D(L) \subset X \to Z$ linear, $N : U \to Z$, $X$ and $Z$ real
normed spaces. If $L$ is invertible, the equation is
trivially equivalent to the fixed-point problem
$x = L^{-1}Nx$, to which Leray-Schauder theory can be
applied when $L^{-1}N$ is compact. The situation is
more delicate when $L$ has no inverse. If $L$ is a linear
Fredholm mapping of index zero (its range $R(L)$ is
closed and has a finite codimension equal to the
dimension of its null space $N(L)$), the set $\mathcal{F}(L)$ of
linear continuous mappings of finite rank $A : X \to Z$
such that $L + A : D(L) \to Z$ is a bijection is nonempty,
and, for a mapping $G : E \subset X \to Z$, the compactness of
$(L + A)^{-1}G$ does not
depend upon the choice of $A \in \mathcal{F}(L)$. $G$ is then called
"L-compact" on $E$, and "L-completely continuous"
on $E$ when compact on each bounded set of $E$.

The following continuation theorem for perturbed
Fredholm mappings of index zero holds.


Theorem Let $\Omega \subset X \times I$ be open and bounded,
$L : D(L) \subset X \to Z$ linear Fredholm of index zero,
$N : \overline{\Omega} \to Z$ L-compact, and let
$\Sigma := \{(x, \lambda) \in (D(L) \times I) \cap \overline{\Omega} : Lx = N(x, \lambda)\}$. If


(i) $\Sigma \cap \partial\Omega = \emptyset$ (a priori estimate),
(ii) $N(\Omega_0 \times \{0\}) \subset Y$, with $Y \oplus R(L) = Z$ (transversality
condition), and
(iii) $\deg_B[N(\cdot, 0)|_{\ker L}, \Omega_0 \cap \ker L, 0] \neq 0$ (degree
condition)


then $\Sigma$ contains a continuum $C$ along which $\lambda$ takes
all values in $I$.


When dealing with equation $Lx = f(x)$ with $f$
L-completely continuous, an interesting special case
of the above result follows from the choice
$N(x, \lambda) = \lambda f(x) + (1 - \lambda) Q f(x)$, with $Q : Z \to Z$ a
projector such that $N(Q) = R(L)$. In this case, the
homotopy is equivalent to

$$Lx = \lambda f(x) \quad (\lambda \in\, ]0,1])$$
$$Q f(x) = 0, \quad x \in N(L) \quad (\lambda = 0)$$


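An application (among many) of this result,
for $g : \mathbb{R} \to \mathbb{R}$ continuous such that
$-\infty < \limsup_{u \to -\infty} g(u) < \liminf_{u \to +\infty} g(u) < +\infty$, $D \subset \mathbb{R}^n$
open, bounded, $\lambda_j$ an eigenvalue of the Dirichlet
problem for $-\Delta$ on $D$, is the weak solvability of the
nonlinear problem

$$\Delta u + \lambda_j u + g(u) = h(x) \ \text{in } D, \qquad u = 0 \ \text{on } \partial D$$

for each $h \in L^2(D)$ such that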


| b(x)e(x) dx < Him sup eu) 
JD 


X | ox) dx - |lim inf s)] | e o dx 
JD H— 4-00 D 


for all eigenfunctions $\varphi$ associated to $\lambda_j$. The
addition of the nonlinearity $g$ "widens" the range
$\{h \in L^2(D) : \int_D h\varphi = 0\}$ of the corresponding linear
problem.


Bifurcation Theory 


Leray-Schauder degree is a powerful tool in bifurcation
theory, where, given a family $\mathcal{F}$ of solutions,
one tries to detect and analyze other ones branching
or bifurcating from $\mathcal{F}$. Consider the equation

$$x = \lambda L x + R(x, \lambda) \qquad [3]$$

in a real normed space $X$, where $L : X \to X$, linear,
and $R : X \times \mathbb{R} \to X$ are completely continuous, and
$R(0, \lambda) = 0$ for each $\lambda \in \mathbb{R}$. Thus, $\{(0, \lambda) : \lambda \in \mathbb{R}\}$ is
the trivial solution set of [3]. A bifurcation point
$(\lambda^*, 0)$ for [3] is the limit of a sequence $(\lambda_k, x_k)$ of
solutions of [3] in $\mathbb{R} \times (X \setminus \{0\})$.


If

$$\lim_{\|x\| \to 0} \frac{\|R(x, \lambda)\|}{\|x\|} = 0 \quad \text{uniformly on bounded } \lambda\text{-sets} \qquad [4]$$

it is easy to prove that if $(\lambda^*, 0)$ is a bifurcation point for
[3], then $\lambda^*$ is a characteristic value (reciprocal of an
eigenvalue) of $L$. Leray-Schauder theory gives a partial
converse to this result known as Krasnosel'skii's
bifurcation theorem:


Theorem For each real characteristic value $\lambda^*$ of $L$
with odd algebraic multiplicity, $(\lambda^*, 0)$ is a bifurcation
point of [3]. Of fundamental importance in the proof is
the special case of [2] with $f = L$ and $N(I - L) = \{0\}$.
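
A scalar model makes the mechanism visible. The following sketch is illustrative only and uses assumptions that are not in the text: $X = \mathbb{R}$, $L$ the identity (so that $\lambda^* = 1$ is the only characteristic value, of multiplicity one) and $R(x, \lambda) = -x^3$, which satisfies [4]. The index of the trivial solution changes sign when $\lambda$ crosses $\lambda^*$, and nontrivial solutions branch off there.

```python
import math

def nontrivial_solutions(lam):
    """Nonzero real solutions of x = lam * x - x**3, i.e. x**2 = lam - 1."""
    if lam <= 1.0:
        return []
    r = math.sqrt(lam - 1.0)
    return [-r, r]

def index_at_zero(lam):
    """Index of the trivial solution: sign of d/dx [x - lam * x] = 1 - lam."""
    return (1.0 - lam > 0) - (1.0 - lam < 0)

before = nontrivial_solutions(0.99)     # no nontrivial solution below lam* = 1
after = nontrivial_solutions(1.01)      # two small solutions just above lam*
index_jump = (index_at_zero(0.99), index_at_zero(1.01))
```

In this model the index jumps from $+1$ to $-1$ at $\lambda^* = 1$, and the solutions found for $\lambda$ slightly above 1 tend to 0 as $\lambda \to 1$, so $(1, 0)$ is a bifurcation point.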


Another fruitful concept is Krasnosel'skii's bifurcation
from infinity. We say $(\lambda^*, \infty)$ is a bifurcation
point for [3] if there exists a sequence $(\lambda_n, x_n)$ of
solutions of [3] such that $\lambda_n \to \lambda^*$ and $\|x_n\| \to \infty$.
The corresponding bifurcation result goes as follows
(Krasnosel'skii): if

$$\lim_{\|x\| \to \infty} \frac{\|R(x, \lambda)\|}{\|x\|} = 0 \quad \text{uniformly on bounded } \lambda\text{-sets} \qquad [5]$$

then, for each real characteristic value $\lambda^*$ of $L$ with
odd algebraic multiplicity, $(\lambda^*, \infty)$ is a bifurcation
point of [3].

Global versions of Krasnosel'skii's theorems can be
given, whose statements are reminiscent of Leray-Schauder's
alternative theorem. Let $S$ denote the
closure in $\mathbb{R} \times X$ of the set of $(\lambda, x) \in \mathbb{R} \times (X \setminus \{0\})$
satisfying [3]. For bifurcation from zero, one has
Rabinowitz's global bifurcation theorem:


Theorem If [4] holds and $\lambda^*$ is a real characteristic
value of $L$ with odd algebraic multiplicity, then $S$
contains a component $C$ which either is unbounded,
or contains $(\lambda^{**}, 0)$, where $\lambda^{**} \neq \lambda^*$ is a characteristic
value of $L$.


As an application, one can show that the nonlinear
Sturm-Liouville problem

$$-(p(x)u')' + q(x)u = \lambda a(x)u + b(x, u, u', \lambda) \quad (x \in\, ]0,1[)$$
$$a_0 u(0) + b_0 u'(0) = a_1 u(1) + b_1 u'(1) = 0$$

with $p \in C^1$ positive, $q$, $a$, $b$ continuous, $a$ positive,
$(a_0^2 + b_0^2)(a_1^2 + b_1^2) \neq 0$ and $b(x, u, v, \lambda) = o(|u| + |v|)$ if
$|u| + |v| \to 0$ uniformly on compact $\lambda$-intervals, has,
for each $k \in \mathbb{N}$, an unbounded component of
solutions $C_k$ in $\mathbb{R} \times C^1([0,1])$ emanating from $(\lambda_k, 0)$,
with $\lambda_k$ an eigenvalue of the problem with $b = 0$
(Rabinowitz).

One has also global bifurcation from infinity: if
[5] holds and if $\lambda^*$ is a real characteristic value of $L$
with odd algebraic multiplicity, then [3] has an
unbounded component of solutions $D$ which contains
$(\lambda^*, \infty)$.


See also: Bifurcation Theory; Bifurcations in Fluid 
Dynamics; Bifurcations of Periodic Orbits; Minimal 
Submanifolds; Minimax Principle in the Calculus of 
Variations; Partial Differential Equations: Some 
Examples; Riemann-Hilbert Problem; Topological 
Defects and Their Homotopy Classification; Viscous 
Incompressible Fluids: Mathematical Theory.


Introduction


Local continuous transformations were introduced 
by Lie as a tool for solving ordinary differential 
equations. In this program, he followed the spirit of 
Galois, who used finite groups to develop algo- 
rithms for solving algebraic equations (the general 
quadratic, cubic, and quartic), or else to prove that 
some equations (the generic quintic) could not be
solved by radicals.

Lie’s work led eventually to the definition and 
study of Lie groups. Lie groups are beautiful in their 
own right — so beautiful that they have been studied 
independently of their origin as a tool for solving 
differential equations and studying the special 
functions determined by certain classes of these 
equations. 


Lie Groups 


Lie groups exist at the interface of the two great 
divisions of mathematics: algebra and topology. 
Their algebraic properties derive from the group 
axioms. Their geometric properties arise from the 
parametrization of the group elements by points in a 
differentiable manifold. The rigidity of these structures
arises from the continuity requirements
imposed on the group composition and inversion
maps.


The algebraic axioms are standard. 


Definition A group $G$ consists of a set
$g_i, g_j, g_k, \ldots \in G$ together with a combinatorial
operation $\circ$ that satisfy the four axioms:


(i) Closure. If $g_i \in G$, $g_j \in G$, then $g_i \circ g_j \in G$.
(ii) Associativity. If $g_i, g_j, g_k \in G$, then
$(g_i \circ g_j) \circ g_k = g_i \circ (g_j \circ g_k)$.
(iii) Identity. There is a unique operation $e \in G$ that
satisfies $e \circ g_i = g_i = g_i \circ e$.
(iv) Inverse. Every group operation $g_i \in G$ has an
inverse, denoted $g_i^{-1}$, that satisfies
$g_i \circ g_i^{-1} = e = g_i^{-1} \circ g_i$.

Lie groups have more structure than groups. In
particular, each $g_i \in G$ is a point in an $n$-dimensional
manifold $M^n$. That is, the subscript $i$
actually identifies a point $x \in M^n$, so that we
can write $g_i = g(x)$ or most simply $g_i = x$.
The group multiplication can be expressed in the
form $g_i \circ g_j = g_k \to g(x) \circ g(y) = g(z)$, where $x \in M^n$,
$y \in M^n$, $z = \phi(x, y) \in M^n$. The group inversion map
can be expressed in the form $g(x) \to g(x)^{-1} = g(y)$,
$y = \psi(x) \in M^n$. The topological axioms for
Lie groups can be taken as:


(v) Continuity of composition. The mapping
$z = \phi(x, y)$ defined by the group composition
law is differentiable.

(vi) Continuity of inversion. The mapping $y = \psi(x)$
defined by the group inversion law is
differentiable.


The dimension of the Lie group is the dimension 
of the manifold that parametrizes the operations in 
the group. 

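The most familiar examples of Lie groups consist
of $n \times n$ nonsingular matrices over the fields $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Q}$
of real numbers, complex numbers, and quaternions.
For example, the set of $2 \times 2$ real unimodular
matrices

$$\left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc = 1 \right\}$$

is a three-dimensional submanifold embedded in
$\mathbb{R}^{2 \times 2} = \mathbb{R}^4$.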


Matrix Lie Groups 


Not every Lie group is a matrix group. Yet, it is a
surprising and useful result that almost every Lie
group encountered in physics is a matrix Lie
group. These are all subgroups of the general
linear groups $GL(n; F)$ of $n \times n$ nonsingular
matrices over the field $F$ ($\mathbb{R}$, $\mathbb{C}$, $\mathbb{Q}$). These groups
have real dimension $n^2 \times (1, 2, 4)$, respectively. The
special linear subgroups $SL(n; F)$ are defined as the
subgroups of $n \times n$ matrices with determinant
$+1$: $M \in SL(n; F)$ if $\det M = +1$. This definition is
problematic for quaternions, as they do not
commute. To avoid this problem, it is useful to
map quaternions into $2 \times 2$ complex matrices in
the same way complex numbers can be mapped
into $2 \times 2$ real matrices:


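$$a + ib \to \begin{pmatrix} a & b \\ -b & a \end{pmatrix}, \qquad q_0 + \mathcal{I} q_1 + \mathcal{J} q_2 + \mathcal{K} q_3 \to \begin{pmatrix} q_0 + i q_3 & i q_1 + q_2 \\ i q_1 - q_2 & q_0 - i q_3 \end{pmatrix}$$

Here $(1, i)$ are basis vectors for $\mathbb{C}^1$ considered as
a real two-dimensional linear vector space,
$(1, \mathcal{I}, \mathcal{J}, \mathcal{K})$ are basis vectors for $\mathbb{Q}^1$ considered as a
real four-dimensional linear vector space, and $(a, b)$
and $(q_0, q_1, q_2, q_3)$ are all real. The squares of the
imaginary quantities $i$ and $\mathcal{I}, \mathcal{J}, \mathcal{K}$ are all $-1$: $i^2 = -1$;
$\mathcal{I}^2 = \mathcal{J}^2 = \mathcal{K}^2 = -1$ and the imaginary quaternion
basis elements anticommute:
$\{\mathcal{I}, \mathcal{J}\} = \{\mathcal{J}, \mathcal{K}\} = \{\mathcal{K}, \mathcal{I}\} = 0$. The unimodular subgroup
$SL(n; \mathbb{Q})$ of $GL(n; \mathbb{Q})$ is obtained by replacing each
quaternion matrix element by a $2 \times 2$ complex
matrix, setting the determinant of the resulting
$2n \times 2n$ matrix group to $+1$, and then mapping each of the
$n^2$ complex $2 \times 2$ matrices back to quaternions.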
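
The algebraic statements about the quaternion units can be verified directly in a $2 \times 2$ complex representation. The sketch below is an illustration under an assumption: it uses the images $\mathcal{I} \to i\sigma_1$, $\mathcal{J} \to i\sigma_2$, $\mathcal{K} \to i\sigma_3$, one common sign convention for this mapping; the function names are hypothetical. It checks only the properties stated in the text (squares equal to $-1$ and mutual anticommutation), together with the fact that the determinant of the image is the squared norm of the quaternion.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)

def quaternion_matrix(q0, q1, q2, q3):
    """Assumed 2 x 2 complex image of q0 + I q1 + J q2 + K q3."""
    return (
        (complex(q0, q3), complex(q2, q1)),    # q0 + i q3,  i q1 + q2
        (complex(-q2, q1), complex(q0, -q3)),  # i q1 - q2,  q0 - i q3
    )

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

ONE = quaternion_matrix(1, 0, 0, 0)
QI = quaternion_matrix(0, 1, 0, 0)
QJ = quaternion_matrix(0, 0, 1, 0)
QK = quaternion_matrix(0, 0, 0, 1)

MINUS_ONE = scale(-1, ONE)
ZERO = scale(0, ONE)

squares_ok = all(matmul(u, u) == MINUS_ONE for u in (QI, QJ, QK))
anticommute_ok = all(
    add(matmul(u, v), matmul(v, u)) == ZERO
    for u, v in ((QI, QJ), (QJ, QK), (QK, QI))
)
norm_squared = det(quaternion_matrix(1, 2, 3, 4))   # 1 + 4 + 9 + 16
```

Because the determinant of the image equals $q_0^2 + q_1^2 + q_2^2 + q_3^2$, the unimodularity condition on the $2n \times 2n$ complex matrices is well defined even though quaternions do not commute.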

Many other important groups are defined by
imposing linear or quadratic constraints on the $n^2$
matrix elements of $GL(n; F)$ or $SL(n; F)$. The
compact metric-preserving groups $U(n; F)$ leave
invariant lengths (preserve a positive-definite metric
$g = I_n$) in linear vector spaces. The matrices
$M \in U(n; F)$ satisfy $M^\dagger I_n M = I_n$. These conditions define
the orthogonal groups $O(n) = U(n; \mathbb{R})$ and the unitary
groups $U(n) = U(n; \mathbb{C})$. Their noncompact
counterparts $O(p, q)$ and $U(p, q)$ leave invariant
nonsingular indefinite metrics

$$g = I_{p,q} = \begin{pmatrix} +I_p & 0 \\ 0 & -I_q \end{pmatrix}$$

in real and complex $n = (p + q)$-dimensional linear
vector spaces: $M^\dagger I_{p,q} M = I_{p,q}$.

Intersections of matrix Lie groups are also Lie
groups. The special metric-preserving groups are
intersections of the special linear groups $SL(n; F) \subset GL(n; F)$
(with $F = \mathbb{Q}$, $SL(n; \mathbb{Q})$ is defined as
described above) and the metric-preserving subgroups
$U(n; F) \subset GL(n; F)$:

$$\begin{array}{ll}
SL(n; \mathbb{R}) \cap U(n; \mathbb{R}) = SO(n), & n(n-1)/2 \\
SL(n; \mathbb{C}) \cap U(n; \mathbb{C}) = SU(n), & n^2 - 1 \\
SL(n; \mathbb{Q}) \cap U(n; \mathbb{Q}) = Sp(n) \simeq USp(2n), & n(2n+1)
\end{array}$$

The real dimensions of these groups are given in the
right-hand column. Under the replacement of quaternions
by $2 \times 2$ complex matrices, the group of
$n \times n$ metric-preserving and unimodular matrices
$Sp(n)$ over $\mathbb{Q}$ is identified as $USp(2n)$, an isomorphic
group of $2n \times 2n$ matrices over $\mathbb{C}$.

Noncompact forms $SO(p, q)$, $SU(p, q)$, and
$Sp(p, q) \simeq USp(2p, 2q)$ are defined similarly.

The Lie group SU(2) rotates spin states to spin 
states in a complex two-dimensional linear vector 
space. It leaves lengths, inner products, and 
probabilities invariant. If an interaction is spin 
independent, only an invariant (“Casimir invar- 
iant”) constructed from the spin operators can 
appear in the Hamiltonian. The same group can act
in isospin space, rotating proton to neutron states.
The Lie group SU(3) similarly rotates quark states 
or color states into quark states or color states, 
respectively. The Lie group SU(4) rotates spin- 
isospin states into themselves. The conformal group 
SO(4, 2) leaves angles but not lengths in spacetime 
invariant. It is the largest group that leaves the 
source-free Maxwell equations invariant. It is also 
the largest group that transforms all the (bound, 
scattering, and parabolic) hydrogen atom states 
into themselves. 

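Lie groups such as the Poincaré group (inhomogeneous
Lorentz group) and the Galilei group have
the matrix structures

$$\begin{pmatrix} SO(3,1) & \begin{matrix} t_1 \\ t_2 \\ t_3 \\ t_4 \end{matrix} \\ \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ ct \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} SO(3) & \mathbf{v} & \mathbf{t} \\ 0 & 1 & t_4 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ t \\ 1 \end{pmatrix}$$

respectively. In these transformations $\mathbf{t} = (t_1, t_2, t_3)$
describes translations in the space ($x$-, $y$-, and $z$-)
directions, $\mathbf{v} = (v_1, v_2, v_3)$ describes boosts, and $t_4$
resets clocks. The matrices in these defining matrix
representations are reducible.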

The Heisenberg covering group $H_4$ is a four-dimensional
Lie group with a simple $3 \times 3$ matrix
structure:

$$\text{Heisenberg covering group} = H_4 = \begin{pmatrix} 1 & l & d \\ 0 & n & r \\ 0 & 0 & 1 \end{pmatrix}, \qquad n \neq 0$$

This matrix representation of $H_4$ is faithful but
nonunitary.


“Linearization” of a Lie Group 


At the topological level, a Lie group is homoge- 
neous. That is, every point in a manifold that 
parametrizes a Lie group looks like every other 
point. At the algebraic level, this is not true — the 
identity group operation e is singled out as an 
exceptional group element. At the analytic level, the 
group composition law $z = \phi(x, y)$ is nonlinear, and
can therefore be arbitrarily complicated.

The study of Lie groups is enormously simplified
by exploiting these three observations. Specifically, 
it is useful to “linearize” the group multiplication 
law in the neighborhood of the identity. The 
linearization leads to a local Lie group. This is a 
linear vector space on which there is an additional 
structure. Once the local Lie group properties are 
known in the neighborhood of the identity, they are 
known everywhere else in the group, since the group 
is homogeneous. 

A Lie group is linearized in the neighborhood of
the identity by expressing an operator near the
identity in the form $g(\epsilon) = I + \epsilon X$, where the local
Lie group operator $\epsilon X = \delta x^i X_i$, the $X_i$ are $n$
linearly independent vector fields on the manifold
$M^n$, and the small coordinates $\delta x^i$ measure the
distance (in some rough sense) of $g(\epsilon)$ from the
point that parametrizes the identity group operation
$e = g(0)$. For another group operation
$g(\delta Y) = I + \delta Y$ in the neighborhood of the identity,
the following holds.


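1. The product $g(\epsilon X) g(\delta Y) = (I + \epsilon X)(I + \delta Y) = I + (\epsilon X + \delta Y) + \text{h.o.t.}$
is in the local Lie group.

2. The commutator $g_i \circ g_j \circ g_i^{-1} \circ g_j^{-1}$ in the group
leads to

$$g(\epsilon X) g(\delta Y) g(\epsilon X)^{-1} g(\delta Y)^{-1} = I + \epsilon\delta (XY - YX) + \text{h.o.t.} = I + \epsilon\delta [X, Y] + \text{h.o.t.}$$

in the local Lie group.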


The first condition shows that the local Lie 
group is a linear vector space. The n vector fields 
X; can be chosen as a set of basis vectors in this 
space. 

The second condition shows that the commutator
of two vectors in this linear vector space is also in
this linear vector space. The commutator endows
this linear vector space with an additional combinatorial
operation ("vector multiplication") and provides
it with the structure of an algebra, called a Lie
algebra.
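
The second condition can be checked numerically for a matrix group. The sketch below is illustrative and makes assumptions that go beyond the text: it uses two antisymmetric $3 \times 3$ matrices $X$, $Y$ (generators of rotations), represents the group elements by a truncated exponential series rather than by $I + \epsilon X$, and the helper names are hypothetical. It confirms that the group commutator differs from $I + \epsilon\delta[X, Y]$ only by higher-order terms.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign_b=1.0):
    return [[x + sign_b * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=20):
    """Truncated power series for exp(a); adequate for small matrices a."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

# Two rotation generators (assumed example)
X = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]
Y = [[0, 0, -1], [0, 0, 0], [1, 0, 0]]
eps = delta = 1e-3

gX, gY = expm(scale(eps, X)), expm(scale(delta, Y))
gX_inv, gY_inv = expm(scale(-eps, X)), expm(scale(-delta, Y))

group_commutator = matmul(matmul(gX, gY), matmul(gX_inv, gY_inv))
bracket = add(matmul(X, Y), matmul(Y, X), -1.0)          # [X, Y] = XY - YX
first_order = add(identity(3), scale(eps * delta, bracket))

error = max(abs(p - q)
            for rp, rq in zip(group_commutator, first_order)
            for p, q in zip(rp, rq))
```

With $\epsilon = \delta = 10^{-3}$ the leading term is of size $10^{-6}$, while the discrepancy `error` is of third order in the small parameters.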


Definition A Lie algebra $\mathfrak{a}$ consists of a set of
operators $X, Y, Z, \ldots$, together with the operations
of vector addition, scalar multiplication, and commutation
$[X, Y]$ that satisfy the following three
axioms:


(i) Closure (linear vector space). If $X, Y \in \mathfrak{a}$, then
$\alpha X + \beta Y \in \mathfrak{a}$ and $[X, Y] \in \mathfrak{a}$.
(ii) Antisymmetry. $[X, Y] = -[Y, X]$.
(iii) Jacobi identity. $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$.


The structure of a Lie algebra, or local Lie group,
is summarized by the structure constants, defined in
terms of the basis vectors $X_i$ by

$$[X_i, X_j] = c_{ij}^{\ \ k} X_k \qquad \text{(summation convention)}$$

The structure constants $c_{ij}^{\ \ k}$ are components of a
third-order tensor, covariant and antisymmetric
in two indices ($c_{ij}^{\ \ k} = -c_{ji}^{\ \ k}$) and contravariant in
the third. These components obey the Jacobi
identity, which places a quadratic constraint on
them:

$$c_{ij}^{\ \ s} c_{sk}^{\ \ t} + c_{jk}^{\ \ s} c_{si}^{\ \ t} + c_{ki}^{\ \ s} c_{sj}^{\ \ t} = 0$$

Linearization of a Lie group generates a Lie
algebra. A Lie group can be recovered by the
inverse process. This is the exponential operation.
A group operation a finite distance from the origin
(the point identified with the identity group operation)
of the manifold that parametrizes the Lie
group can be obtained from the limiting procedure
($\epsilon = 1/K \to 0$):

$$g(X) = \lim_{K \to \infty} \left( I + \frac{1}{K} X \right)^K = e^X = \mathrm{EXP}(X)$$

The exponential operation is well defined for real
numbers, complex numbers, quaternions, $n \times n$
matrices over these fields, and vector fields.
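
The limiting procedure can be tried on the simplest nontrivial case. The sketch below is an illustration under assumptions not made in the text: $X = \theta \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is chosen as the algebra element, $K = 2^{20}$ is taken as a large but finite value, and the power is computed by repeated squaring. The result is compared with the closed form $\cos\theta\, I + \sin\theta \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.

```python
import math

def matmul2(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def exp_by_limit(x, doublings=20):
    """Approximate exp(x) by (I + x/K)^K with K = 2**doublings."""
    K = 2 ** doublings
    g = [[1.0 + x[0][0] / K, x[0][1] / K],
         [x[1][0] / K, 1.0 + x[1][1] / K]]
    for _ in range(doublings):
        g = matmul2(g, g)      # squaring `doublings` times raises g to the power K
    return g

theta = 0.7
X = [[0.0, theta], [-theta, 0.0]]
g = exp_by_limit(X)
expected = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]
max_error = max(abs(g[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
```

For finite $K$ the discrepancy is of order $1/K$, so increasing `doublings` brings the product closer to the rotation matrix.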

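A 1:1 correspondence between Lie groups and Lie
algebras does not exist. Isomorphic Lie groups have
isomorphic Lie algebras. But nonisomorphic Lie
groups may also possess isomorphic Lie algebras.
The best known examples of nonisomorphic Lie
groups and their isomorphic Lie algebras are

$$\begin{array}{ll}
SO(3) \not\simeq SU(2), & \mathfrak{so}(3) = \mathfrak{su}(2) \\
SO(4) \not\simeq SU(2) \times SU(2), & \mathfrak{so}(4) = \mathfrak{su}(2) \oplus \mathfrak{su}(2) \\
SO(5) \not\simeq Sp(2) \simeq USp(4), & \mathfrak{so}(5) = \mathfrak{sp}(2) = \mathfrak{usp}(4)
\end{array}$$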

There is a 1:1 correspondence between Lie algebras
and "locally" isomorphic Lie groups. This has been
extended to global Lie groups by a beautiful
theorem due to É. Cartan.


Theorem (Cartan) There is a 1:1 correspondence
between Lie algebras and simply connected Lie
groups. Every Lie group with the same Lie algebra
is either the simply connected ("universal covering")
group or is the quotient of this universal covering
group by one of its discrete invariant subgroups.


This relation is summarized in Figure 1. 

As a concrete example, the Lie algebra of
SO(3), which is the group of real $3 \times 3$ matrices
satisfying $M^t I_3 M = I_3$ and $\det(M) = +1$, is
spanned by the three "angular momentum vector
fields" $L_i(x) = \epsilon_{ijk} x^j \partial_k$ or the three angular
momentum matrices

$$L_1 = L_{23} = -L_{32} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & +1 \\ 0 & -1 & 0 \end{pmatrix}, \quad
L_2 = L_{31} = -L_{13} = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ +1 & 0 & 0 \end{pmatrix}, \quad
L_3 = L_{12} = -L_{21} = \begin{pmatrix} 0 & +1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

[Figure 1: diagram relating the simply connected Lie group SG, the multiply connected Lie groups SG/D_i, and their common Lie algebra.]

Figure 1 Cartan's theorem states that there is a 1:1 correspondence between Lie algebras and simply connected Lie groups. All
other Lie groups with this Lie algebra are quotients of the covering group by one of its discrete invariant subgroups $D_i \subset D_{\max}$. There is
a relation between the discrete invariant subgroup $D_i$ and the homotopy group of SG/$D_i$. Reproduced with permission from Gilmore R
(1974) Lie Groups, Lie Algebras, and Some of Their Applications. New York: Wiley.



The Lie group SU(2) is the group of complex $2 \times 2$
matrices satisfying $M^\dagger I_2 M = I_2$ and $\det(M) = +1$. Its
Lie algebra is spanned by the three spin matrices
$S_j = (i/2)\sigma_j$, which are multiples of the Pauli spin
matrices $\sigma_j$:

$$S_1 = \frac{i}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
S_2 = \frac{i}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
S_3 = \frac{i}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

The two Lie algebras are isomorphic as they share
isomorphic commutation relations $[J_1, J_2] = -J_3$ (and
cyclic), $J_i = L_i$ or $J_i = S_i$. The group SU(2) is simply
connected. Its maximal discrete invariant subgroup $D$
consists of all multiples of the identity, $\alpha I_2$, so that
$\alpha = \pm 1$. According to Cartan's theorem,
$SO(3) = SU(2)/D_2$, $D_2 = \{I_2, -I_2\}$. The group SO(3) is doubly
connected, with a two-element homotopy group.
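
The quotient $SU(2)/\{I_2, -I_2\}$ can be made concrete by mapping a unitary matrix $U$ to a real $3 \times 3$ matrix. The sketch below is an illustration based on an assumption not stated in the text: it uses the standard formula $R_{ij} = \tfrac{1}{2}\operatorname{tr}(\sigma_i U \sigma_j U^\dagger)$, and the helper names are hypothetical. It shows that $U$ and $-U$ give the same rotation, and that a parameter value of $2\pi$ returns to the identity in SO(3) but to $-I_2$ in SU(2).

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def su2_element(j, theta):
    """exp(theta * S_j) with S_j = (i/2) sigma_j: cos(theta/2) I + i sin(theta/2) sigma_j."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[c * (1 if a == b else 0) + 1j * s * SIGMA[j][a][b] for b in range(2)]
            for a in range(2)]

def rotation_from_su2(u):
    """Assumed map SU(2) -> SO(3): R_ij = (1/2) tr(sigma_i u sigma_j u^dagger)."""
    ud = dagger(u)
    rot = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(matmul(SIGMA[i], u), matmul(SIGMA[j], ud))
            row.append(0.5 * (m[0][0] + m[1][1]).real)
        rot.append(row)
    return rot

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

u = su2_element(2, 0.9)
minus_u = [[-x for x in row] for row in u]
same_rotation = max_diff(rotation_from_su2(u), rotation_from_su2(minus_u))

full_turn = su2_element(2, 2.0 * math.pi)          # equals -I_2 in SU(2)
identity3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
full_turn_rotation_error = max_diff(rotation_from_su2(full_turn), identity3)
```

The two-to-one character of the map is the numerical counterpart of the two-element homotopy group of SO(3).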


Matrix Lie Algebras 


A deep theorem of Ado guarantees that every Lie 
algebra is equivalent to a matrix Lie algebra, even 
though the same is not true of Lie groups. 

Sets of n x n matrices that close under vector addition, scalar multiplication, and commutation ($M_1 \in \mathfrak{a}$, $M_2 \in \mathfrak{a} \Rightarrow [M_1, M_2] = M_1 M_2 - M_2 M_1 \in \mathfrak{a}$) form matrix Lie algebras. The antisymmetry properties and Jacobi identity are guaranteed by matrix multiplication.

Lie algebras for the general linear groups GL(n; F) consist of n x n matrices over F. Lie algebras for the special linear groups SL(n; F) consist of traceless n x n matrices. The Lie algebras of the unitary groups consist of anti-Hermitian matrices. The Lie algebras of U(p, q; F) consist of matrices that obey

\[ M^\dagger I_{p,q} + I_{p,q} M = 0, \qquad M \in \mathfrak{u}(p, q; F) \]
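The matrix Lie algebras of other matrix Lie groups are obtained by constructing the most general Lie group operation in the neighborhood of the identity by linearization. For example, the Lie algebra of the Heisenberg covering group $H_4$ is

\[ \begin{pmatrix} 1 & l & d \\ 0 & n & r \\ 0 & 0 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & \delta l & \delta d \\ 0 & 1 + \delta n & \delta r \\ 0 & 0 & 1 \end{pmatrix} = I_3 + \delta n\, N + \delta r\, R + \delta l\, L + \delta d\, D \]

\[ N = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \sim a^\dagger a, \quad R = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \sim a^\dagger, \quad L = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \sim a, \quad D = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \sim I \]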


The four 3 x 3 matrices N, R, L, D that span the Lie algebra $\mathfrak{h}_4$ of $H_4$ satisfy commutation relations isomorphic with the commutation relations satisfied by the photon operators ($a^\dagger a$, $a^\dagger$, $a$, $I = [a, a^\dagger]$). The 3 x 3 matrix representations of the group $H_4$ and the algebra $\mathfrak{h}_4$ are faithful. The representation of $H_4$ is nonunitary and that of $\mathfrak{h}_4$ is non-Hermitian.
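The isomorphism with the photon operators can be spot-checked numerically. The sketch below assumes the identification N ~ $a^\dagger a$, R ~ $a^\dagger$, L ~ $a$, D ~ I, with each generator taken as a single 3 x 3 matrix unit (N at position (2,2), R at (2,3), L at (1,2), D at (1,3)); the helper names are illustrative only.

```python
# Verify that N, R, L, D reproduce the photon-operator commutators:
# [a†a, a†] = a†, [a†a, a] = -a, [a, a†] = I, and I commutes with everything.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def negate(a):
    return [[-x for x in row] for row in a]

N = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
R = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
L = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
D = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
ZERO = [[0, 0, 0] for _ in range(3)]

checks = {
    "[N, R] = +R": commutator(N, R) == R,
    "[N, L] = -L": commutator(N, L) == negate(L),
    "[L, R] = D": commutator(L, R) == D,
    "D commutes with N, R, L": all(commutator(D, x) == ZERO for x in (N, R, L)),
}
for name, ok in checks.items():
    print(name, ok)
```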
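
There is a simple way to relate a large class of operator Lie algebras to matrix Lie algebras. If A, B, C, ... belong to a Lie algebra of n x n matrices with [A, B] = C, the matrix-to-operator mapping

\[ A \to \mathcal{A} = x^i A_i{}^j \partial_j \]

preserves commutation relations, for

\[ [\mathcal{A}, \mathcal{B}] = [x^i A_i{}^j \partial_j, x^r B_r{}^s \partial_s] = x^i A_i{}^j [\partial_j, x^r] B_r{}^s \partial_s - x^r B_r{}^s [\partial_s, x^i] A_i{}^j \partial_j = x^i A_i{}^j B_j{}^s \partial_s - x^r B_r{}^s A_s{}^j \partial_j = x^i [A, B]_i{}^j \partial_j = \mathcal{C} \]

This relation depends on the bilinear products $x^i \partial_j$ satisfying commutation relations

\[ [x^i \partial_j, x^r \partial_s] = x^i \partial_s \delta_j^r - x^r \partial_j \delta_s^i \]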


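These commutation relations are satisfied by products of creation and annihilation operators $a_i^\dagger a_j$ for either bosons ($b_i^\dagger b_j$) or fermions ($f_i^\dagger f_j$). These matrix-to-operator mappings can be extended to include bilinear products such as $x^i x^j$, $x^i \partial_j$, $\partial_i \partial_j$ and their boson and fermion counterparts $a_i^\dagger a_j^\dagger$, $a_i^\dagger a_j$, $a_i a_j$. For example, the vector fields associated with the operator $J_1$ for SO(3) and SU(2) are $x^i (L_1)_i{}^j \partial_j = x^2 \partial_3 - x^3 \partial_2$ and $u^i (S_1)_i{}^j \partial_j = (i/2)(u^1 \partial_2 + u^2 \partial_1)$.

Boson and fermion bilinear products $a_i^\dagger a_j$ ($1 \le i, j \le n$) span a Lie algebra isomorphic to $\mathfrak{u}(n)$. Boson bilinear products $b_i^\dagger b_j$, $b_i^\dagger b_j^\dagger$, $b_i b_j$ span a Lie algebra isomorphic to $\mathfrak{sp}(2n; R)$, while fermion bilinear products $f_i^\dagger f_j$, $f_i^\dagger f_j^\dagger$, $f_i f_j$ span a Lie algebra isomorphic to $\mathfrak{so}(2n)$.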


Structure of Lie Algebras 


The study of Lie algebras is greatly facilitated by 
studying their structure. The structure is determined 
by the commutation properties of the Lie algebra. 


Invariant Subalgebra 


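If a Lie algebra has an invariant subalgebra, then the commutator of anything in the algebra with anything in the subalgebra is in the subalgebra. Suppose $\mathfrak{a}$ is a linear vector subspace of $\mathfrak{g}$. If $[\mathfrak{g}, \mathfrak{a}] \subseteq \mathfrak{a}$, then $\mathfrak{a}$ is an invariant subspace of $\mathfrak{g}$. In particular, $[\mathfrak{a}, \mathfrak{a}] \subseteq \mathfrak{a}$ and $\mathfrak{a}$ is therefore also a subalgebra of $\mathfrak{g}$: it is an invariant subalgebra in $\mathfrak{g}$.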


Example The Lie algebra $\mathfrak{iso}(3)$ consists of the three rotation operators $L_{ij} = x^i \partial_j - x^j \partial_i$ and the three displacement operators $P_j = \partial_j$. The subset of displacement operators is an invariant subspace in $\mathfrak{iso}(3)$, since it is mapped into itself by all commutators. It is also a subalgebra in $\mathfrak{iso}(3)$. This particular invariant subalgebra is commutative.


Solvable Algebra 


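If $\mathfrak{g}$ is a Lie algebra, the linear vector space obtained by taking all possible commutators of the operators in $\mathfrak{g}$ is called the "derived" algebra: $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}^{(1)} \subseteq \mathfrak{g}$. If $\mathfrak{g}^{(1)} = \mathfrak{g}$, there is no point in continuing this process. If $\mathfrak{g}^{(1)} \subset \mathfrak{g}$, it is useful to define $\mathfrak{g} = \mathfrak{g}^{(0)}$ and to continue this process by defining $\mathfrak{g}^{(2)}$ as the derived algebra of $\mathfrak{g}^{(1)}$: $\mathfrak{g}^{(2)} = [\mathfrak{g}^{(1)}, \mathfrak{g}^{(1)}]$. We can continue in this way, defining $\mathfrak{g}^{(n+1)}$ as the algebra derived from $\mathfrak{g}^{(n)}$. Ultimately (for finite-dimensional Lie algebras), either $\mathfrak{g}^{(n+1)} = 0$ or $\mathfrak{g}^{(n+1)} = \mathfrak{g}^{(n)}$ for some n. If the former case occurs,

\[ \mathfrak{g} = \mathfrak{g}^{(0)} \supset \mathfrak{g}^{(1)} \supset \mathfrak{g}^{(2)} \supset \cdots \supset \mathfrak{g}^{(n)} \supset \mathfrak{g}^{(n+1)} = 0 \]

the Lie algebra $\mathfrak{g}^{(0)}$ is called solvable. Each algebra $\mathfrak{g}^{(i)}$ is an invariant subalgebra of $\mathfrak{g}^{(j)}$, $i > j$.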


Example The Lie algebra spanned by the boson number, creation, annihilation, and identity operators is solvable. The series of derived algebras has dimensions 4, 3, 1, 0:

\[ \{a^\dagger a, a^\dagger, a, I\} \supset \{a^\dagger, a, I\} \supset \{I\} \supset 0 \]
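The derived series can be computed mechanically once a faithful matrix realization is available. The sketch below is one possible way to do it, assuming the 3 x 3 realization of the number, creation, annihilation and identity operators by single matrix units (positions (2,2), (2,3), (1,2), (1,3)); the function names `reduce_to_basis` and `derived_series_dimensions` are hypothetical helpers, not a standard API.

```python
from fractions import Fraction
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def reduce_to_basis(mats):
    """Return linearly independent matrices with the same span (Gaussian elimination)."""
    size = len(mats[0])
    basis = []
    for m in mats:
        row = [Fraction(x) for r in m for x in r]          # flatten the matrix
        for b in basis:
            pivot = next(i for i, x in enumerate(b) if x != 0)
            if row[pivot] != 0:
                factor = row[pivot] / b[pivot]
                row = [x - factor * y for x, y in zip(row, b)]
        if any(x != 0 for x in row):
            basis.append(row)
    return [[row[i * size:(i + 1) * size] for i in range(size)] for row in basis]

def derived_series_dimensions(generators):
    """Dimensions of g(0), g(1), ...; stops at 0 or when the series stabilises."""
    basis = reduce_to_basis(generators)
    dims = [len(basis)]
    while basis:
        derived = reduce_to_basis([commutator(a, b) for a, b in product(basis, repeat=2)])
        if len(derived) == len(basis):     # g(n+1) = g(n): not solvable
            break
        basis = derived
        dims.append(len(basis))
    return dims

N = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
R = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
L = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
D = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
so3 = [[[0, 0, 0], [0, 0, 1], [0, -1, 0]],
       [[0, 0, -1], [0, 0, 0], [1, 0, 0]],
       [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]]

print(derived_series_dimensions([N, R, L, D]))   # solvable case
print(derived_series_dimensions(so3))            # series stabilises immediately
```

For contrast, the rotation algebra reproduces itself under commutation, so its series never reaches zero.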


Semidirect Sum Algebra 


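When a Lie algebra $\mathfrak{g}$ has an invariant subalgebra $\mathfrak{a}$, the linear vector space of the Lie algebra $\mathfrak{g}$ can be written as the direct sum of the linear vector subspace of the subalgebra $\mathfrak{a}$ plus a complementary subspace $\mathfrak{b}$. The subspace $\mathfrak{b}$ is generally not by itself a Lie algebra. The Lie algebra $\mathfrak{g}$ is written as a semidirect sum of the two subspaces. The semidirect sum structure satisfies the commutation relations shown:

\[ \mathfrak{g} = \mathfrak{b} \wedge \mathfrak{a}: \qquad [\mathfrak{b}, \mathfrak{b}] \subseteq \mathfrak{b} \wedge \mathfrak{a}, \qquad [\mathfrak{b}, \mathfrak{a}] \subseteq \mathfrak{a}, \qquad [\mathfrak{a}, \mathfrak{a}] \subseteq \mathfrak{a} \]

The subspace $\mathfrak{b}$ can be given the structure of an algebra modulo the component of the commutator in $\mathfrak{a}$: $\mathfrak{b} = \mathfrak{g} \bmod \mathfrak{a}$.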


Example The three-dimensional Lie algebra spanned by the photon operators $a^\dagger$, $a$, $I$ has a semidirect sum decomposition where $\mathfrak{b}$ is spanned by $a^\dagger$, $a$ and $\mathfrak{a}$ is spanned by $I$. The subspace $\mathfrak{b}$ is not closed under commutation, and $\mathfrak{a}$ is commutative. The Lie algebra $\mathfrak{iso}(3)$ also has the structure of a semidirect sum, with $\mathfrak{b} = \mathfrak{so}(3)$ and the invariant subalgebra $\mathfrak{a}$ spanned by the three displacement operators $P_j$.


Nonsemisimple Algebra 


A Lie algebra is nonsemisimple if it has a solvable 
invariant subalgebra. 


Example The Lie algebra spanned by bilinear products of photon creation and annihilation operators $a_i^\dagger a_j$, creation operators $a_i^\dagger$, annihilation operators $a_j$, and the identity operator $I$ ($1 \le i, j \le n$) is nonsemisimple. The solvable invariant subalgebra is spanned by the 2n + 2 operators consisting of the single photon operators $a_i^\dagger$, $a_j$, the identity operator $I$, and the total number operator $\hat{n} = \sum_{i=1}^{n} a_i^\dagger a_i$.


Semisimple Algebra 


A Lie algebra is semisimple if it has no solvable 
invariant subalgebras. 


Example The Lie algebra $\mathfrak{so}(4)$ is semisimple. This Lie algebra has two invariant subalgebras, both isomorphic to $\mathfrak{so}(3)$. The direct sum decomposition


\[ \mathfrak{so}(4) = \mathfrak{so}(3) + \mathfrak{so}(3) \]


is well known to physical chemists and is respon- 
sible for the dualities that exist between rotating and 
laboratory frame descriptions of molecular systems. 


Simple Algebra 


A Lie algebra is simple if it has no invariant 
subalgebras at all. The prettiest page in the theory 
of Lie groups is the classification theory of the 
simple Lie algebras. We turn to this subject now. 


Lie Algebra Tools 


Two powerful tools have been developed for study- 
ing the structure of a Lie algebra. These are the 
regular representation and the Cartan-Killing form. 


Regular Representation 


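This representation assigns the structure constants to a set of n matrices of size n x n according to

\[ X_\alpha \to R(X_\alpha)_\beta{}^\gamma = c_{\alpha\beta}{}^\gamma, \qquad [X_\alpha, X_\beta] = c_{\alpha\beta}{}^\gamma X_\gamma \]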


The matrices of the regular representation contain 
exactly as much information as the components of 
the structure tensor. They can be studied by 
standard linear algebra methods. For example, a 
secular equation can be used to put the commuta- 
tion relations into canonical form. 

The structure of the matrices of the regular 
representation determines the structure of the Lie 
algebra. The identification is carried out according to 
the usual rules of representation theory, as shown in 
Figure 2. If a basis $X_\alpha$ can be found in which all the
matrices of the regular representation are simulta- 
neously reducible, the algebra possesses an invariant 
subalgebra. If the representation is not fully reduci- 
ble, the invariant subalgebra is solvable. If the regular 
representation is fully reducible, the algebra consists 
of the direct sum of two (or more) smaller, mutually 
commuting subalgebras. If the regular representation 
is irreducible, the algebra is simple. 

If a Lie algebra is solvable ($\mathfrak{solv}$), all matrices in the regular representation can be transformed to upper triangular matrices. If the Lie algebra is nilpotent ($\mathfrak{nil} \subset \mathfrak{solv}$), the diagonal matrix elements in the upper triangular matrices are zero. The converses are also true.


Cartan-Killing Form 


The Cartan-Killing form is a second-order symmetric tensor that is constructed from the third-order tensor $c_{\alpha\beta}{}^\gamma$, antisymmetric in its lower indices, by cross-contraction

\[ g_{\alpha\beta} = c_{\alpha\mu}{}^\nu c_{\beta\nu}{}^\mu = g_{\beta\alpha} = \mathrm{tr}\, R(X_\alpha) R(X_\beta) = (X_\alpha, X_\beta) = (X_\beta, X_\alpha) \]

The metric $g_{\alpha\beta}$ can be used to place an inner product $(X_\alpha, X_\beta)$ on this linear vector space. This inner product is not necessarily positive definite.
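As an illustration, the regular representation and the Cartan-Killing metric can be built from the structure constants alone. The sketch below assumes the so(3) structure constants $c_{ij}{}^k = -\epsilon_{ijk}$, which follow from $[J_1, J_2] = -J_3$ (and cyclic); `levi_civita`, `killing_from_structure_constants` and `killing_from_traces` are illustrative helper names.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

DIM = 3
# Structure constants c[a][b][g] with [X_a, X_b] = sum_g c[a][b][g] X_g
c = [[[-levi_civita(a, b, g) for g in range(DIM)] for b in range(DIM)] for a in range(DIM)]

# Regular representation: R(X_a) is the matrix with entries R(X_a)[b][g] = c[a][b][g]
R = [[[c[a][b][g] for g in range(DIM)] for b in range(DIM)] for a in range(DIM)]

def killing_from_structure_constants(c):
    """Cross-contraction g_ab = c_am^v c_bv^m."""
    n = len(c)
    return [[sum(c[a][m][v] * c[b][v][m] for m in range(n) for v in range(n))
             for b in range(n)] for a in range(n)]

def killing_from_traces(R):
    """Same metric computed as tr R(X_a) R(X_b)."""
    n = len(R)
    def trace_of_product(p, q):
        return sum(p[i][k] * q[k][i] for i in range(n) for k in range(n))
    return [[trace_of_product(R[a], R[b]) for b in range(n)] for a in range(n)]

g = killing_from_structure_constants(c)
print(g)
```

For this compact algebra the metric comes out negative definite, which illustrates the remark that the inner product need not be positive definite.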




Figure 2 When the regular matrix representation of a Lie 
algebra is reducible, fully reducible, or irreducible, the Lie 
algebra is nonsemisimple, semisimple, or simple. 
The matrix $g_{\alpha\beta}$ can also be treated by standard linear algebra methods. Since it is real and symmetric, it can be diagonalized. If there are $n_-$ negative eigenvalues, $n_+$ positive eigenvalues, and $n_0$ vanishing eigenvalues ($n = n_- + n_+ + n_0$), the Lie algebra has a corresponding linear vector space decomposition of the form

\[ \mathfrak{g} = \mathfrak{g}_- + \mathfrak{g}_+ + \mathfrak{g}_0 \]

The inner product is positive definite on the subspace $\mathfrak{g}_+$ and negative definite on $\mathfrak{g}_-$. We call $\mathfrak{g}_0$ the singular subspace. The subspace $\mathfrak{g}_0$ is closed under commutation and in fact is a nilpotent invariant subalgebra of $\mathfrak{g}$.


Decomposition of Lie Algebras 


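The most general Lie algebra $\mathfrak{g}$ is the semidirect sum of a semisimple Lie algebra $\mathfrak{ss}$ and a solvable invariant subalgebra $\mathfrak{solv}$:

\[ \mathfrak{g} = \mathfrak{ss} \wedge \mathfrak{solv}, \qquad [\mathfrak{ss}, \mathfrak{solv}] \subseteq \mathfrak{solv}, \qquad [\mathfrak{solv}, \mathfrak{solv}] \subseteq \mathfrak{solv} \]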


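The decomposition of $\mathfrak{g}$ into its component parts is accomplished by a simple two-step algorithm.


1. Compute the Cartan-Killing metric for $\mathfrak{g}$ and determine the singular subspace. If there is none, stop. If the dimension of $\mathfrak{g}_0$ is greater than 0, $\mathfrak{nil} = \mathfrak{g}_0$ is the maximal nilpotent invariant subalgebra of $\mathfrak{g}$.

2. Compute the structure constants of the Lie algebra $\mathfrak{g}' = \mathfrak{g} - \mathfrak{nil} = \mathfrak{g} \bmod \mathfrak{nil} = \mathfrak{g}/\mathfrak{nil}$, the Cartan-Killing metric tensor on $\mathfrak{g}'$, and the decomposition $\mathfrak{g}' = \mathfrak{g}'_- + \mathfrak{g}'_+ + \mathfrak{g}'_0$. Then $\mathfrak{a} = \mathfrak{g}'_0$ is abelian and invariant in $\mathfrak{g}'$. In fact, $\mathfrak{a}$ is the largest abelian invariant subalgebra in $\mathfrak{g}'$.


The algorithm stops here, for the algebra $\mathfrak{g}'' = \mathfrak{g}' \bmod \mathfrak{a} = \mathfrak{g}'/\mathfrak{a} = \mathfrak{g}'_- + \mathfrak{g}'_+$ has no singular subspace under its Cartan-Killing metric.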
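Step 1 of the algorithm can be tried on a small solvable algebra. The sketch below assumes the four-dimensional algebra spanned by 3 x 3 matrix units N, R, L, D (at positions (2,2), (2,3), (1,2), (1,3)), standing in for the number, creation, annihilation and identity operators. In this basis the metric happens to come out diagonal, so the singular subspace can be read off from the vanishing rows; a general algebra would require diagonalizing the metric first. All names are illustrative.

```python
NAMES = ["N", "R", "L", "D"]
POSITIONS = [(1, 1), (1, 2), (0, 1), (0, 2)]   # zero-based location of the single 1

def unit(pos):
    return [[1 if (i, j) == pos else 0 for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

BASIS = [unit(p) for p in POSITIONS]

def expand(m):
    """Coefficients of m in the basis N, R, L, D (m must lie in their span)."""
    assert all(m[i][j] == 0 for i in range(3) for j in range(3) if (i, j) not in POSITIONS)
    return [m[i][j] for (i, j) in POSITIONS]

# Structure constants c[a][b][g] with [X_a, X_b] = sum_g c[a][b][g] X_g
c = [[expand(commutator(xa, xb)) for xb in BASIS] for xa in BASIS]

def killing_form(c):
    n = len(c)
    return [[sum(c[a][m][v] * c[b][v][m] for m in range(n) for v in range(n))
             for b in range(n)] for a in range(n)]

g = killing_form(c)
# The metric is diagonal here, so basis vectors with a vanishing row span g_0.
singular = [name for name, row in zip(NAMES, g) if not any(row)]
print(g, singular)
```

The only direction with nonvanishing norm is N, so the singular subspace is three-dimensional and is spanned by R, L and D.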

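Under this algorithm, the decomposition of $\mathfrak{g}$ into its semisimple part and its maximal solvable invariant subalgebra is

\[ \mathfrak{g} = (\mathfrak{g}'_- + \mathfrak{g}'_+) \wedge (\mathfrak{g}'_0 \wedge \mathfrak{g}_0) \]

The maximal solvable invariant subalgebra $\mathfrak{solv}$ in $\mathfrak{g}$ is the semidirect sum of $\mathfrak{a}$ and $\mathfrak{nil}$: $\mathfrak{solv} = \mathfrak{g}'_0 \wedge \mathfrak{g}_0 = \mathfrak{a} \wedge \mathfrak{nil}$. In addition, $\mathfrak{ss} = \mathfrak{g} \bmod \mathfrak{solv} = \mathfrak{g}/\mathfrak{solv} = \mathfrak{g}'_- + \mathfrak{g}'_+$. The subspace $\mathfrak{g}'_-$ is closed under commutation and exponentiates into a compact subgroup of G'. The subspace $\mathfrak{g}'_+$ exponentiates to a noncompact coset in G' that is simply connected.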

Every element in a semisimple Lie algebra can be 
expressed as the commutator of two elements in the 
Lie algebra. In this sense, a semisimple algebra 
reproduces itself under commutation. 

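To illustrate this algorithm, we tear apart the eight-dimensional Lie algebra spanned by the photon operators $a_i^\dagger a_j$, $1 \le i, j \le 2$, and $a_3^\dagger a_3$, $a_3^\dagger$, $a_3$, $I$, where the photon operators obey $[a_i, a_j^\dagger] = \delta_{ij}$. The regular representative of the general linear combination $X = \sum_{i,j} m_{ij} a_i^\dagger a_j + n\, a_3^\dagger a_3 + r\, a_3^\dagger + l\, a_3 + \delta I$, with rows and columns ordered as $a_1^\dagger a_1$, $a_2^\dagger a_2$, $a_1^\dagger a_2$, $a_2^\dagger a_1$, $a_3^\dagger a_3$, $a_3^\dagger$, $a_3$, $I$, is

\[ R(X) = \begin{pmatrix}
0 & 0 & -m_{12} & m_{21} & & & & \\
0 & 0 & m_{12} & -m_{21} & & & & \\
-m_{21} & m_{21} & m_{11} - m_{22} & 0 & & & & \\
m_{12} & -m_{12} & 0 & -m_{11} + m_{22} & & & & \\
 & & & & 0 & -r & l & 0 \\
 & & & & 0 & n & 0 & l \\
 & & & & 0 & 0 & -n & -r \\
 & & & & 0 & 0 & 0 & 0
\end{pmatrix} \]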


The Cartan-Killing inner product is the trace of the square of this matrix:

\[ (X, X) = \mathrm{tr}\, R(X)^2 = 2(m_{11} - m_{22})^2 + 8 m_{12} m_{21} + 2n^2 \]


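The subspace $\mathfrak{g}_0$ is spanned by $a_1^\dagger a_1 + a_2^\dagger a_2$, $a_3^\dagger$, $a_3$, $I$, leaving the four operators $a_1^\dagger a_1 - a_2^\dagger a_2$, $a_3^\dagger a_3$, $a_1^\dagger a_2$, $a_2^\dagger a_1$ to span $\mathfrak{g}'$. A simple calculation shows that $\mathfrak{g}'_0$ is spanned by $a_3^\dagger a_3$. As a result:

Subspace            Spanned by
$\mathfrak{g}'_+$   $a_1^\dagger a_1 - a_2^\dagger a_2$, $(a_1^\dagger a_2 + a_2^\dagger a_1)$
$\mathfrak{g}'_-$   $(a_1^\dagger a_2 - a_2^\dagger a_1)$
$\mathfrak{g}'_0$   $a_3^\dagger a_3$
$\mathfrak{g}_0$    $a_1^\dagger a_1 + a_2^\dagger a_2$, $a_3^\dagger$, $a_3$, $I$

The Lie algebra is the direct sum $\mathfrak{g} = \mathfrak{sl}(2; R) + \mathfrak{u}(1) + \mathfrak{h}_4$.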


Structure of Semisimple Lie Algebras 


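The Cartan-Killing metric $g_{\alpha\beta}$ is nonsingular on a semisimple Lie algebra. The metric and its inverse $g^{\alpha\beta}$ can be used to raise and lower indices. In particular, the tensor whose components are $c_{\alpha\beta\gamma} = c_{\alpha\beta}{}^\mu g_{\mu\gamma}$ is third-order antisymmetric: $c_{\alpha\beta\gamma} = c_{\beta\gamma\alpha} = c_{\gamma\alpha\beta} = -c_{\beta\alpha\gamma} = -c_{\alpha\gamma\beta} = -c_{\gamma\beta\alpha}$. Classification of semisimple Lie algebras is equivalent to classifying such tensors.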

Another useful way to describe semisimple Lie 
algebras is to search for a canonical structure for the 
commutation relations. A useful canonical form is 
an eigenvalue form 


\[ [X, Y] = \lambda Y \]


In a basis $X_i$ with $X = x^i X_i$ and $Y = y^j X_j$, this equation reduces to a standard eigenvalue equation for the regular representation

\[ \sum_j \sum_k y^j \left( R(X)_j{}^k - \lambda \delta_j{}^k \right) X_k = 0 \]

Thus, the search for a standard form for the commutation relations reduces to a study of the secular equation

\[ \det(R(X) - \lambda I_n) = \sum_{j=0}^{n} (-\lambda)^{n-j} \phi_j(X) = 0 \qquad [1] \]

The coefficients $\phi_j(X)$ are homogeneous polynomials of degree j in the coefficients $x^i$ of $X = x^i X_i$.

In order to extract maximum information from this secular equation, a generic vector $X \in \mathfrak{g}$ is chosen. Such a choice minimizes all degeneracies. With a generic choice of $X \in \mathfrak{g}$, it is useful to define the rank, l, of the Lie algebra $\mathfrak{g}$ as:


1. the number of functionally independent coefficients $\phi_j(X)$ in the secular equation;

2. the number of independent roots, $\alpha_1, \alpha_2, \ldots, \alpha_l$, of the secular equation;

3. the dimension of the subspace $H \subset \mathfrak{g}$ that commutes with X; and

4. the number of independent (Casimir) operators that commute with all $X_i$: $C_j(X) = \phi_j(x^i \to X_i)$, $[C_j(X), X_k] = 0$.


For example, for $\mathfrak{so}(3)$ or $\mathfrak{su}(2)$, the secular equation for $X = x^i X_i$ is

\[ \det\left[ \begin{pmatrix} 0 & x_3 & -x_2 \\ -x_3 & 0 & x_1 \\ x_2 & -x_1 & 0 \end{pmatrix} - \lambda I_3 \right] = (-\lambda)^3 + (-\lambda)\phi_2(x) = 0 \]

where $\phi_2(x) = x_1^2 + x_2^2 + x_3^2$. The rank is l = 1. There is one independent coefficient $\phi_2(x)$ and one independent root of this equation, $\alpha = \sqrt{-\delta_{ij} x^i x^j} = i\sqrt{x \cdot x}$. The only linear operators that commute with X are scalar multiples of X. There is one independent homogeneous operator that commutes with all generators $X_i$, obtained by the substitutions $x^i \to L_i$ (for $\mathfrak{so}(3)$) or $x^i \to S_i$ (for $\mathfrak{su}(2)$):

\[ C(L) = \phi_2(x^i \to L_i) = L_1^2 + L_2^2 + L_3^2 \]

The secular equation [1] is over the field of real 
numbers. This is not an algebraically closed field. 
There is no guarantee that the number of independent
functions $\phi_j(x)$ in the secular equation is equal
to the number of (real) roots of this equation until
we extend the field from R to C, which is 
algebraically closed. As a result, the classification 
of semisimple Lie algebras is done over complex 
numbers. After the complex extensions of the simple 
Lie algebras have been classified, their different 
inequivalent real forms can be determined. 


Root Spaces 


When the secular equation for the regular representation of a generic element in a Lie algebra is solved, the commutation relations can be put into a simple and elegant canonical form. This canonical form depends on the rank, l, of the Lie algebra, not the dimension, n, of the Lie algebra. This provides a very useful simplification, as $n \sim l^2$.

For this canonical form, the independent roots $\alpha_1(x), \alpha_2(x), \ldots, \alpha_l(x)$ are gathered into a single vector $\alpha$ with l components. The vectors $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_l)$ are called root vectors. The root vectors exist in an l-dimensional space on which a positive-definite inner product can be defined. The root vectors for a rank-l semisimple Lie algebra $\mathfrak{g}$ span this Euclidean space. The basis vectors of $\mathfrak{g}$ can be identified with the roots in the root space.

The roots in a root space have the following properties:


1. A positive-definite metric can be placed on the root space.

2. The vector 0 is a root.

3. The root 0 is l-fold degenerate.

4. If $\alpha$ is a root and $c\alpha$ is a root, $c = \pm 1, 0$.

5. If $\alpha$ and $\beta$ are roots,

\[ \beta' = \beta - \frac{2 \alpha \cdot \beta}{\alpha \cdot \alpha}\, \alpha \]

is also a root and $2 \alpha \cdot \beta / \alpha \cdot \alpha$ is an integer, $n_1$. In fact, $\beta'$ is the root obtained by reflecting $\beta$ in the hyperplane orthogonal to $\alpha$.

6. The set of reflections generated by nonzero roots itself forms a group, the Weyl group of the Lie algebra.

7. The angle between roots $\alpha$ and $\beta$ is determined by

\[ \cos^2 \theta(\alpha, \beta) = \frac{(\alpha \cdot \beta)(\beta \cdot \alpha)}{(\alpha \cdot \alpha)(\beta \cdot \beta)} = \frac{n_1}{2} \frac{n_2}{2} = 0, \frac{1}{4}, \frac{2}{4}, \frac{3}{4} \]

The integers $n_1$, $n_2$ for noncolinear roots are constrained by $|n_1 n_2| < 4$.

8. The relative lengths of the roots are determined by the angles between them:

$\cos^2 \theta(\alpha, \beta)$    $\theta(\alpha, \beta)$    $\alpha \cdot \alpha / \beta \cdot \beta$
3/4                               30°, 150°                  $3^{\pm 1}$
2/4                               45°, 135°                  $2^{\pm 1}$
1/4                               60°, 120°                  1

9. When the roots are normalized so that

\[ \sum_{\alpha \ne 0} \alpha_i \alpha_j = \delta_{ij} \quad \text{or} \quad \sum_{\alpha \ne 0} \alpha \cdot \alpha = l \]

the commutation relations can be placed in the canonical form presented in the next section.
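Properties 7 and 8 can be tabulated in a few lines. The sketch below enumerates the allowed products $n_1 n_2 \in \{0, 1, 2, 3\}$, the corresponding angles, and the length ratios $\alpha \cdot \alpha / \beta \cdot \beta = n_2 / n_1$; it assumes $n_1$ and $n_2$ share the sign of $\alpha \cdot \beta$, so only positive integers need to be scanned. The function names are illustrative.

```python
import math
from fractions import Fraction

def allowed_angles(n1n2):
    """Angles in degrees with cos^2(theta) = n1*n2 / 4."""
    c = math.sqrt(n1n2 / 4)
    return sorted({round(math.degrees(math.acos(sign * c))) for sign in (1, -1)})

def length_ratios(n1n2):
    """Possible values of (alpha.alpha)/(beta.beta) = n2/n1 for integers with n1*n2 fixed."""
    return sorted({Fraction(n2, n1)
                   for n1 in range(1, 4) for n2 in range(1, 4) if n1 * n2 == n1n2})

table = {p: (allowed_angles(p), length_ratios(p)) for p in range(4)}
for p, (angles, ratios) in table.items():
    print(f"cos^2 = {p}/4  angles = {angles}  ratios = {[str(r) for r in ratios]}")
```

For orthogonal roots ($n_1 n_2 = 0$) the list of ratios is empty, reflecting the fact that the angle then places no constraint on the relative lengths.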



It is possible to build up all possible root space diagrams using an "Aufbau" construction. We start with a rank-1 root space. This consists of three roots in $R^1$: $\alpha$, 0, $-\alpha$.

To construct rank-2 root spaces, a new noncolinear root $\beta$ is adjoined to the two nonzero roots. The new root and the old roots span $R^2$. The new root can only have a limited set of angles with the roots already present. The set of roots $\alpha$, $\beta$ is completed by reflection in hyperplanes orthogonal to all roots present. If any pair of roots violates the angle conditions, the result is not a root space. In this way, the rank-2 root spaces $G_2$ (30°), $B_2 = C_2$ (45°), $A_2$ (60°), and $D_2 = A_1 + A_1$ (90°) are constructed from $A_1$. Proceeding in this way, it is possible to construct rank-3 root spaces ($B_3$, $C_3$, $A_3 = D_3$) from the rank-2 root spaces, the rank-4 root spaces from the rank-3 root spaces, and so forth. Ultimately, there are four unending chains $A_n$, $B_n$, $C_n$, $D_n$ and five exceptional root spaces $G_2$, $F_4$, $E_6$, $E_7$, $E_8$. The rank-2 root spaces are shown in Figure 3 and the rank-3 root spaces are shown in Figure 4. The normalization factors (cf. point (9) above) are shown for the rank-2 root spaces in Figure 3.

Figure 3 Rank-2 root spaces: $G_2$ 30°, $B_2 = C_2$ 45°, $A_2$ 60°, $D_2 = A_1 + A_1$ 90°.

Figure 4 Rank-3 root spaces: $A_3$, $B_3$, $C_3$, $D_3 = A_3$.
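The completion-by-reflection step of the Aufbau construction is easy to simulate. The following sketch starts from two seed roots and keeps reflecting until no new vectors appear. The seed coordinates are assumptions chosen to give the stated angles (for example $e_1 - e_2$ and $-2e_1 + e_2 + e_3$ for $G_2$, which make an angle of 150°); `reflect` and `close_under_reflections` are illustrative names.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(beta, alpha):
    """Reflect beta in the hyperplane orthogonal to alpha."""
    factor = Fraction(2 * dot(alpha, beta), dot(alpha, alpha))
    return tuple(b - factor * a for a, b in zip(alpha, beta))

def close_under_reflections(seed_roots):
    """Smallest set containing the seeds that is closed under mutual reflections."""
    roots = {tuple(Fraction(x) for x in r) for r in seed_roots}
    pending = list(roots)
    while pending:
        beta = pending.pop()
        for alpha in list(roots):
            for new in (reflect(beta, alpha), reflect(alpha, beta)):
                if new not in roots:
                    roots.add(new)
                    pending.append(new)
    return roots

SEEDS = {
    "G2": [(1, -1, 0), (-2, 1, 1)],   # 150 degrees between the seeds
    "B2": [(1, -1), (0, 1)],          # 135 degrees
    "A2": [(1, -1, 0), (0, 1, -1)],   # 120 degrees
    "D2": [(1, 0), (0, 1)],           # 90 degrees
}
nonzero_root_counts = {name: len(close_under_reflections(seeds)) for name, seeds in SEEDS.items()}
print(nonzero_root_counts)
```

Adding the l-fold degenerate root 0 (here l = 2) to each count gives the dimension of the corresponding algebra, for example 12 + 2 = 14 for $G_2$.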


Canonical Commutation Relations 


The canonical commutation relations are expressed in terms of root vectors. The l operators in $\mathfrak{g}$ with the l-fold degenerate root vector 0 are $H_1, H_2, \ldots, H_l$. These l operators mutually commute. In a matrix Lie algebra, they can be taken as simultaneously commuting diagonal matrices. Associated with each nonzero root $\alpha \ne 0$, there is exactly one basis vector, $E_\alpha$, in $\mathfrak{g}$. The canonical commutation relations are expressed in terms of the roots as follows:

\[ [H_i, H_j] = 0, \qquad 1 \le i, j \le l \]
\[ [H_i, E_\alpha] = \alpha_i E_\alpha \]
\[ [E_\alpha, E_{-\alpha}] = \alpha \cdot H \]
\[ [E_\alpha, E_\beta] = \begin{cases} N_{\alpha\beta} E_{\alpha+\beta} & \alpha + \beta \ne 0 \text{ a root} \\ 0 & \alpha + \beta \text{ not a root} \end{cases} \]

The structure constants $N_{\alpha\beta}$ are determined from a recursion relation derived from a chain of roots $\beta - m\alpha, \beta - (m-1)\alpha, \ldots, \beta + (n-1)\alpha, \beta + n\alpha$, where $\beta - (m+1)\alpha$ and $\beta + (n+1)\alpha$ are not roots (cf. Figure 5). The structure constants are

\[ N_{\alpha\beta}^2 = \tfrac{1}{2}\, n (1 + m) (\alpha \cdot \alpha) \]


Figure 5 An $\alpha$ chain containing $\beta$.

The operators $H_i$ and $E_\alpha$ are often called diagonal and shift operators, respectively. They are generalizations of the operators $J_3$ and $J_\pm$ of angular momentum theory. The general idea is as follows. Since the operators $H_i$ mutually commute, the matrices $\Gamma(H_i)$ representing these operators can be chosen as diagonal in any matrix representation. The action of any of these operators on a basis vector in this representation is $H_i |m\rangle = m_i |m\rangle$. The operator $E_\alpha$ shifts the eigenvalue of H according to

\[ H (E_\alpha |m\rangle) = ([H, E_\alpha] + E_\alpha H) |m\rangle = (\alpha + m)(E_\alpha |m\rangle) \]

In this sense the operators $E_\alpha$ act on basis vectors $|m\rangle$ in such a way that the eigenvalue m is shifted by $\alpha$.

For the simple classical Lie algebras, the roots can 
be expressed in terms of an orthogonal Euclidean 
basis set as shown in Table 1 and Figures 3 and 4 
for the rank-2 and rank-3 root spaces. The roots for 
the five remaining inequivalent simple Lie algebras 
(“exceptional” algebras) are shown in Table 2. 

The diagonal and shift operators for several of the classical Lie algebras can be related to bilinear products of boson or fermion creation and annihilation operators. For $\mathfrak{u}(n)$, the bilinear products $a_i^\dagger a_j$ are related to $E_\alpha$ with $\alpha = e_i - e_j$, $1 \le i \ne j \le n$, and $H_i = a_i^\dagger a_i$. This holds for either boson or fermion operators. For $\mathfrak{sp}(2n; R)$, we have the identifications with bilinear products of boson operators as follows: $+e_i + e_j \leftrightarrow b_i^\dagger b_j^\dagger$, $+e_i - e_j \leftrightarrow b_i^\dagger b_j$, $-e_i - e_j \leftrightarrow b_i b_j$, and $H_i = b_i^\dagger b_i$. In particular, $+2e_i \leftrightarrow b_i^{\dagger 2}$ and $-2e_i \leftrightarrow b_i^2$. For $\mathfrak{so}(2n)$, we have the identifications with bilinear products of fermion operators as follows: $+e_i + e_j \leftrightarrow f_i^\dagger f_j^\dagger$, $+e_i - e_j \leftrightarrow f_i^\dagger f_j$, $-e_i - e_j \leftrightarrow f_i f_j$, and $H_i = f_i^\dagger f_i$. In particular, $f_i^{\dagger 2} = f_i^2 = 0$. These identifications make it a relatively simple matter to construct unitary matrix representations of the compact Lie groups SU(n) that are symmetric or antisymmetric, of USp(2n) that are symmetric, and of SO(2n) that are antisymmetric (bosons $\to$ symmetric, fermions $\to$ antisymmetric).


Dynkin Diagrams


Every root in a rank-l root space can be represented as a linear combination of l "basis roots." These basis roots can be chosen in such a way that all coefficients are integers. In fact, the basis roots can be chosen so that all linear combinations that are roots involve only positive integers (and zero) or only negative integers and zero. This comes about because every shift operator $E_\delta$ can be written as a multiple commutator

\[ E_\delta \simeq [E_\alpha, [E_\beta, E_\gamma]], \qquad \delta = \alpha + \beta + \gamma \]

One simple way to construct such a basis set of fundamental roots is to construct an (l - 1)-dimensional plane through the origin of the root space that contains no nonzero roots, and choose as l fundamental roots the l roots on one side of this hyperplane that are closest to it. For the classical simple Lie algebras, the fundamental roots are:

Root space    $\alpha_1$     $\alpha_2$     ...    $\alpha_{l-1}$      $\alpha_l$
$A_{l-1}$     $e_1 - e_2$    $e_2 - e_3$    ...    $e_{l-1} - e_l$
$D_l$         $e_1 - e_2$    $e_2 - e_3$    ...    $e_{l-1} - e_l$     $e_{l-1} + e_l$
$B_l$         $e_1 - e_2$    $e_2 - e_3$    ...    $e_{l-1} - e_l$     $+e_l$
$C_l$         $e_1 - e_2$    $e_2 - e_3$    ...    $e_{l-1} - e_l$     $+2e_l$
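
Table 1 Roots for the simple classical Lie groups and algebras

Group              Algebra                                   Root space    Rank     Roots                                Conditions
SU(l)              $\mathfrak{su}(l)$                        $A_{l-1}$     l - 1    $+e_i - e_j$                         $1 \le i \ne j \le l$
SO(2l)             $\mathfrak{so}(2l)$                       $D_l$         l        $\pm e_i \pm e_j$                    $1 \le i < j \le l$
SO(2l + 1)         $\mathfrak{so}(2l + 1)$                   $B_l$         l        $\pm e_i \pm e_j$, $\pm e_k$         $1 \le i < j \le l$, $1 \le k \le l$
Sp(l) = USp(2l)    $\mathfrak{sp}(l) = \mathfrak{usp}(2l)$   $C_l$         l        $\pm e_i \pm e_j$, $\pm 2e_k$        $1 \le i < j \le l$, $1 \le k \le l$

Table 2 Roots for the simple exceptional Lie algebras

Root space    Rank    Dimension    Roots                                                                                         Conditions
$G_2$         2       14           $+e_i - e_j$; $\pm[(e_i + e_j) - 2e_k]$                                                        $1 \le i \ne j \ne k \le 3$
$F_4$         4       52           $\pm e_i \pm e_j$, $\pm 2e_i$; $\pm e_1 \pm e_2 \pm e_3 \pm e_4$                                $1 \le i \ne j \le 4$
$E_6$         6       78           $\pm e_i \pm e_j$; $\frac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4 \pm e_5) \pm \frac{\sqrt{3}}{2} e_6$ (a)    $1 \le i < j \le 5$
$E_7$         7       133          $\pm e_i \pm e_j$, $\pm\sqrt{2}\, e_7$; $\frac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4 \pm e_5 \pm e_6) \pm \frac{1}{\sqrt{2}} e_7$ (b)    $1 \le i < j \le 6$
$E_8$         8       248          $\pm e_i \pm e_j$; $\frac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4 \pm e_5 \pm e_6 \pm e_7 \pm e_8)$ (a)    $1 \le i < j \le 8$

(a) Even number of + signs.
(b) Even number of + signs within bracket.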

All roots in the rank-2 root spaces have been 
expressed in terms of both two orthogonal vectors 
and two fundamental roots in Figure 3. 

If $\alpha_i$ and $\alpha_j$ are fundamental roots, their inner product is zero or negative

\[ \cos(\alpha_i, \alpha_j) = 0, \; -\sqrt{1/4}, \; -\sqrt{2/4}, \; -\sqrt{3/4} \]


This information has been used to classify the root spaces of the inequivalent simple Lie algebras (over C). The procedure is as follows. Each of the l fundamental roots in a rank-l root space is represented by a dot in a plane. Dots representing roots $\alpha_i$ and $\alpha_j$ are connected by $n_{ij}$ lines, where $\cos(\alpha_i, \alpha_j) = -\sqrt{n_{ij}/4}$. Orthogonal roots are not connected by any lines. Such diagrams are called Dynkin diagrams. Disconnected Dynkin diagrams describe semisimple Lie algebras. Connected Dynkin diagrams classify simple Lie algebras.
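The number of lines joining two dots follows directly from the fundamental roots through $n_{ij} = 4 (\alpha_i \cdot \alpha_j)^2 / ((\alpha_i \cdot \alpha_i)(\alpha_j \cdot \alpha_j))$. The sketch below assumes the standard fundamental roots of the classical series in an orthogonal basis ($e_i - e_{i+1}$, closed by $e_l$, $2e_l$, or $e_{l-1} + e_l$); the function names are illustrative.

```python
from fractions import Fraction

def fundamental_roots(series, l):
    """Fundamental roots of A_(l-1), B_l, C_l or D_l in the orthogonal basis e_1..e_l."""
    e = [[1 if k == i else 0 for k in range(l)] for i in range(l)]
    chain = [[a - b for a, b in zip(e[i], e[i + 1])] for i in range(l - 1)]
    if series == "A":
        return chain
    if series == "B":
        return chain + [e[l - 1]]
    if series == "C":
        return chain + [[2 * x for x in e[l - 1]]]
    if series == "D":
        return chain + [[a + b for a, b in zip(e[l - 2], e[l - 1])]]
    raise ValueError(f"unknown series {series!r}")

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dynkin_lines(roots):
    """Map (i, j) -> n_ij (1-based indices, i < j) for every connected pair of dots."""
    lines = {}
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            d = dot(roots[i], roots[j])
            n = Fraction(4 * d * d, dot(roots[i], roots[i]) * dot(roots[j], roots[j]))
            if n:
                lines[(i + 1, j + 1)] = int(n)
    return lines

print(dynkin_lines(fundamental_roots("B", 3)))
print(dynkin_lines(fundamental_roots("D", 4)))
```

In the $D_4$ output the second dot is joined to three others, which is the branching that distinguishes the D series from the A series.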

The properties of Dynkin diagrams arise from two 
simple observations: 


O1: The root space is positive definite.
O2: If u is a unit vector and $v_i$ are an orthonormal set of vectors,

\[ \sum_i (u \cdot v_i)^2 \le 1 \]

These two observations lead to three important 
properties of Dynkin diagrams. 


D1: There are no loops. If $\alpha_i$ (i = 1, 2, ..., k) are in a loop, then there are at least as many lines as vertices. With $u_i = \alpha_i / |\alpha_i|$,

\[ \left( \sum_{i=1}^{k} u_i \right) \cdot \left( \sum_{j=1}^{k} u_j \right) = k + 2 \sum_{i<j}^{k} u_i \cdot u_j > 0 \]

Since $2 u_i \cdot u_j \le -1$ if $u_i \cdot u_j \ne 0$, there cannot be as many lines as vertices.

D2: The number of lines connected to any node is less than 4. If $\alpha_i$ are connected to v, then with $u_i = \alpha_i / |\alpha_i|$,

\[ \sum_i (v \cdot u_i)^2 = \sum_i n_i / 4 < 1 \]

since v is linearly independent of the $\alpha_i$.

D3: A simple chain connecting any two nodes can be shrunk. If the original diagram is allowed, the shrunk diagram is also allowed, and conversely. Since the shrunk diagram in Figure 6 violates D2, the original is not an allowed Dynkin diagram.


According to these results, the maximum number of lines that can be attached to a vertex is three. If a vertex is attached to three lines, it can be connected to three (one line each) other vertices, two (two plus one) other vertices, or only one other vertex (all three lines). This last case describes Dynkin diagram $G_2$ (cf. Figures 3 and 8).

Figure 6 A chain with single links can be removed from a diagram. If the original is an allowed Dynkin diagram, the shrunk diagram is also allowed, and conversely.

The only remaining possibilities are shown in Figure 7.
For diagrams of type (B, C, F) we define vectors

\[ u = \sum_{i=1}^{p} i\, u_i, \qquad v = \sum_{j=1}^{q} j\, v_j \]

where as usual $u_i$, $v_j$ are unit vectors $\alpha_i / |\alpha_i|$. The Schwarz inequality applied to u and v leads to the inequality

\[ \left( 1 + \frac{1}{p} \right) \left( 1 + \frac{1}{q} \right) > 2 \]

The solutions with $p \ge q$ are

p            q    Root space      Constraint
arbitrary    1    $B_l$, $C_l$    l = p + 1
2            2    $F_4$

For diagrams of type (D, E), we define vectors

\[ u = \sum_{i=1}^{p-1} i\, u_i, \qquad v = \sum_{j=1}^{q-1} j\, v_j, \qquad w = \sum_{k=1}^{r-1} k\, w_k \]

where as usual $u_i$, $v_j$, $w_k$ are unit vectors $\alpha / |\alpha|$. With similar arguments, we obtain the inequality

\[ \frac{1}{p} + \frac{1}{q} + \frac{1}{r} > 1 \]

Figure 7 The only remaining candidate Dynkin diagrams have either two vertices (B, C, F) or one vertex (D, E) connected to three lines.

The solutions with $p \ge q \ge r$ are

p            q    r    Root space     Regular Euclidean solid
arbitrary    2    2    $D_{p+2}$
3            3    2    $E_6$          Tetrahedron
4            3    2    $E_7$          Cube-octahedron
5            3    2    $E_8$          Icosahedron-dodecahedron
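Both inequalities have so few integer solutions that they can be listed by brute force. The sketch below scans all branch lengths up to a cutoff `max_p` (an arbitrary bound chosen for illustration, since the infinite families continue indefinitely) and uses exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction

def bcf_solutions(max_p):
    """Pairs (p, q) with p >= q >= 1 and (1 + 1/p)(1 + 1/q) > 2."""
    return [(p, q)
            for q in range(1, max_p + 1)
            for p in range(q, max_p + 1)
            if (1 + Fraction(1, p)) * (1 + Fraction(1, q)) > 2]

def de_solutions(max_p):
    """Triples (p, q, r) with p >= q >= r >= 2 and 1/p + 1/q + 1/r > 1."""
    return [(p, q, r)
            for r in range(2, max_p + 1)
            for q in range(r, max_p + 1)
            for p in range(q, max_p + 1)
            if Fraction(1, p) + Fraction(1, q) + Fraction(1, r) > 1]

print(bcf_solutions(6))
print(de_solutions(7))
```

Apart from the families (p, 1) and (p, 2, 2), which extend to arbitrary p, the scan finds only (2, 2) and the three triples with q = 3, matching the two tables of solutions.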


All allowed Dynkin diagrams are shown in Figure 8. In these diagrams roots making an angle of 120° with each other (joined by single lines) have equal length. Roots joined by double lines or triple lines have different lengths. The arrows on double lines indicate the shorter and longer roots. Arrows point to longer roots. The root spaces $G_2$ and $F_4$ are self-dual, so it does not matter which way the arrow points.

Figure 8 Four infinite series ($A_l$, $D_l$, $B_l$, $C_l$) of Dynkin diagrams exist and correspond to the classical simple Lie groups (SU(l + 1), SO(2l), SO(2l + 1), USp(2l)). The five exceptional Dynkin diagrams include a short finite series ($E_l$, l = 6, 7, 8), $F_4$, and $G_2$.

Coxeter-Dynkin diagrams also appear in classical 
geometry and catastrophe theory. 


Real Forms 


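The metric tensor $g_{\alpha\beta}$ for a simple Lie algebra (over C) in the canonical basis $H_i$, $E_\alpha$ is

\[ (H_i, H_j) = \delta_{ij}, \qquad (E_\alpha, E_{-\alpha}) = 1, \qquad \text{all other inner products zero} \qquad [2] \]

In this basis, the Lie algebra decomposes into positive- and negative-definite subspaces according to

\[ \mathfrak{g} = \mathfrak{g}_+ + \mathfrak{g}_- \]

with $\mathfrak{g}_+$ spanned by $H_i$, $(E_\alpha + E_{-\alpha})/\sqrt{2}$ and $\mathfrak{g}_-$ spanned by $(E_\alpha - E_{-\alpha})/\sqrt{2}$.

The choice of basis suggested above diagonalizes the Cartan-Killing form in eqn [2]: $g \to I_{p,q}$, with $p = l + (1/2)(n - l)$ positive values +1 on the diagonal and $q = (1/2)(n - l)$ values -1 on the diagonal. The trace of this matrix is the trace of g: +l.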

An arbitrary element in this (complex) Lie algebra is a linear superposition of the form

\[ X = \sum_i h^i H_i + \sum_{\alpha \ne 0} e^\alpha E_\alpha \qquad [3] \]

where all n coefficients $h^i$, $e^\alpha$ are complex. If all these coefficients are taken real, the resulting Lie algebra closes under commutation and describes a noncompact Lie group. The subalgebra describing the maximal compact subgroup is spanned by the linear combinations $(E_\alpha - E_{-\alpha})/\sqrt{2}$. The remaining operators exponentiate to a noncompact coset

\[ \mathrm{EXP}\left[ h^i H_i + e^\alpha (E_\alpha + E_{-\alpha})/\sqrt{2} \right] \]

which is topologically equivalent to $R^K$, $K = l + (1/2)(n - l) = (1/2)(n + l)$. Of all the real forms of the complex Lie algebra described by this set of canonical commutation relations (or root space, or Dynkin diagram), this is the least compact real form.

The compact real form is obtained from [3] by taking linear combinations

\[ X = \sum_i i h^i H_i + \sum_{\alpha > 0} i e_+^\alpha (E_\alpha + E_{-\alpha})/\sqrt{2} + \sum_{\alpha > 0} e_-^\alpha (E_\alpha - E_{-\alpha})/\sqrt{2} \]

where $h^i$, $e_+^\alpha$, $e_-^\alpha$ are real. The compact real forms of the simple Lie algebras are:

Root space    Group
$A_{l-1}$     SU(l)
$D_l$         SO(2l)
$B_l$         SO(2l + 1)
$C_l$         USp(2l) = Sp(l)

If the imaginary factor i is absorbed into the Cartan-Killing metric, this metric is diagonal, all matrix elements are -1, the trace of this form is -n, and the linear combinations for X are real.

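Every complex simple Lie algebra (i.e., simple Lie algebra over C) has a spectrum of inequivalent real forms. These can all be obtained from the compact real form by an analog of Minkowski's "rotation trick," derived by Cartan. Cartan introduced a metric-preserving linear mapping ("involutive automorphism") $T: \mathfrak{g} \to \mathfrak{g}$ with the property $T^2 = I$ and $(TX, TY) = (X, Y)$, with $X, Y \in \mathfrak{g}$. The operator T has eigenvalues $\pm 1$ and induces a decomposition ("Cartan decomposition") in $\mathfrak{g}$ as follows:

\[ \mathfrak{g} = \mathfrak{k} + \mathfrak{p}, \qquad T(\mathfrak{g}) = T(\mathfrak{k}) + T(\mathfrak{p}) = \mathfrak{k} - \mathfrak{p} \]

As a result, the subspaces $\mathfrak{k}$ and $\mathfrak{p}$ are orthogonal. The subspaces obey the following commutation and inner-product properties:

\[ [\mathfrak{k}, \mathfrak{k}] \subseteq \mathfrak{k}, \qquad (\mathfrak{k}, \mathfrak{k}) < 0 \]
\[ [\mathfrak{k}, \mathfrak{p}] \subseteq \mathfrak{p}, \qquad (\mathfrak{k}, \mathfrak{p}) = 0 \]
\[ [\mathfrak{p}, \mathfrak{p}] \subseteq \mathfrak{k}, \qquad (\mathfrak{p}, \mathfrak{p}) < 0 \]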
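Under the analytic continuation $\mathfrak{p} \to i\mathfrak{p}$, the compact Lie algebra $\mathfrak{g}$ is rotated to a noncompact Lie algebra $\mathfrak{g}'$ whose commutation relations and inner-product properties are

\[ \mathfrak{g} = \mathfrak{k} + \mathfrak{p} \to \mathfrak{g}' = \mathfrak{k} + i\mathfrak{p} \]
\[ [\mathfrak{k}, \mathfrak{k}] \subseteq \mathfrak{k}, \qquad (\mathfrak{k}, \mathfrak{k}) < 0 \]
\[ [\mathfrak{k}, i\mathfrak{p}] \subseteq i\mathfrak{p}, \qquad (\mathfrak{k}, i\mathfrak{p}) = 0 \]
\[ [i\mathfrak{p}, i\mathfrak{p}] \subseteq \mathfrak{k}, \qquad (i\mathfrak{p}, i\mathfrak{p}) > 0 \]

The maximal compact subalgebra of $\mathfrak{g}'$ is $\mathfrak{k}$. The subspace $i\mathfrak{p}$ exponentiates to a simply connected submanifold on which the Cartan-Killing metric is positive definite. This manifold is topologically equivalent to $R^K$, $K = \dim \mathfrak{p}$. It is not geometrically equivalent to $R^K$ once an invariant metric is placed on it.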

Three linear mappings that satisfy T? =I suffice 
to generate all real forms of all the simple classical 
Lie algebras. 


Block Matrix Decomposition 


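The compact Lie algebra $\mathfrak{u}(n; F)$ has a block submatrix decomposition (n = p + q):

\[ \mathfrak{u}(n; F) = \begin{pmatrix} A_p & 0 \\ 0 & A_q \end{pmatrix} + \begin{pmatrix} 0 & +B \\ -B^\dagger & 0 \end{pmatrix} \]

where $A_p^\dagger = -A_p$, $A_q^\dagger = -A_q$, and B is an arbitrary p x q matrix over F. Under the map

\[ T(\mathfrak{g}) = I_{p,q}\, \mathfrak{g}\, I_{p,q}, \qquad I_{p,q} = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix} \]

the diagonal subspace $\begin{pmatrix} A_p & 0 \\ 0 & A_q \end{pmatrix}$ has eigenvalue +1 and the off-diagonal subspace $\begin{pmatrix} 0 & +B \\ -B^\dagger & 0 \end{pmatrix}$ has eigenvalue -1. Under the Cartan rotation

\[ \mathfrak{u}(n; F) \to \mathfrak{u}(p, q; F) = \begin{pmatrix} A_p & 0 \\ 0 & A_q \end{pmatrix} + i \begin{pmatrix} 0 & +B \\ -B^\dagger & 0 \end{pmatrix} \]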
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The real forms of the classical Lie groups obtained 
in this way are 


Dy, B; 
SO(2n) 


SO(2n + 1) 


Sp(7) — Sp(p, q) 
USp(2n) — USp(2p, 24) 
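
The block-matrix construction can be checked numerically. The following is a minimal sketch, assuming the real case $F = R$ (so the compact algebra is so(p+q)) and using NumPy; the helper names `so_basis`, `cartan_split`, and `check_relations` are illustrative and not taken from the article. It sorts a basis of antisymmetric matrices into the +1 and -1 eigenspaces of T(X) = I_{p,q} X I_{p,q} and confirms the three commutation relations of the Cartan decomposition.

```python
import numpy as np

def so_basis(n):
    """Basis E_ij - E_ji (i < j) of the real antisymmetric n x n matrices."""
    basis = []
    for i in range(n):
        for j in range(i + 1, n):
            X = np.zeros((n, n))
            X[i, j], X[j, i] = 1.0, -1.0
            basis.append(X)
    return basis

def comm(X, Y):
    """Ordinary matrix commutator [X, Y]."""
    return X @ Y - Y @ X

def cartan_split(p, q):
    """Sort the basis of so(p+q) into the +1 (k) and -1 (p) eigenspaces of T."""
    I_pq = np.diag([1.0] * p + [-1.0] * q)
    T = lambda X: I_pq @ X @ I_pq
    k_part = [X for X in so_basis(p + q) if np.allclose(T(X), X)]
    p_part = [X for X in so_basis(p + q) if np.allclose(T(X), -X)]
    return T, k_part, p_part

def check_relations(p, q):
    """Check [k,k] in k, [k,p] in p, [p,p] in k through the eigenvalue of T."""
    T, k_part, p_part = cartan_split(p, q)
    kk = all(np.allclose(T(comm(X, Y)), comm(X, Y)) for X in k_part for Y in k_part)
    kp = all(np.allclose(T(comm(X, Y)), -comm(X, Y)) for X in k_part for Y in p_part)
    pp = all(np.allclose(T(comm(X, Y)), comm(X, Y)) for X in p_part for Y in p_part)
    return kk and kp and pp, len(k_part), len(p_part)
```

For example, `check_relations(3, 2)` splits the ten-dimensional algebra so(5) into a four-dimensional diagonal part, so(3) + so(2), and a six-dimensional off-diagonal part of dimension pq.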


Subfield Restriction 


The Lie algebra $\mathfrak{su}(n)$ of complex traceless
anti-Hermitian matrices has a subalgebra $\mathfrak{so}(n)$ of real
antisymmetric matrices. The algebra $\mathfrak{su}(n)$ can be
expressed in terms of real $n \times n$ antisymmetric
matrices $A_n$ and traceless symmetric matrices $S_n$:

$$\mathfrak{su}(n) = \mathfrak{so}(n) + [\mathfrak{su}(n) - \mathfrak{so}(n)] = A_n + iS_n$$

The Cartan rotation is

$$\mathfrak{su}(n) \to \mathfrak{sl}(n; R) = \mathfrak{so}(n) + i[\mathfrak{su}(n) - \mathfrak{so}(n)] = A_n + S_n$$

The classical Lie group generated by this transformation
is SL(n; R).

A similar rotation can be carried out on unitary
matrices over the quaternion field, $\mathfrak{u}(n; Q) = \mathfrak{sp}(n)$.
This algebra contains the subalgebra $\mathfrak{u}(n)$ in which
quaternions $q = q_0 + \mathcal{I}q_1 + \mathcal{J}q_2 + \mathcal{K}q_3$ are restricted
to complex numbers $q = q_0 + iq_1$. There is a natural
decomposition

$$\mathfrak{sp}(n) = \mathfrak{u}(n) + [\mathfrak{sp}(n) - \mathfrak{u}(n)]$$

It is useful at this point to replace each quaternion
matrix element by a $2 \times 2$ complex matrix:
$\mathfrak{sp}(n) \to \mathfrak{usp}(2n)$. This is a unitary representation of the
symplectic algebra. Replacing the complex matrix
elements in $\mathfrak{u}(n)$ by $2 \times 2$ real matrices simultaneously
generates a real matrix representation of $\mathfrak{u}(n)$ named
$\mathfrak{ou}(2n)$. This is an orthogonal representation of the
unitary algebra. The decomposition above is

$$\mathfrak{sp}(n) = \mathfrak{u}(n) + [\mathfrak{sp}(n) - \mathfrak{u}(n)]
\to \mathfrak{ou}(2n) + [\mathfrak{usp}(2n) - \mathfrak{ou}(2n)] = A_{2n} + iS_{2n}$$

where as before $A_{2n}$ and $S_{2n}$ are $2n \times 2n$ antisymmetric
and symmetric matrices. The Cartan rotation
maps this to $\mathfrak{sp}(2n; R)$,

$$\mathfrak{usp}(2n) \to \mathfrak{sp}(2n; R) = A_{2n} + S_{2n}$$

The classical Lie group generated in this way is
Sp(2n; R). Matrices in this group satisfy the quadratic
constraint $M^t G M = G$, $G^t = -G$, $\det(G) \neq 0$. The real
symplectic groups leave invariant Hamilton's equations
of motion: $dp_i/dt = -\partial H/\partial q_i$, $dq_i/dt = +\partial H/\partial p_i$.
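
The quadratic constraint on Sp(2n; R) is easy to test numerically. The sketch below is only an illustration under stated assumptions: it takes the standard antisymmetric form G = [[0, I], [-I, 0]] (one admissible choice of G, not the only one) and builds group elements from two elementary families, shears and block scalings; the helper names are hypothetical.

```python
import numpy as np

def standard_G(n):
    """Assumed choice of antisymmetric, nonsingular G: [[0, I_n], [-I_n, 0]]."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def shear(S):
    """[[I, S], [0, I]]; symplectic when S is symmetric."""
    n = S.shape[0]
    return np.block([[np.eye(n), S], [np.zeros((n, n)), np.eye(n)]])

def scaling(A):
    """[[A, 0], [0, (A^-1)^t]]; symplectic for any invertible A."""
    n = A.shape[0]
    Z = np.zeros((n, n))
    return np.block([[A, Z], [Z, np.linalg.inv(A).T]])

def is_symplectic(M, G):
    """Test the quadratic constraint M^t G M = G."""
    return np.allclose(M.T @ G @ M, G)

S = np.array([[1.0, 0.5], [0.5, -2.0]])   # symmetric
A = np.array([[2.0, 1.0], [0.0, 0.5]])    # invertible
M = shear(S) @ scaling(A) @ shear(S).T    # product of symplectic factors
```

Because the constraint is preserved under matrix products, `is_symplectic(M, standard_G(2))` holds for the product above, while a simple rescaling such as `2 * np.eye(4)` fails the test.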


Field Embeddings 


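The image of $\mathfrak{u}(n) \to \mathfrak{ou}(2n)$ consists of a set of
$2n \times 2n$ antisymmetric matrices of dimension $n^2$.
These matrices form a subset of $\mathfrak{so}(2n)$, which
consists of $2n \times 2n$ antisymmetric matrices of
dimension $2n(2n - 1)/2$. As a result, $\mathfrak{ou}(2n)$ is a
subalgebra in $\mathfrak{so}(2n)$. Thus, $\mathfrak{ou}(2n) \sim \mathfrak{k}$ and
$\mathfrak{so}(2n) \sim \mathfrak{g}$ and we have a Cartan decomposition

$$\begin{array}{ccc}
\mathfrak{so}(2n) = \mathfrak{ou}(2n) &+& [\mathfrak{so}(2n) - \mathfrak{ou}(2n)] \\
\downarrow & & \downarrow \\
\mathfrak{ou}(2n) &+& i[\mathfrak{so}(2n) - \mathfrak{ou}(2n)] = \mathfrak{so}^*(2n)
\end{array}$$

In the same way, the image of $\mathfrak{sp}(n) \to \mathfrak{usp}(2n)$
consists of an $n(2n + 1)$-dimensional set of $2n \times 2n$
anti-Hermitian matrices. This is a subset of $\mathfrak{su}(2n)$,
which has dimension $(2n)^2 - 1$. It is also a subalgebra
of $\mathfrak{su}(2n)$. Thus, $\mathfrak{usp}(2n) \sim \mathfrak{k}$ and $\mathfrak{su}(2n) \sim \mathfrak{g}$,
so we have a Cartan decomposition

$$\begin{array}{ccc}
\mathfrak{su}(2n) = \mathfrak{usp}(2n) &+& [\mathfrak{su}(2n) - \mathfrak{usp}(2n)] \\
\downarrow & & \downarrow \\
\mathfrak{usp}(2n) &+& i[\mathfrak{su}(2n) - \mathfrak{usp}(2n)] = \mathfrak{su}^*(2n)
\end{array}$$

These real forms are summarized in Table 3.


Table 3 Real forms of the simple classical Lie algebras

Mapping               Real form                Maximal compact subalgebra   Root space   Condition
Block submatrix       so(p, q)                 so(p) + so(q)                D_n          p + q = 2n
                      so(p, q)                 so(p) + so(q)                B_n          p + q = 2n + 1
                      su(p, q)                 u(1) + su(p) + su(q)         A_{n-1}      p + q = n
                      sp(p, q) = usp(2p, 2q)   usp(2p) + usp(2q)            C_n          p + q = n
Subfield restriction  sl(n; R)                 so(n)                        A_{n-1}
                      sp(2n; R)                u(n)                         C_n
Field embedding       so*(2n)                  u(n)                         D_n
                      su*(2n)                  sp(n) = usp(2n)              A_{2n-1}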


Table 4 Equivalence among real forms of the simple classical
Lie algebras

A_1 = B_1 = C_1                               χ
su(2) = so(3) = sp(1) = usp(2)                -3
su(1, 1) = sl(2; R) = so(2, 1) = sp(2; R)     +1

D_2 = A_1 + A_1                               χ
so(4) = so(3) + so(3)                         -6
so*(4) = so(3) + so(2, 1)                     -2
so(3, 1) = sl(2; C)                            0
so(2, 2) = so(2, 1) + so(2, 1)                +2

B_2 = C_2                                     χ
so(5) = sp(2) = usp(4)                        -10
so(4, 1) = sp(1, 1) = usp(2, 2)               -2
so(3, 2) = sp(4; R)                           +2

D_3 = A_3                                     χ
so(6) = su(4)                                 -15
so(5, 1) = su*(4)                             -5
so*(6) = su(3, 1)                             -3
so(4, 2) = su(2, 2)                           +1
so(3, 3) = sl(4; R)                           +3

The root spaces $A_1$[SU(2)], $B_1$[SO(3)], and
$C_1$[U(1; Q) ~ USp(2; C)] are equivalent. As a result,
the different real forms of their complex extensions
are related to each other. Similar remarks hold for
the real forms of $B_2 = C_2$, $D_2 = A_1 + A_1$, and
$D_3 = A_3$. The relations among these real forms are
summarized in Table 4. This table is useful in
inferring "spinor representations" among classical
groups. Thus, SO(3) has spinor representations
based on SU(2) and Sp(1); SO(4) has spinor
representations based on SU(2) × SU(2); SO(5) has
spinor representations based on USp(4); and SO(6)
has spinor representations based on SU(4).

For completeness, the real forms for the excep- 
tional Lie algebras are collected in Table 5. 

Real forms of the complex extension of a simple
Lie algebra are almost uniquely distinguished by an
index. This is the trace of the Cartan-Killing form
[2], once the appropriate factors of $i$ have been
absorbed into it. If $n_c$ is the dimension of the
maximal compact subgroup, $\chi = \mathrm{tr}(g) = +1(n - n_c)
- 1(n_c) = n - 2n_c$. The index ranges from $-n$ for the
compact real form (for which $n_c = n$) to $+l$ (the rank
of the algebra) for the least compact real form.
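
As a check on the index formula, the following minimal sketch computes χ for so(p, q). Two assumptions are made for illustration: the trace form tr(XY) stands in for the Cartan-Killing form (for so(n) with n ≥ 3 the two differ by a positive factor, so the signs agree), and the helper name `character_so` is hypothetical. Each compact generator contributes -1 and each rotated generator contributes +1.

```python
import numpy as np

def character_so(p, q):
    """Character chi = n - 2 n_c of so(p, q), counted from signs of tr(X X)."""
    n = p + q
    chi = 0
    for i in range(n):
        for j in range(i + 1, n):
            X = np.zeros((n, n), dtype=complex)
            X[i, j], X[j, i] = 1.0, -1.0
            # Generators in the off-diagonal block are the ones rotated by i.
            if (i < p) != (j < p):
                X = 1j * X
            chi += int(np.sign(np.trace(X @ X).real))
    return chi
```

With this convention `character_so(3, 2)` gives +2 and `character_so(4, 1)` gives -2, in agreement with the entries for $B_2 = C_2$ in Table 4.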


Riemannian Symmetric Spaces 


Exponentiation lifts Lie algebras to Lie groups and
subspaces in Lie algebras into submanifolds in Lie
groups. In particular, exponentiation of a Cartan
decomposition

$$\begin{array}{ccccc}
\mathfrak{g} &=& \mathfrak{k} &+& \mathfrak{p} \\
\downarrow & & \downarrow & & \downarrow \\
G &=& K &\times& (P = G/K)
\end{array}$$

lifts the subspace $\mathfrak{p}$ to the quotient $P = G/K$.


Table 5 Real forms of the exceptional Lie algebras

                                       Maximal compact subgroup
Root space   Class_rank(character)     Root space      Dimension
G_2          G_2(-14)                  G_2             14
             G_2(+2)                   A_1 + A_1       6
F_4          F_4(-52)                  F_4             52
             F_4(-20)                  B_4             36
             F_4(+4)                   C_3 + A_1       24
E_6          E_6(-78)                  E_6             78
             E_6(-26)                  F_4             52
             E_6(-14)                  D_5 + D_1       46
             E_6(+2)                   A_5 + A_1       38
             E_6(+6)                   C_4             36
E_7          E_7(-133)                 E_7             133
             E_7(-25)                  E_6 + D_1       79
             E_7(-5)                   D_6 + A_1       69
             E_7(+7)                   A_7             63
E_8          E_8(-248)                 E_8             248
             E_8(-24)                  E_7 + A_1       136
             E_8(+8)                   D_8             120

A metric may be defined on the Lie group $G$ as
follows. Define the distance between the identity
and some nearby point $g(\epsilon) = \mathrm{EXP}(\epsilon X) =
\mathrm{EXP}(\delta x^i X_i)$ by

$$ds^2(0) = G_{rs}(0)\, \delta x^r\, \delta x^s$$

Move $I$ and $g(\epsilon)$ to the neighborhood of any point
$g(x) \in G$ by left multiplication: $g(x)I = g(x)$,
$g(x)g(\delta x^i X_i) = g((x + dx)^i X_i)$. The infinitesimals
$dx^i(x)$ at $x$ (defined by $g(x)$) and $\delta x^i = dx^i(0)$ at $I$
are linearly related,

$$\delta x^r = M^r{}_i(x)\, dx^i(x)$$

Requiring that the distance $ds$ between $I$ and
$g(\delta x^i X_i)$ at the identity be the same as the
distance between $g(x^i X_i)$ and $g(x^i X_i)g(\delta x^i X_i) =
g((x + dx)^i X_i)$ at $g(x^i X_i)$ leads to the condition

$$\begin{aligned}
ds^2(x) &= G_{rs}(0)\, \delta x^r\, \delta x^s \\
&= G_{rs}(0)\, M^r{}_i(x) M^s{}_j(x)\, dx^i(x)\, dx^j(x) \\
&= G_{ij}(x)\, dx^i(x)\, dx^j(x)
\end{aligned}$$


An invariant metric $G(x)$ over the Lie group $G$ is
defined by

$$G_{ij}(x) = G_{rs}(0)\, M^r{}_i(x) M^s{}_j(x), \qquad
G(x) = M^t(x)\, G(0)\, M(x)$$

It is useful to identify $G(0)$ with the Cartan-Killing
inner product on $\mathfrak{g}$. Since $M(x)$ is nonsingular, the
signature of $G(x)$ is invariant over the group.
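
The invariance of the signature is Sylvester's law of inertia applied to $G(x) = M^t G(0) M$. A minimal numerical sketch follows; the particular matrices `G0` and `M` are arbitrary illustrative choices (any nonsingular `M` would do), and the function names are not from the article.

```python
import numpy as np

def signature(G, tol=1e-10):
    """Return (number of positive, number of negative) eigenvalues of symmetric G."""
    eig = np.linalg.eigvalsh(G)
    return int(np.sum(eig > tol)), int(np.sum(eig < -tol))

def transported_metric(G0, M):
    """Metric at g(x) obtained from the metric G0 at the identity: M^t G0 M."""
    return M.T @ G0 @ M

# Illustrative data: an indefinite form at the identity and a nonsingular M
# (unit upper triangular, so det M = 1).
G0 = np.diag([-1.0, -1.0, -1.0, 1.0, 1.0])
M = np.eye(5) + np.triu(np.arange(25).reshape(5, 5) / 10.0, k=1)
Gx = transported_metric(G0, M)
```

Here `signature(Gx)` equals `signature(G0)`, namely two positive and three negative directions, even though the individual matrix elements of `Gx` differ from those of `G0`.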

The invariant metric on $G$ can be restricted to
subspaces $K \subset G$ and $P = G/K \subset G$. The signature
on these subspaces is the same as the signature on
the subspaces $\mathfrak{k}$ and $\mathfrak{p}$ in $\mathfrak{g}$. Thus, if $G$ is compact,
the invariant metric is negative definite on $K$ and on
$P = G/K$ and positive definite on the analytically
continued space $P' = G'/K$. In short, it is definite
(negative, positive) on $P$, $P'$. These spaces are
Riemannian spaces and they are globally symmetric.
They have been investigated by studying the properties
of the secular equation of the Lie algebra $\mathfrak{g}$,
restricted to the subspace $\mathfrak{p}$:

$$\det\left[ R(p^i P_i) - \lambda I \right] = \sum_j (-\lambda)^{n-j} \phi_j(p) = 0 \qquad [4]$$

where the $P_i$ are basis vectors that span $\mathfrak{p}$. The
coefficients $\phi_j(p)$ in the secular equation [4] for
Riemannian symmetric spaces are related to the
coefficients $\phi_j(x)$ in the secular equation [1] for Lie
algebras. A rank for the Riemannian symmetric
space $P = \mathrm{EXP}(\mathfrak{p})$ can be defined from the secular
equation following exactly the prescription followed
for the Lie algebra $\mathfrak{g}$. The rank of the Riemannian
symmetric space $P = \mathrm{EXP}(\mathfrak{p})$ is


1. the number of functionally independent coefficients
   $\phi_j(p)$ in the secular equation;

2. the number of independent roots of the secular
   equation;

3. the dimension of the maximal Euclidean subspace
   in $P$; and

4. the number of independent (Laplace-Beltrami)
   operators that commute with all displacement
   operators $P_j$: $\Delta_j(P) = \phi_j(p^i \to P_i)$.


Rank-1 Riemannian symmetric spaces are isotropic 
as well as homogeneous. 


The exceptional Riemannian symmetric spaces can be
constructed from the information in Table 5 following
the procedure used to construct Table 6 from Table 3.

As particular examples of Riemannian symmetric
spaces we consider the compact spaces SO(p + q)/
[SO(p) × SO(q)] and their noncompact counterparts
SO(p, q)/[SO(p) × SO(q)]. These spaces have rank
min(p, q), dimension pq, and can be represented
explicitly in matrix form as

$$\mathrm{EXP}\begin{pmatrix} 0 & X \\ \sigma X^t & 0 \end{pmatrix}
= \begin{pmatrix} D_p & Y \\ \sigma Y^t & D_q \end{pmatrix}$$

Here $X$ is a $p \times q$ matrix and $\sigma = +1$ for the
noncompact case and $-1$ for the compact case. The
block diagonal matrices $D_p$ and $D_q$ are defined from
the metric-preserving conditions ($M^t I_{p,q} M = I_{p,q}$,
$M^t I_{p+q} M = I_{p+q}$)

$$D_p^2 = I_p + \sigma Y Y^t, \qquad D_q^2 = I_q + \sigma Y^t Y$$

The pq coordinates in the Riemannian symmetric
spaces can be taken as the pq elements of the
submatrix $Y$.

These Riemannian symmetric spaces can be
treated as algebraic submanifolds in $R^K$, $K = pq +
(1/2)q(q + 1)$. The $K$ coordinates on $R^K$ can be
identified with the pq matrix elements of $Y$ and the
$(1/2)q(q + 1)$ matrix elements of the real symmetric
matrix $D_q$. These coordinates obey the $(1/2)q(q + 1)$
algebraic constraints defined by

$$D_q^2 - \sigma Y^t Y = I_q$$
For SO(3)/SO(2) and SO(2,1)/SO(2), this condition
is determined from the matrix

$$\begin{pmatrix} D_2 & Y \\ \sigma Y^t & z \end{pmatrix}, \qquad
Y = \begin{pmatrix} x \\ y \end{pmatrix}$$

to be

$$z^2 - \sigma(x^2 + y^2) = 1$$

Tables 3 and 5 contain all the information required
to enumerate all the classical and exceptional Riemannian
symmetric spaces. All the classical Riemannian
symmetric spaces are tabulated in Table 6.


Table 6 All classical Riemannian symmetric spaces

Root space      Quotient                           Dimension          Rank        χ
A_{p+q-1}       SU(p, q)/S[U(p) ⊗ U(q)]            2pq                min(p, q)   1 - (p - q)^2
A_{n-1}         SL(n; R)/SO(n)                     (1/2)(n+2)(n-1)    n - 1       n - 1
A_{2n-1}        SU*(2n)/USp(2n)                    (2n+1)(n-1)        n - 1       -2n - 1
B_{(p+q-1)/2}   SO(p, q)/SO(p) ⊗ SO(q)             pq                 min(p, q)   pq - (1/2)p(p-1) - (1/2)q(q-1)
D_{(p+q)/2}     SO(p, q)/SO(p) ⊗ SO(q)             pq                 min(p, q)   pq - (1/2)p(p-1) - (1/2)q(q-1)
D_n             SO*(2n)/U(n)                       n(n-1)             [n/2]       -n
C_{p+q}         USp(2p, 2q)/USp(2p) ⊗ USp(2q)      4pq                min(p, q)   -2(p - q)^2 - (p + q)
C_n             Sp(2n; R)/U(n)                     n(n+1)             n           +n


For $\sigma = -1$, the space is the sphere $S^2$ defined by
$z^2 + (x^2 + y^2) = 1$. For $\sigma = +1$, the space is the
two-sheeted hyperboloid $H$ defined by $z^2 - (x^2 + y^2) = 1$.
More specifically, it is the upper sheet containing (0, 0, 1) of
the two-sheeted hyperboloid. The second sheet occurs
in the coset O(2,1)/SO(2). The symmetric spaces
SO(n + 1)/SO(n) and SO(n, 1)/SO(n) are the sphere
$S^n$ and the upper sheet of the two-sheeted hyperboloid
$H^n$. Both have dimension n and rank 1. The spaces
are simply connected, homogeneous, and isotropic.
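
The sphere and the hyperboloid can both be generated from the same exponential. The sketch below is a minimal illustration, assuming p = 2, q = 1 and a truncated Taylor series for the matrix exponential (adequate for small generators; the truncation order is an arbitrary choice). The names `expm_series` and `coset_point` are hypothetical helpers.

```python
import numpy as np

def expm_series(N, terms=40):
    """Matrix exponential by a truncated Taylor series (fine for small ||N||)."""
    result = np.eye(N.shape[0])
    term = np.eye(N.shape[0])
    for k in range(1, terms):
        term = term @ N / k
        result = result + term
    return result

def coset_point(x, y, sigma):
    """Exponentiate [[0, X], [sigma X^t, 0]] with X = (x, y)^t; return (Y, D_q)."""
    N = np.array([[0.0, 0.0, x],
                  [0.0, 0.0, y],
                  [sigma * x, sigma * y, 0.0]])
    M = expm_series(N)
    return M[0, 2], M[1, 2], M[2, 2]   # coordinates (x, y, z) on the surface

def constraint(x, y, sigma):
    """Left-hand side of z^2 - sigma (x^2 + y^2) = 1."""
    X, Y, Z = coset_point(x, y, sigma)
    return Z**2 - sigma * (X**2 + Y**2)
```

For any small input, `constraint(0.3, -0.7, +1)` and `constraint(0.3, -0.7, -1)` both return 1 up to rounding, the first point lying on the upper sheet of the hyperboloid and the second on the sphere.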

For SO(4, 2)/SO(4) × SO(2), the eight-dimensional
algebraic manifold is defined by the three constraints
in $R^{11}$:

$$\begin{pmatrix} y_9 & y_{10} \\ y_{10} & y_{11} \end{pmatrix}^2
- \sigma \begin{pmatrix} y_1 & y_2 & y_3 & y_4 \\ y_5 & y_6 & y_7 & y_8 \end{pmatrix}
\begin{pmatrix} y_1 & y_5 \\ y_2 & y_6 \\ y_3 & y_7 \\ y_4 & y_8 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

The compact analytically continued space
SO(6)/SO(4) × SO(2) is obtained by setting $\sigma = -1$.
These spaces have dimension 8 and rank 2. They are
homogeneous but not isotropic. For each, there are
"two inequivalent directions." There are two independent
Laplace-Beltrami operators on these spaces,
one quadratic and one quartic.

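The complete list of globally symmetric
pseudo-Riemannian symmetric spaces can be constructed
almost as easily. Two linear operators, $T_1$ and $T_2$,
are introduced that obey $T_1^2 = I$, $T_2^2 = I$, $T_1 T_2 =
T_2 T_1 \neq I$. The two are used to split $\mathfrak{g}$ into
subspaces

$$T_1 \mathfrak{g}_{\sigma\tau} = \sigma\, \mathfrak{g}_{\sigma\tau}, \qquad
T_2 \mathfrak{g}_{\sigma\tau} = \tau\, \mathfrak{g}_{\sigma\tau}$$

where $\sigma = \pm 1$, $\tau = \pm 1$. The decomposition and
double rotation

$$\begin{aligned}
\mathfrak{g} &= \mathfrak{g}_{++} + \mathfrak{g}_{+-} + \mathfrak{g}_{-+} + \mathfrak{g}_{--} \\
&\downarrow T_1 \\
\mathfrak{g}' &= \mathfrak{g}_{++} + \mathfrak{g}_{+-} + i(\mathfrak{g}_{-+} + \mathfrak{g}_{--}) \\
&\downarrow T_2 \\
\mathfrak{g}'' &= \mathfrak{g}_{++} + i\mathfrak{g}_{+-} + i(\mathfrak{g}_{-+} + i\mathfrak{g}_{--})
\end{aligned}$$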

generates a noncompact subgroup $K''$ as well as a
pseudo-Riemannian symmetric space $P''$:

$$K'' = \mathrm{EXP}(\mathfrak{g}_{++} + i\mathfrak{g}_{+-}), \qquad
P'' = \mathrm{EXP}(i\mathfrak{g}_{-+} + \mathfrak{g}_{--})$$

These have also been classified.

The simplest example of a pseudo-Riemannian
symmetric space is SO(2,1)/SO(1,1).
The metric-preserving condition $M^t I_{2,1} M = I_{2,1}$
leads to the constraint equation $z^2 + x^2 - y^2 = 1$.
This space is the single-sheeted hyperboloid $H^2$. It is
two dimensional and has rank 1, but it is not
isotropic. Intersections with the plane $x = 0$ are
hyperbolas and with the planes $y = \mathrm{const.}$ are circles.
This space is not simply connected.


Summary 


Lie groups are among the most powerful mathema- 
tical tools available to physicists. They play a major 
role in physics because they occur as transformation 
groups from coordinate system to coordinate system 
in real space (rotation group SO(3), Lorentz group 
O(3,1), Galilei group, Poincaré group ISO(3,1)) or
in spaces describing internal degrees of freedom 
(SU(2) for spin or isospin, SU(3) for quarks and 
color, SU(4) for spin-isospin, etc.). 

It is remarkable that a beautiful classification 
theory for simple (the building blocks) Lie groups 
exists, because of the rather amorphous nature of the 
definition of a Lie group. In a search for structure, 
the first step in the analysis of Lie groups is 
linearization of the group multiplication law in the 
neighborhood of the identity to a linear vector space 
on which there is a Lie algebra structure. This in itself 
is sufficient to create a strong connection to quantum 
mechanics. Although there is not a 1:1 correspon- 
dence between Lie groups and their Lie algebras, 
there is a very beautiful connection between them. 
This relates algebra (discrete invariant subgroups) 
and topology (homotopy groups) in an elegant way. 

The structure of Lie algebras is described using 
tools from linear algebra: secular equations and 
inner products. Together, these tools are used to 
reduce Lie algebras to their basic units: nilpotent 
and solvable invariant subalgebras, and semisimple 
and simple Lie algebras. The commutation relations 
for simple Lie algebras can be put into a canonical 
form using another miracle of this theory: a
positive-definite root space that summarizes the properties of
the secular equation and the Cartan-Killing inner
product. As the secular equation can only be solved
exactly over an algebraically closed field, the 
classification of simple Lie algebras covers complex 
Lie algebras. Each complex extension has several 
real forms, which are easily classified. 

Even more remarkable is the connection between 
simple Lie groups and Riemannian spaces that "look
the same everywhere." All Riemannian symmetric
spaces are quotients of a simple Lie group by a 
subgroup that is maximal in some precise sense 
(Cartan decomposition sense). Cartan was able to 
classify all Riemannian symmetric spaces as a 
consequence of his classification of all the real 
forms of all the simple Lie groups. The algebraic 
tools used to classify Lie algebras (secular equations, 
Dynkin diagrams) were used again to classify these 
spaces (Dynkin diagrams → Araki-Satake diagrams).
These spaces are classified by a root space, group- 
subgroup pair, dimension, rank, and character. 
Construction of invariant operators (Casimir invar- 
iants, Laplace-Beltrami operators) is algorithmic. 

Nonsemisimple Lie groups/algebras can be con- 
structed from simple Lie algebras by carefully 
introducing singular change of basis transforma- 
tions. This leads to "group contraction," not 
discussed above. In this way, the Poincaré group 
can be constructed systematically from the groups 
SO(3, 2) or SO(4, 1): SO(3, 2) → ISO(3, 1), SO(4, 1) →
ISO(3, 1) in the limit of "large R." Here, R is the
"radius" of some universe of hyperbolic nature, with
signature (3, 2) or (4, 1). The Galilei group can be
constructed by contraction from the Poincaré group in
the limit $c = 3 \times 10^{10}$ cm s$^{-1}$ → ∞.

We have not discussed here the theory of the 
representations of Lie groups. A beautiful theorem by 
Wigner and Stone guarantees that the tensor represen- 
tations of a compact group are complete. Gel'fand has 
given expressions for the complete set of tensor 
representations of the classical compact Lie groups. 
They are expressed by "dressing" the appropriate
Dynkin diagrams or else in terms of irreducible
representations of the symmetric group $S_n$. Gel'fand
has also given explicit, analytic, closed-form expressions
for the matrix elements of any of the shift
operators in any of these representations. For the
noncompact real forms, most of the unitary irreducible
representations can be obtained from these expressions
for matrix elements ("master analytic representation")
by appropriate analytic continuation.


Since Lie groups exist at the interface of algebra 
and topology, it is to be expected that there is a very 
close relation with the theory of special functions. In 
fact, the theory of special functions forms an 
important chapter in the theory of Lie groups. On
the topological side, the shift operators $E_\alpha$ (think $J_\pm$)
have coordinate representations $\langle x'|E_\alpha|x\rangle$ involving
first-order differential operators. On the algebraic
side, the matrix elements $\langle n'|E_\alpha|n\rangle$ are square roots
of products of integers (divided by products of
integers). These topological and algebraic expressions
are related to each other in a myriad of ways.
All of the standard properties of special functions
(Rodrigues formulas, recursion relations in coordinates
and indices, differential equations, generating
functions, etc.) occur in a systematic way in a
Lie-theoretic formulation of this subject.

Finally, no review or even book could do justice 
to the applications that Lie group theory finds in 
physics. 

The rich interplay that exists between freedom 
and rigidity of structure found in Lie group theory 
can be found in only the purest works of art — for 
example, the fugues of Bach. 


See also: Classical Groups and Homogeneous Spaces; 
Compact Groups and their Representations; Cosmology: 
Mathematical Aspects; Equivariant Cohomology and the 
Cartan Model; Finite-Type Invariants of 3-Manifolds; 
Functional Equations and Integrable Systems; Lie 
Superalgebras and Their Representations; Lie, 
Symplectic, and Poisson Groupoids and Their Lie 
Algebroids; Measure on Loop Spaces; Quasiperiodic 
Systems; Symmetry and Symplectic Reduction; 
Symmetry Classes in Random Matrix Theory; Toda 
Lattices.


Basic Definitions


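Let $\mathcal{A}$ be an algebra over a field $K$ of characteristic
zero (usually $K = R$ or $C$) with internal laws $+$ and $*$.
One sets $Z_2 = Z/2Z = \{\bar{0}, \bar{1}\}$. $\mathcal{A}$ is called a
superalgebra or $Z_2$-graded algebra if it can be written as
a direct sum of two spaces $\mathcal{A} = \mathcal{A}_{\bar{0}} \oplus \mathcal{A}_{\bar{1}}$, such that

$$\mathcal{A}_{\bar{0}} * \mathcal{A}_{\bar{0}} \subset \mathcal{A}_{\bar{0}}, \qquad
\mathcal{A}_{\bar{0}} * \mathcal{A}_{\bar{1}} \subset \mathcal{A}_{\bar{1}}, \qquad
\mathcal{A}_{\bar{1}} * \mathcal{A}_{\bar{1}} \subset \mathcal{A}_{\bar{0}}$$

Elements of $\mathcal{A}_{\bar{0}}$ are called even or of degree 0 while
elements of $\mathcal{A}_{\bar{1}}$ are called odd or of degree 1.
A superalgebra $\mathcal{A}$ is called associative if $(X * Y) *
Z = X * (Y * Z)$ for all $X, Y, Z \in \mathcal{A}$. It is called
commutative if $X * Y = (-1)^{\deg X \cdot \deg Y}\, Y * X$ for all
$X, Y \in \mathcal{A}$, where $\deg X$ is the degree of the element $X$.

A homomorphism $\Phi$ from a superalgebra $\mathcal{A}$ into a
superalgebra $\mathcal{A}'$ is a linear application from $\mathcal{A}$ into
$\mathcal{A}'$ which respects the $Z_2$-gradation, that is,
$\Phi(\mathcal{A}_{\bar{0}}) \subset \mathcal{A}'_{\bar{0}}$ and $\Phi(\mathcal{A}_{\bar{1}}) \subset \mathcal{A}'_{\bar{1}}$.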

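A Lie superalgebra $\mathcal{G}$ over a field $K$ of characteristic
zero (usually $K = R$ or $C$) is a superalgebra in
which the product, denoted $[\![\,\cdot,\cdot\,]\!]$, satisfies the
following properties:


$Z_2$-gradation

$$[\![ \mathcal{G}_i, \mathcal{G}_j ]\!] \subset \mathcal{G}_{i+j} \qquad (i, j \in Z_2)$$


Graded-antisymmetry

$$[\![ X_i, X_j ]\!] = -(-1)^{\deg X_i \cdot \deg X_j}\, [\![ X_j, X_i ]\!]$$


Generalized Jacobi identity

$$(-1)^{\deg X_i \cdot \deg X_k} [\![ X_i, [\![ X_j, X_k ]\!] ]\!]
+ (-1)^{\deg X_j \cdot \deg X_i} [\![ X_j, [\![ X_k, X_i ]\!] ]\!]
+ (-1)^{\deg X_k \cdot \deg X_j} [\![ X_k, [\![ X_i, X_j ]\!] ]\!] = 0$$


Note that $\mathcal{G}_{\bar{0}}$ is a Lie algebra, called the even or
bosonic part of $\mathcal{G}$, while $\mathcal{G}_{\bar{1}}$, called the odd or
fermionic part of $\mathcal{G}$, is not an algebra.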

An associative superalgebra $\mathcal{G} = \mathcal{G}_{\bar{0}} \oplus \mathcal{G}_{\bar{1}}$ over the
field $K$ acquires the structure of a Lie superalgebra by
taking for the product $[\![\,\cdot,\cdot\,]\!]$ of two elements $X, Y \in \mathcal{G}$
the Lie superbracket (also called supercommutator or
graded commutator)

$$[\![ X, Y ]\!] = X * Y - (-1)^{\deg X \cdot \deg Y}\, Y * X$$

The notation $[\![\,\cdot,\cdot\,]\!]$ for the supercommutator is used to
avoid confusion with the usual commutator $[X, Y] =
X * Y - Y * X$.
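
The supercommutator can be tried out on block matrices. The following is a minimal sketch, assuming the associative superalgebra of (m+n) × (m+n) real matrices with even elements block diagonal and odd elements block antidiagonal; the helper names are illustrative, and the random matrices are only test data. It evaluates the generalized Jacobi identity in the cyclic form quoted in the definition of a Lie superalgebra.

```python
import itertools
import numpy as np

def homogeneous(m, n, degree, rng):
    """Random even (degree 0) or odd (degree 1) block matrix of size m+n."""
    M = np.zeros((m + n, m + n))
    if degree == 0:
        M[:m, :m] = rng.normal(size=(m, m))
        M[m:, m:] = rng.normal(size=(n, n))
    else:
        M[:m, m:] = rng.normal(size=(m, n))
        M[m:, :m] = rng.normal(size=(n, m))
    return M

def sbracket(X, dx, Y, dy):
    """Supercommutator of homogeneous X, Y; returns (matrix, degree)."""
    return X @ Y - (-1) ** (dx * dy) * (Y @ X), (dx + dy) % 2

def jacobi_residual(X, dx, Y, dy, Z, dz):
    """Largest entry of the cyclic graded Jacobi sum (should vanish)."""
    def nested(A, da, B, db, C, dc):
        inner, d_inner = sbracket(B, db, C, dc)
        outer, _ = sbracket(A, da, inner, d_inner)
        return (-1) ** (da * dc) * outer
    total = (nested(X, dx, Y, dy, Z, dz)
             + nested(Y, dy, Z, dz, X, dx)
             + nested(Z, dz, X, dx, Y, dy))
    return float(np.max(np.abs(total)))

def max_jacobi_residual(m=2, n=1, seed=0):
    """Scan all eight degree assignments for one random triple each."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for dx, dy, dz in itertools.product((0, 1), repeat=3):
        X, Y, Z = (homogeneous(m, n, d, rng) for d in (dx, dy, dz))
        worst = max(worst, jacobi_residual(X, dx, Y, dy, Z, dz))
    return worst
```

Calling `max_jacobi_residual()` returns a number at the level of floating-point rounding, which is what the identity predicts for any homogeneous triple.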

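A Lie superalgebra $\mathcal{G}$ is $Z$-graded if it can be
written as a direct sum of finite-dimensional
$Z_2$-graded subspaces $\mathcal{G}_i$ such that

$$\mathcal{G} = \bigoplus_{i \in Z} \mathcal{G}_i, \qquad \text{where } [\![ \mathcal{G}_i, \mathcal{G}_j ]\!] \subset \mathcal{G}_{i+j}$$

The $Z$-gradation is said to be consistent with the
$Z_2$-gradation if

$$\mathcal{G}_{\bar{0}} = \bigoplus_{i \in Z} \mathcal{G}_{2i} \qquad \text{and} \qquad
\mathcal{G}_{\bar{1}} = \bigoplus_{i \in Z} \mathcal{G}_{2i+1}$$

It follows that $\mathcal{G}_0$ is a Lie subalgebra and that each
$\mathcal{G}_i$ ($i \neq 0$) is a $\mathcal{G}_0$-module.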

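A subalgebra $\mathcal{K} = \mathcal{K}_{\bar{0}} \oplus \mathcal{K}_{\bar{1}}$ of a Lie superalgebra $\mathcal{G}$
is a subset of elements of $\mathcal{G}$ which forms a vector
subspace of $\mathcal{G}$ that is closed with respect to the Lie
product of $\mathcal{G}$ such that $\mathcal{K}_{\bar{0}} \subset \mathcal{G}_{\bar{0}}$ and $\mathcal{K}_{\bar{1}} \subset \mathcal{G}_{\bar{1}}$.
A subalgebra $\mathcal{K}$ of $\mathcal{G}$ is called a proper subalgebra of
$\mathcal{G}$ if $\mathcal{K} \neq \mathcal{G}$. An ideal $\mathcal{I}$ of $\mathcal{G}$ is a subalgebra of $\mathcal{G}$ such
that $[\![ \mathcal{G}, \mathcal{I} ]\!] \subset \mathcal{I}$, that is, $X \in \mathcal{G}, Y \in \mathcal{I} \Rightarrow [\![ X, Y ]\!] \in \mathcal{I}$.
An ideal $\mathcal{I}$ of $\mathcal{G}$ is called a proper ideal of $\mathcal{G}$ if $\mathcal{I} \neq \mathcal{G}$.
If $\mathcal{I}$ and $\mathcal{I}'$ are two ideals of $\mathcal{G}$, $[\![ \mathcal{I}, \mathcal{I}' ]\!]$ is an ideal of $\mathcal{G}$.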

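The definitions of the centralizer, the center, and
the normalizer of a Lie superalgebra follow those of
a Lie algebra. Let $\mathcal{S}$ be a subset of elements in the
Lie superalgebra $\mathcal{G}$. The centralizer $C_{\mathcal{G}}(\mathcal{S})$ is the
subset of $\mathcal{G}$ given by

$$C_{\mathcal{G}}(\mathcal{S}) = \{ X \in \mathcal{G} \mid [\![ X, Y ]\!] = 0, \ \forall\, Y \in \mathcal{S} \}$$

The center $Z(\mathcal{G})$ of $\mathcal{G}$ is the set of elements of $\mathcal{G}$
which commute with any element of $\mathcal{G}$ (in other
words, it is the centralizer of $\mathcal{G}$ in $\mathcal{G}$):

$$Z(\mathcal{G}) = \{ X \in \mathcal{G} \mid [\![ X, Y ]\!] = 0, \ \forall\, Y \in \mathcal{G} \}$$

The normalizer $N_{\mathcal{G}}(\mathcal{S})$ is the subset of $\mathcal{G}$ given by

$$N_{\mathcal{G}}(\mathcal{S}) = \{ X \in \mathcal{G} \mid [\![ X, Y ]\!] \in \mathcal{S}, \ \forall\, Y \in \mathcal{S} \}$$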

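The Lie superalgebra $\mathcal{G}$ is said to be nilpotent if,
considering the series $[\![ \mathcal{G}, \mathcal{G}^{[i-1]} ]\!] = \mathcal{G}^{[i]}$ with $\mathcal{G}^{[0]} = \mathcal{G}$,
there exists an integer $n$ such that $\mathcal{G}^{[n]} = \{0\}$.

The Lie superalgebra $\mathcal{G}$ is said to be solvable if,
considering the series $[\![ \mathcal{G}^{(i-1)}, \mathcal{G}^{(i-1)} ]\!] = \mathcal{G}^{(i)}$ with $\mathcal{G}^{(0)} = \mathcal{G}$,
there exists an integer $n$ such that $\mathcal{G}^{(n)} = \{0\}$. A
Lie superalgebra $\mathcal{G}$ is solvable if and only if $\mathcal{G}_{\bar{0}}$ is
solvable.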

Let $\mathcal{G}$ be a noncommutative Lie superalgebra.
The Lie superalgebra $\mathcal{G}$ is called simple if it does
not contain any nontrivial ideal. The Lie superalgebra
$\mathcal{G}$ is called semisimple if it does not
contain any nontrivial solvable ideal. Let us
recall that if $\mathcal{A}$ is a semisimple Lie algebra, it
can be written as the direct sum of simple Lie
algebras $\mathcal{A}_i$: $\mathcal{A} = \bigoplus_i \mathcal{A}_i$. This is not the case for
superalgebras.

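Let $\mathcal{G} = \mathcal{G}_{\bar{0}} \oplus \mathcal{G}_{\bar{1}}$ be a Lie superalgebra and
$\mathcal{V} = \mathcal{V}_{\bar{0}} \oplus \mathcal{V}_{\bar{1}}$ be a $Z_2$-graded vector space. Consider
the algebra $\mathrm{End}\,\mathcal{V}$ of endomorphisms of $\mathcal{V}$,
which naturally acquires a superalgebra structure
by $\mathrm{End}\,\mathcal{V} = \mathrm{End}_{\bar{0}}\mathcal{V} \oplus \mathrm{End}_{\bar{1}}\mathcal{V}$, where
$\mathrm{End}_i \mathcal{V} = \{ \phi \in \mathrm{End}\,\mathcal{V} \mid \phi(\mathcal{V}_j) \subset \mathcal{V}_{i+j} \}$.
A linear representation $\pi$ of $\mathcal{G}$
is a homomorphism of $\mathcal{G}$ into $\mathrm{End}\,\mathcal{V}$, that is,

$$\begin{aligned}
&\pi(\alpha X + \beta Y) = \alpha \pi(X) + \beta \pi(Y) \\
&\pi([\![ X, Y ]\!]) = [\![ \pi(X), \pi(Y) ]\!] \\
&\pi(\mathcal{G}_{\bar{0}}) \subset \mathrm{End}_{\bar{0}}\mathcal{V} \quad \text{and} \quad \pi(\mathcal{G}_{\bar{1}}) \subset \mathrm{End}_{\bar{1}}\mathcal{V}
\end{aligned}$$

for all $X, Y \in \mathcal{G}$ and $\alpha, \beta \in C$. The vector space $\mathcal{V}$ is the
representation space. The vector space $\mathcal{V}$ has the
structure of a $\mathcal{G}$-module by $X(v) = \pi(X)v$ for $X \in \mathcal{G}$
and $v \in \mathcal{V}$. The dimension (resp. superdimension) of the
representation $\pi$ is the dimension (resp. graded dimension)
of the vector space $\mathcal{V}$: $\dim \pi = \dim \mathcal{V}_{\bar{0}} + \dim \mathcal{V}_{\bar{1}}$
and $\mathrm{sdim}\, \pi = \dim \mathcal{V}_{\bar{0}} - \dim \mathcal{V}_{\bar{1}}$. In particular, the
representation $\mathrm{ad}: \mathcal{G} \to \mathrm{End}\,\mathcal{G}$ ($\mathcal{G}$ being considered as a
$Z_2$-graded vector space) such that $\mathrm{ad}(X)Y = [\![ X, Y ]\!]$
is called the adjoint representation of $\mathcal{G}$.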

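In the basis $(e_1, \ldots, e_m, e_{m+1}, \ldots, e_{m+n})$ of $\mathcal{V} =
\mathcal{V}_{\bar{0}} \oplus \mathcal{V}_{\bar{1}}$ (called homogeneous basis), where $\dim \mathcal{V}_{\bar{0}} = m$
and $\dim \mathcal{V}_{\bar{1}} = n$, an element of $\mathcal{G}$ is represented by the
matrix

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$

where $A$, $B$, $C$, and $D$ are $m \times m$, $m \times n$, $n \times m$, and
$n \times n$ matrices, respectively. Even elements correspond
to block diagonal matrices (i.e., $B = C = 0$),
odd elements to block antidiagonal matrices (i.e.,
$A = D = 0$). One defines the supertrace function
denoted by str:

$$\mathrm{str}(M) = \mathrm{tr}(A) - \mathrm{tr}(D)$$

To a given representation $\pi$ of $\mathcal{G}$, one can associate a
bilinear form $B_\pi$ on $\mathcal{G}$ as

$$B_\pi(X, Y) = \mathrm{str}(\pi(X)\pi(Y)), \qquad \forall\, X, Y \in \mathcal{G}$$

$\pi(X)$ are the matrices of the generators $X$ in the
representation $\pi$ and str denotes the supertrace. A
bilinear form $B$ on $\mathcal{G}$ is called


1. consistent if $B(X, Y) = 0$ for all $X \in \mathcal{G}_{\bar{0}}$ and all
   $Y \in \mathcal{G}_{\bar{1}}$;
2. supersymmetric if, for all $X, Y \in \mathcal{G}$,
   $B(X, Y) = (-1)^{\deg X \cdot \deg Y} B(Y, X)$;
3. invariant if, for all $X, Y, Z \in \mathcal{G}$,
   $B([\![ X, Y ]\!], Z) = B(X, [\![ Y, Z ]\!])$.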


The bilinear form associated to the adjoint repre- 
sentation of G is called the Killing form on 
G: K(X, Y) =str(ad(X)ad(Y)). It is consistent, super- 
symmetric, and invariant. 
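
The consistency and supersymmetry properties can be illustrated with the supertrace form. The sketch below is a minimal example under one stated assumption: it uses the defining (m+n)-dimensional matrix representation of gl(m|n) rather than the adjoint representation, so it exercises $B_\pi$ for that choice of $\pi$ and not the Killing form itself. The function names are hypothetical.

```python
import numpy as np

def supertrace(M, m):
    """str(M) = tr(A) - tr(D) for M = [[A, B], [C, D]] with A of size m x m."""
    return np.trace(M[:m, :m]) - np.trace(M[m:, m:])

def random_homogeneous(m, n, degree, rng):
    """Random even (block diagonal) or odd (block antidiagonal) element."""
    M = rng.normal(size=(m + n, m + n))
    mask = np.zeros((m + n, m + n), dtype=bool)
    mask[:m, :m] = True
    mask[m:, m:] = True
    return np.where(mask if degree == 0 else ~mask, M, 0.0)

def form(X, Y, m):
    """B(X, Y) = str(X Y) in the defining representation."""
    return supertrace(X @ Y, m)

rng = np.random.default_rng(1)
m, n = 2, 3
E1, E2 = (random_homogeneous(m, n, 0, rng) for _ in range(2))
O1, O2 = (random_homogeneous(m, n, 1, rng) for _ in range(2))
```

With these samples, `form(E1, O1, m)` vanishes (consistency), `form(E1, E2, m)` equals `form(E2, E1, m)`, and `form(O1, O2, m)` equals minus `form(O2, O1, m)`, which is the sign pattern required of a supersymmetric form.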


Classification of Simple Lie 
Superalgebras 


The simple Lie superalgebras have been classified by 
V G Kac. One distinguishes two general families: the 
classical Lie superalgebras and the Cartan type 
superalgebras. 


Classical Lie Superalgebras 


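A simple Lie superalgebra $\mathcal{G} = \mathcal{G}_{\bar{0}} \oplus \mathcal{G}_{\bar{1}}$ is called
classical if the representation of the even subalgebra
$\mathcal{G}_{\bar{0}}$ on the odd part $\mathcal{G}_{\bar{1}}$ is completely reducible. The
superalgebra is said to be of type I if the representation
of $\mathcal{G}_{\bar{0}}$ on $\mathcal{G}_{\bar{1}}$ is the direct sum of two irreducible
representations of $\mathcal{G}_{\bar{0}}$. In that case, one has
$\mathcal{G}_{\bar{1}} = \mathcal{G}_{-1} \oplus \mathcal{G}_1$ with

$$[\![ \mathcal{G}_{-1}, \mathcal{G}_1 ]\!] = \mathcal{G}_{\bar{0}} \qquad \text{and} \qquad
[\![ \mathcal{G}_{\pm 1}, \mathcal{G}_{\pm 1} ]\!] = 0$$

The superalgebra is said to be of type II if the
representation of $\mathcal{G}_{\bar{0}}$ on $\mathcal{G}_{\bar{1}}$ is irreducible.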

A classical Lie superalgebra $\mathcal{G}$ is called basic if
there exists a nondegenerate invariant bilinear form
on $\mathcal{G}$. The basic Lie superalgebras split into four
infinite families: A(m, n) or sl(m + 1|n + 1) for m ≠ n
and A(n, n) or sl(n + 1|n + 1)/Z = psl(n + 1|n + 1),
where Z is a one-dimensional center, for m = n
(unitary series), B(m, n) or osp(2m + 1|2n), C(n + 1) or
osp(2|2n), D(m, n) or osp(2m|2n) (orthosymplectic
series); and three exceptional superalgebras F(4),
G(3), and D(2, 1; α), the last one being actually a
one-parameter family of superalgebras. The classical
Lie superalgebras which are not basic are called
strange, and correspond to two infinite families
denoted by P(n) and Q(n).

A basic Lie superalgebra $\mathcal{G} = \mathcal{G}_{\bar{0}} \oplus \mathcal{G}_{\bar{1}}$ admits a
consistent $Z$-gradation $\mathcal{G} = \bigoplus_{i \in Z} \mathcal{G}_i$ (called
distinguished), such that (see Tables 1 and 2)


• for superalgebras of type I, $\mathcal{G}_i = 0$ for $|i| > 1$ and
  $\mathcal{G}_{\bar{0}} = \mathcal{G}_0$, $\mathcal{G}_{\bar{1}} = \mathcal{G}_{-1} \oplus \mathcal{G}_1$ and

• for superalgebras of type II, $\mathcal{G}_i = 0$ for $|i| > 2$ and
  $\mathcal{G}_{\bar{0}} = \mathcal{G}_{-2} \oplus \mathcal{G}_0 \oplus \mathcal{G}_2$, $\mathcal{G}_{\bar{1}} = \mathcal{G}_{-1} \oplus \mathcal{G}_1$.

Cartan Type Superalgebras 


The Cartan type Lie superalgebras are the simple Lie
superalgebras in which the representation of the
even subalgebra on the odd part is not completely
reducible. They are classified into four infinite
families called W(n) with n ≥ 2, S(n) with n ≥ 3,
$\tilde{S}(n)$, and H(n) with n ≥ 4. S(n) and $\tilde{S}(n)$ are called
special Cartan type Lie superalgebras and H(n)
Hamiltonian Cartan type Lie superalgebras.


Table 1 $Z_2$-gradation of the classical Lie superalgebras

Superalgebra G     G_{\bar 0}                     G_{\bar 1}
A(m-1, n-1)        A_{m-1} ⊕ A_{n-1} ⊕ U(1)       (m, \bar{n}) ⊕ (\bar{m}, n)
A(n-1, n-1)        A_{n-1} ⊕ A_{n-1}              (n, \bar{n}) ⊕ (\bar{n}, n)
C(n+1)             C_n ⊕ U(1)                     (2n) ⊕ (2n)
B(m, n)            B_m ⊕ C_n                      (2m+1, 2n)
D(m, n)            D_m ⊕ C_n                      (2m, 2n)
F(4)               A_1 ⊕ B_3                      (2, 8)
G(3)               A_1 ⊕ G_2                      (2, 7)
D(2, 1; α)         A_1 ⊕ A_1 ⊕ A_1                (2, 2, 2)
P(n)               A_n                            [2] ⊕ [1^{n-1}]
Q(n)               A_n                            ad(A_n)


Classical Lie Superalgebras 


The classical Lie superalgebras are described as matrix
superalgebras as follows. Let $\mathcal{V} = \mathcal{V}_{\bar{0}} \oplus \mathcal{V}_{\bar{1}}$ be a
$Z_2$-graded vector space, with $\dim \mathcal{V}_{\bar{0}} = m$, $\dim \mathcal{V}_{\bar{1}} = n$.
The Lie superalgebra gl(m|n) is defined as the superalgebra
$\mathrm{End}\,\mathcal{V} = \mathrm{End}_{\bar{0}}\mathcal{V} \oplus \mathrm{End}_{\bar{1}}\mathcal{V}$ supplied with the
Lie superbracket.

The unitary superalgebra A(m - 1, n - 1) = sl(m|n)
is defined as the superalgebra of matrices $M \in$ gl(m|n)
satisfying the supertrace condition str(M) = 0. In the
case m = n, sl(n|n) contains a one-dimensional ideal $\mathcal{I}$
generated by the identity matrix $1_{2n}$ and one sets
A(n - 1, n - 1) = sl(n|n)/$\mathcal{I}$ = psl(n|n).

The orthosymplectic superalgebra osp(m|2n) is
defined as the superalgebra of matrices $M \in$ gl(m|2n)
satisfying the conditions

$$A^t = -A, \qquad D^t G = -G D, \qquad B = C^t G$$

where $t$ denotes the usual transposition and the
matrix $G$ is given by

$$G = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$$


The strange superalgebra P(n) is defined as the
superalgebra of matrices $M \in$ gl(n|n) satisfying the
conditions

$$A^t = -D, \qquad B^t = B, \qquad C^t = -C, \qquad \mathrm{tr}(A) = 0$$

The strange superalgebra $\tilde{Q}(n)$ is defined as the
superalgebra of matrices $M \in$ gl(n|n) satisfying the
conditions

$$A = D, \qquad B = C, \qquad \mathrm{tr}(B) = 0$$

The superalgebra $\tilde{Q}(n)$ has a one-dimensional center
$\mathcal{Z}$. The simple superalgebra Q(n) is given by
$Q(n) = \tilde{Q}(n)/\mathcal{Z}$.


Structure of the Classical Lie 
Superalgebras 


Let $\mathcal{G} = \mathcal{G}_{\bar{0}} \oplus \mathcal{G}_{\bar{1}}$ be a classical Lie superalgebra. A
Cartan subalgebra $\mathcal{H}$ of $\mathcal{G}$ is defined as a Cartan
subalgebra of $\mathcal{G}_{\bar{0}}$, that is, the maximal nilpotent
subalgebra of $\mathcal{G}_{\bar{0}}$ coinciding with its own normalizer:
$\mathcal{H} = \{ X \in \mathcal{G}_{\bar{0}} \mid [X, \mathcal{H}] \subseteq \mathcal{H} \}$. It follows that the
Cartan subalgebras of a Lie superalgebra are
conjugate since the Cartan subalgebras of a Lie
algebra are conjugate and any inner automorphism
of the even part $\mathcal{G}_{\bar{0}}$ can be extended to an inner
automorphism of $\mathcal{G}$; hence, they all have the
same dimension. By definition, the dimension of
a Cartan subalgebra $\mathcal{H}$ is the rank of $\mathcal{G}$:
$\mathrm{rank}\,\mathcal{G} = \dim \mathcal{H}$.

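A classical Lie superalgebra $\mathcal{G}$ with Cartan
subalgebra $\mathcal{H}$ can be decomposed as $\mathcal{G} = \bigoplus_\alpha \mathcal{G}_\alpha$
($\mathcal{H}^*$ is the dual of $\mathcal{H}$), where

$$\mathcal{G}_\alpha = \{ x \in \mathcal{G} \mid [h, x] = \alpha(h)\, x, \ h \in \mathcal{H} \}$$

The set $\Delta \subset \mathcal{H}^*$

$$\Delta = \{ \alpha \in \mathcal{H}^* \mid \mathcal{G}_\alpha \neq 0 \}$$

is by definition the root system of $\mathcal{G}$. A root $\alpha$ is
called even (resp. odd) if $\mathcal{G}_\alpha \cap \mathcal{G}_{\bar{0}} \neq \emptyset$ (resp.
$\mathcal{G}_\alpha \cap \mathcal{G}_{\bar{1}} \neq \emptyset$). The set of even roots $\Delta_{\bar{0}}$ is the root
system of the even part $\mathcal{G}_{\bar{0}}$ of $\mathcal{G}$. The set of odd roots
$\Delta_{\bar{1}}$ is the weight system of the representation of $\mathcal{G}_{\bar{0}}$
in $\mathcal{G}_{\bar{1}}$. One has $\Delta = \Delta_{\bar{0}} \cup \Delta_{\bar{1}}$. A root can be both
even and odd (however this only occurs in the case
of the superalgebra Q(n)). The vector space spanned
by all the possible roots is called the root space. It is
the dual $\mathcal{H}^*$ of the Cartan subalgebra $\mathcal{H}$ as vector
space.


Table 2 $Z$-gradation of the classical basic Lie superalgebras

Superalgebra G    G_0                           G_1 ⊕ G_{-1}                          G_2 ⊕ G_{-2}
A(m-1, n-1)       A_{m-1} ⊕ A_{n-1} ⊕ U(1)      (m, \bar{n}) ⊕ (\bar{m}, n)
A(n-1, n-1)       A_{n-1} ⊕ A_{n-1}             (n, \bar{n}) ⊕ (\bar{n}, n)
C(n+1)            C_n ⊕ U(1)                    (2n)_+ ⊕ (2n)_-
B(m, n)           B_m ⊕ A_{n-1} ⊕ U(1)          (2m+1, n) ⊕ (2m+1, \bar{n})           [2] ⊕ [2^{n-1}]
D(m, n)           D_m ⊕ A_{n-1} ⊕ U(1)          (2m, n) ⊕ (2m, \bar{n})               [2] ⊕ [2^{n-1}]
F(4)              B_3 ⊕ U(1)                    8_+ ⊕ 8_-                             1_+ ⊕ 1_-
G(3)              G_2 ⊕ U(1)                    7_+ ⊕ 7_-                             1_+ ⊕ 1_-
D(2, 1; α)        A_1 ⊕ A_1 ⊕ U(1)              (2, 2)_+ ⊕ (2, 2)_-                   1_+ ⊕ 1_-


308 Lie Superalgebras and Their Representations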

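Except for A(1, 1), P(n), and Q(n), using a nondegenerate
invariant bilinear form $B$ on the superalgebra
$\mathcal{G}$, one can define a bilinear form $(\cdot,\cdot)$
on the root space $\mathcal{H}^*$ by $(\alpha_i, \alpha_j) = B(H_i, H_j)$, where
the $H_i$ form a basis of $\mathcal{H}$. The following properties
hold:


1. $\mathcal{G}_{(\alpha = 0)} = \mathcal{H}$ except for Q(n).
2. $\dim \mathcal{G}_\alpha = 1$ when $\alpha \neq 0$ except for A(1, 1), P(2),
   P(3), and Q(n).
3. Except for A(1, 1), P(n), Q(n), one has
   (a) $[\![ \mathcal{G}_\alpha, \mathcal{G}_\beta ]\!] \neq 0$ if and only if $\alpha, \beta, \alpha + \beta \in \Delta$,
   (b) $(\mathcal{G}_\alpha, \mathcal{G}_\beta) = 0$ for $\alpha + \beta \neq 0$,
   (c) if $\alpha \in \Delta$ (resp. $\Delta_{\bar{0}}$, $\Delta_{\bar{1}}$), then $-\alpha \in \Delta$ (resp.
       $\Delta_{\bar{0}}$, $\Delta_{\bar{1}}$) and
   (d) $\alpha \in \Delta \Rightarrow 2\alpha \in \Delta$ if and only if $\alpha \in \Delta_{\bar{1}}$ and
       $(\alpha, \alpha) \neq 0$.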


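In the rest of this section, we restrict to the case
of a basic Lie superalgebra $\mathcal{G}$ of rank $r$, with Cartan
subalgebra $\mathcal{H}$ and root system $\Delta = \Delta_{\bar{0}} \cup \Delta_{\bar{1}}$. Then $\mathcal{G}$
admits a Borel decomposition $\mathcal{G} = \mathcal{N}^+ \oplus \mathcal{H} \oplus \mathcal{N}^-$,
where $\mathcal{N}^\pm$ are subalgebras such that $[\![ \mathcal{H}, \mathcal{N}^\pm ]\!] \subset \mathcal{N}^\pm$
with $\dim \mathcal{N}^+ = \dim \mathcal{N}^-$. If $\mathcal{G} = \mathcal{H} \bigoplus_\alpha \mathcal{G}_\alpha$ is the root
decomposition of $\mathcal{G}$, a root $\alpha$ is called positive if
$\mathcal{G}_\alpha \cap \mathcal{N}^+ \neq \emptyset$ and negative if $\mathcal{G}_\alpha \cap \mathcal{N}^- \neq \emptyset$. A root is
called simple if it cannot be decomposed into a sum
of positive roots. The set of all simple roots is
called a simple root system of $\mathcal{G}$ and is denoted here
by $\Delta^0$. The set $\mathcal{B} = \mathcal{H} \oplus \mathcal{N}^+$ is called a Borel
subalgebra of $\mathcal{G}$. Such a Borel subalgebra is solvable
but not maximal solvable. Indeed, adding to $\mathcal{B}$ a
negative simple isotropic root generator (i.e., a
generator associated to an odd root of zero length),
the obtained subalgebra is still solvable since the
superalgebra sl(1|1) is solvable. However, $\mathcal{B}$ contains
a maximal solvable subalgebra $\mathcal{B}_{\bar{0}}$ of the even
part $\mathcal{G}_{\bar{0}}$.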

In general, for a basic Lie superalgebra G, there 
are many inequivalent classes of conjugacy of Borel 
subalgebras (while for the simple Lie algebras, all 
Borel subalgebras are conjugate). 

To each class of conjugacy of Borel subalgebras of
$\mathcal{G}$ is associated a simple root system $\Delta^0$. Hence,
contrary to the Lie algebra case, to a given basic
Lie superalgebra $\mathcal{G}$ will be associated in general
many inequivalent simple root systems, up to a
transformation of the Weyl group $W(\mathcal{G})$ of $\mathcal{G}$ (the
Weyl group of a basic Lie superalgebra being
generated by the Weyl reflections with respect to
the even roots; under a transformation of $W(\mathcal{G})$, a
simple root system will be transformed into an
equivalent one with the same Dynkin diagram). The
generalization of the Weyl group for a basic Lie
superalgebra $\mathcal{G}$ gives a method for constructing all
the simple root systems of $\mathcal{G}$ and hence all the
inequivalent Dynkin diagrams of $\mathcal{G}$. For $\alpha \in \Delta_{\bar{1}}$, one
defines


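$$w_\alpha(\beta) = \beta - 2\,\frac{(\alpha,\beta)}{(\alpha,\alpha)}\,\alpha \quad \text{if } (\alpha,\alpha) \neq 0$$

$$w_\alpha(\beta) = \beta + \alpha \quad \text{if } (\alpha,\alpha) = 0,\ (\alpha,\beta) \neq 0$$

$$w_\alpha(\beta) = \beta \quad \text{if } (\alpha,\alpha) = (\alpha,\beta) = 0$$

$$w_\alpha(\alpha) = -\alpha$$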


Note that the transformation associated to an odd root $\alpha$ of zero length cannot be lifted to an automorphism of the superalgebra since $w_\alpha$ transforms even roots into odd ones, and vice versa, and the $\mathbb{Z}_2$-gradation would not be respected. A simple root system $\Delta^0$ being given, from any root $\alpha \in \Delta^0$ such that $(\alpha,\alpha) = 0$, one constructs the simple root system $w_\alpha(\Delta^0)$, where $w_\alpha$ is the generalized Weyl reflection with respect to $\alpha$, and one repeats the procedure on the obtained system until no new basis arises.
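
The generalized reflection rules can be sketched in Python on the smallest example, $sl(2|1)$. This is only an illustration under stated assumptions: roots are written as coefficient tuples on a basis $(\varepsilon_1, \varepsilon_2, \delta_1)$ with $(\varepsilon_i,\varepsilon_j) = \delta_{ij}$ and $(\delta_1,\delta_1) = -1$ (a common normalization that the text does not fix), and the names `form` and `generalized_reflection` are hypothetical.

```python
from fractions import Fraction

# Assumed coordinates: a root is a tuple of coefficients on (eps1, eps2, delta1),
# with (eps_i, eps_j) = delta_ij and (delta1, delta1) = -1.
SIGNATURE = (1, 1, -1)

def form(a, b):
    """Bilinear form on the root space in the assumed basis."""
    return sum(s * x * y for s, x, y in zip(SIGNATURE, a, b))

def generalized_reflection(alpha, beta):
    """Image of beta under the generalized Weyl reflection w_alpha."""
    if beta == alpha:
        return tuple(-x for x in alpha)          # w_alpha(alpha) = -alpha
    aa, ab = form(alpha, alpha), form(alpha, beta)
    if aa != 0:                                   # ordinary reflection
        coeff = Fraction(2 * ab, aa)
        return tuple(b - coeff * a for a, b in zip(alpha, beta))
    if ab != 0:                                   # isotropic root, (alpha, beta) != 0
        return tuple(a + b for a, b in zip(alpha, beta))
    return beta                                   # isotropic root, (alpha, beta) = 0

# Distinguished simple roots of sl(2|1): delta1 - eps1 (odd), eps1 - eps2 (even).
alpha_odd = (-1, 0, 1)
alpha_even = (1, -1, 0)
new_system = [generalized_reflection(alpha_odd, root)
              for root in (alpha_odd, alpha_even)]
```

Under these assumptions, reflecting the distinguished system with respect to its isotropic odd root yields a simple root system in which both roots are odd and of zero length, which is an inequivalent system in the sense described above.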

In the set of all inequivalent simple root systems of a basic Lie superalgebra, there is one simple root system that plays a particular role, the distinguished simple root system, for which the number of odd roots is equal to one, constructed as follows. Consider the distinguished $\mathbb{Z}$-gradation of $\mathcal{G}$, $\mathcal{G} = \bigoplus_{i \in \mathbb{Z}} \mathcal{G}_i$. The even simple roots are given by the simple root system of the Lie subalgebra $\mathcal{G}_0$ and the odd simple root is the lowest weight of the representation $\mathcal{G}_1$ of $\mathcal{G}_0$. See Table 3 for the root systems and Table 4 for the distinguished simple root systems of the basic Lie superalgebras.

Let $\Delta^0 = (\alpha_1, \ldots, \alpha_r)$ be a simple root system of $\mathcal{G}$, such that $(\alpha_i, \alpha_j) \in \mathbb{Z}$ and $\min |(\alpha_i, \alpha_j)| = 1$ if $(\alpha_i, \alpha_j) \neq 0$. Then one defines the symmetric Cartan matrix $a$ with integer entries as $a_{ij} = (\alpha_i, \alpha_j)$. One associates to $\Delta^0$ a Dynkin diagram according to the following rules:


1. One associates to each simple even root a white dot, to each simple odd root of nonzero length ($a_{ii} \neq 0$) a black dot, and to each simple odd root of zero length ($a_{ii} = 0$) a gray dot.


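Table 3 Root systems $\Delta_{\bar 0}$, $\Delta_{\bar 1}$ of the basic Lie superalgebras

Superalgebra $\mathcal{G}$    $\Delta_{\bar 0}$    $\Delta_{\bar 1}$

$A(m-1,n-1)$    $\varepsilon_i - \varepsilon_j$, $\delta_k - \delta_l$    $\pm(\varepsilon_i - \delta_k)$
$B(m,n)$    $\pm\varepsilon_i \pm \varepsilon_j$, $\pm\varepsilon_i$, $\pm\delta_k \pm \delta_l$, $\pm 2\delta_k$    $\pm\varepsilon_i \pm \delta_k$, $\pm\delta_k$
$B(0,n)$    $\pm\delta_k \pm \delta_l$, $\pm 2\delta_k$    $\pm\delta_k$
$C(n+1)$    $\pm\delta_k \pm \delta_l$, $\pm 2\delta_k$    $\pm\varepsilon \pm \delta_k$
$D(m,n)$    $\pm\varepsilon_i \pm \varepsilon_j$, $\pm\delta_k \pm \delta_l$, $\pm 2\delta_k$    $\pm\varepsilon_i \pm \delta_k$
$F(4)$    $\pm\delta$, $\pm\varepsilon_i \pm \varepsilon_j$, $\pm\varepsilon_i$    $\frac{1}{2}(\pm\varepsilon_1 \pm \varepsilon_2 \pm \varepsilon_3 \pm \delta)$
$G(3)$    $\pm 2\delta$, $\pm\varepsilon_i$, $\varepsilon_i - \varepsilon_j$    $\pm\delta$, $\pm\varepsilon_i \pm \delta$
$D(2,1;\alpha)$    $\pm 2\varepsilon_i$    $\pm\varepsilon_1 \pm \varepsilon_2 \pm \varepsilon_3$

$1 \le i \neq j \le m$, $1 \le k \neq l \le n$ for $A(m-1,n-1)$, $B(m,n)$, $C(n+1)$, $D(m,n)$. $1 \le i \neq j \le 3$ for $F(4)$, $G(3)$, $D(2,1;\alpha)$, with $\varepsilon_1 + \varepsilon_2 + \varepsilon_3 = 0$ in the case of $G(3)$. For $A(n-1,n-1)$, one has to add the condition $\varepsilon_1 + \cdots + \varepsilon_n = \delta_1 + \cdots + \delta_n$.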

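Table 4 Distinguished simple root systems of the basic Lie superalgebras

Superalgebra $\mathcal{G}$    Distinguished simple root system $\Delta^0$

$A(m-1,n-1)$    $\delta_1 - \delta_2, \ldots, \delta_{n-1} - \delta_n, \delta_n - \varepsilon_1, \varepsilon_1 - \varepsilon_2, \ldots, \varepsilon_{m-1} - \varepsilon_m$
$B(m,n)$    $\delta_1 - \delta_2, \ldots, \delta_{n-1} - \delta_n, \delta_n - \varepsilon_1, \varepsilon_1 - \varepsilon_2, \ldots, \varepsilon_{m-1} - \varepsilon_m, \varepsilon_m$
$B(0,n)$    $\delta_1 - \delta_2, \ldots, \delta_{n-1} - \delta_n, \delta_n$
$C(n+1)$    $\varepsilon - \delta_1, \delta_1 - \delta_2, \ldots, \delta_{n-1} - \delta_n, 2\delta_n$
$D(m,n)$    $\delta_1 - \delta_2, \ldots, \delta_{n-1} - \delta_n, \delta_n - \varepsilon_1, \varepsilon_1 - \varepsilon_2, \ldots, \varepsilon_{m-1} - \varepsilon_m, \varepsilon_{m-1} + \varepsilon_m$
$F(4)$    $\frac{1}{2}(\delta - \varepsilon_1 - \varepsilon_2 - \varepsilon_3), \varepsilon_3, \varepsilon_2 - \varepsilon_3, \varepsilon_1 - \varepsilon_2$
$G(3)$    $\delta + \varepsilon_3, \varepsilon_1, \varepsilon_2 - \varepsilon_1$
$D(2,1;\alpha)$    $\varepsilon_1 - \varepsilon_2 - \varepsilon_3, 2\varepsilon_2, 2\varepsilon_3$


2. The $i$th and $j$th dots are joined by $\eta_{ij}$ lines where

$$\eta_{ij} = \frac{2|a_{ij}|}{\min(|a_{ii}|, |a_{jj}|)} \quad \text{if } a_{ii} a_{jj} \neq 0$$

$$\eta_{ij} = \frac{2|a_{ij}|}{\min(|a_{ii}|, 2)} \quad \text{if } a_{ii} \neq 0 \text{ and } a_{jj} = 0$$

$$\eta_{ij} = |a_{ij}| \quad \text{if } a_{ii} = a_{jj} = 0$$

3. We add an arrow on the lines connecting the $i$th and $j$th dots when $\eta_{ij} > 1$, pointing from $i$ to $j$ if $a_{ii} a_{jj} \neq 0$ and $|a_{ii}| > |a_{jj}|$ or if $a_{ii} = 0$, $a_{jj} \neq 0$, $|a_{jj}| < 2$, and pointing from $j$ to $i$ if $a_{ii} = 0$, $a_{jj} \neq 0$, $|a_{jj}| > 2$.

4. For $D(2,1;\alpha)$, $\eta_{ij} = 1$ if $a_{ij} \neq 0$ and $\eta_{ij} = 0$ if $a_{ij} = 0$. No arrow is put on the Dynkin diagram.


The distinguished Dynkin diagrams of the basic Lie superalgebras are listed in Table 5.


Representation Theory of Basic Lie
Superalgebras


We restrict in the following to the basic Lie superalgebras. We assume that $\mathcal{G} \neq psl(n|n)$ but the following results still hold for $sl(n|n)$. Let $\mathcal{G} = \mathcal{N}^+ \oplus \mathcal{H} \oplus \mathcal{N}^-$ be a Borel decomposition of $\mathcal{G}$ where $\mathcal{N}^+$ (resp. $\mathcal{N}^-$) is spanned by the positive (resp. negative) root generators of $\mathcal{G}$, $\mathcal{H}$ is a Cartan subalgebra, and $\mathcal{H}^*$ is the dual of $\mathcal{H}$. A representation $\pi: \mathcal{G} \to \mathrm{End}\, V$ with representation space $V$ is called a highest-weight representation with highest weight $\Lambda \in \mathcal{H}^*$ if there exists a nonzero vector $v_\Lambda \in V$ such that

$$\mathcal{N}^+ v_\Lambda = 0$$

$$h(v_\Lambda) = \Lambda(h)\, v_\Lambda \quad (h \in \mathcal{H})$$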


The $\mathcal{G}$-module $V$ is called a highest-weight module, denoted by $V(\Lambda)$, and the vector $v_\Lambda \in V$ a highest-weight vector. From now on, $\mathcal{H}$ is the distinguished Cartan subalgebra of $\mathcal{G}$ with basis of generators $(H_1, \ldots, H_r)$ where $r = \mathrm{rank}\, \mathcal{G}$ and $H_s$ denotes the Cartan generator associated to the odd simple root. The Kac-Dynkin labels are defined by

$$a_i = 2\,\frac{(\Lambda, \alpha_i)}{(\alpha_i, \alpha_i)} \quad \text{for } i \neq s \quad \text{and} \quad a_s = (\Lambda, \alpha_s)$$

A weight $\Lambda \in \mathcal{H}^*$ is called a dominant weight if $a_i \ge 0$ for all $i \neq s$, integral if $a_i \in \mathbb{Z}$ for all $i \neq s$, and integral dominant if $a_i \in \mathbb{Z}_{\ge 0}$ for all $i \neq s$. A necessary condition for the highest-weight representation of $\mathcal{G}$ with highest weight $\Lambda$ to be finite dimensional is that $\Lambda$ be an integral dominant weight.

Table 5 Distinguished Dynkin diagrams of the basic Lie superalgebras
[Diagrams for $A(m-1,n-1)$, $B(m,n)$, $B(0,n)$, $C(n+1)$, $D(m,n)$, $F(4)$, $G(3)$, and $D(2,1;\alpha)$]


One then defines the Kac module. Consider $\mathcal{G} = \bigoplus_{i \in \mathbb{Z}} \mathcal{G}_i$ the distinguished $\mathbb{Z}$-gradation of $\mathcal{G}$ and let $\mathcal{K} = \mathcal{G}_0 \oplus \mathcal{N}^+$, where $\mathcal{N}^+ = \bigoplus_{i>0} \mathcal{G}_i$, be a subalgebra of $\mathcal{G}$. Denote by $U(\mathcal{G})$ and $U(\mathcal{K})$ the corresponding universal enveloping superalgebras. Let $\Lambda \in \mathcal{H}^*$ be an integral dominant weight and $V_0(\Lambda)$ be the $\mathcal{G}_0$-module with highest weight $\Lambda$, which is extended to a $\mathcal{K}$-module by setting $\mathcal{N}^+ V_0(\Lambda) = 0$. From this $\mathcal{K}$-module, it is possible to construct a $\mathcal{G}$-module in the following way. One considers the factor space $U(\mathcal{G}) \otimes_{U(\mathcal{K})} V_0(\Lambda)$ consisting of elements of $U(\mathcal{G}) \otimes V_0(\Lambda)$ such that the elements $h \otimes v$ and $1 \otimes h(v)$ have been identified for $h \in \mathcal{K}$ and $v \in V_0(\Lambda)$. This space acquires the structure of a $\mathcal{G}$-module by setting $g(u \otimes v) = gu \otimes v$ for $u \in U(\mathcal{G})$, $g \in \mathcal{G}$, and $v \in V_0(\Lambda)$. This $\mathcal{G}$-module is called the induced module from the $\mathcal{K}$-module $V_0(\Lambda)$ and denoted by $\mathrm{Ind}_{\mathcal{K}}^{\mathcal{G}} V_0(\Lambda)$. For example, in the case of type I basic Lie superalgebras, if $\{f_1, \ldots, f_d\}$ denotes a basis of odd generators of $\mathcal{G}/\mathcal{K}$, then

$$\mathrm{Ind}_{\mathcal{K}}^{\mathcal{G}} V_0(\Lambda) = \bigoplus_{1 \le i_1 < \cdots < i_k \le d} f_{i_1} \cdots f_{i_k} V_0(\Lambda)$$

The Kac module $\bar V(\Lambda)$ is defined as follows:


1. For a superalgebra $\mathcal{G}$ of type I (the odd part is the direct sum of two irreducible representations of the even part), the Kac module is the induced module

$$\bar V(\Lambda) = \mathrm{Ind}_{\mathcal{K}}^{\mathcal{G}} V_0(\Lambda)$$

2. For a superalgebra $\mathcal{G}$ of type II (the odd part is an irreducible representation of the even part), the induced module $\mathrm{Ind}_{\mathcal{K}}^{\mathcal{G}} V_0(\Lambda)$ contains a submodule $M(\Lambda) = U(\mathcal{G})\, \mathcal{G}_{-\psi}^{\,b+1} V_0(\Lambda)$, where $\psi$ is the longest simple root of $\mathcal{G}_{\bar 0}$ which is hidden behind the odd simple root (that is, the longest simple root of $sp(2n)$ in the case of $osp(m|2n)$ and the simple root of $sl(2)$ in the case of $F(4)$, $G(3)$, and $D(2,1;\alpha)$) and $b = 2(\Lambda,\psi)/(\psi,\psi)$ is the component of $\Lambda$ with respect to $\psi$. The Kac module is defined as the quotient of the induced module $\mathrm{Ind}_{\mathcal{K}}^{\mathcal{G}} V_0(\Lambda)$ by the submodule $M(\Lambda)$:

$$\bar V(\Lambda) = \mathrm{Ind}_{\mathcal{K}}^{\mathcal{G}} V_0(\Lambda) \,/\, U(\mathcal{G})\, \mathcal{G}_{-\psi}^{\,b+1} V_0(\Lambda)$$


In the case where the Kac module is not simple, it contains a maximal submodule $Z(\Lambda)$ and the quotient module $V(\Lambda) = \bar V(\Lambda)/Z(\Lambda)$ is a simple module.

The fundamental result concerning the representa- 
tions of basic Lie superalgebras is the following: 


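1. Any finite-dimensional irreducible representation of $\mathcal{G}$ is of the form $V(\Lambda) = \bar V(\Lambda)/Z(\Lambda)$, where $\Lambda$ is an integral dominant weight.

2. Any finite-dimensional simple $\mathcal{G}$-module is uniquely characterized by its integral dominant weight $\Lambda$: two $\mathcal{G}$-modules $V(\Lambda)$ and $V(\Lambda')$ are isomorphic if and only if $\Lambda = \Lambda'$.

3. The finite-dimensional simple $\mathcal{G}$-module $V(\Lambda) = \bar V(\Lambda)/Z(\Lambda)$ has the weight decomposition

$$V(\Lambda) = \bigoplus_{\lambda \le \Lambda} V_\lambda$$

with

$$V_\lambda = \{ v \in V \mid h(v) = \lambda(h)\, v,\ h \in \mathcal{H} \}$$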


The presence of odd roots has another important consequence in the representation theory of superalgebras. Indeed, one might find that in certain representations, weight vectors, different from the highest one specifying the representation, are annihilated by all the generators corresponding to positive roots. Such vectors have, of course, to be decoupled from the representation. Representations of this kind are called atypical, while the other irreducible representations not suffering this pathology are called typical. For a basic Lie superalgebra $\mathcal{G}$ with root system $\Delta$, one defines $\bar\Delta_{\bar 0} = \{\alpha \in \Delta_{\bar 0} \mid \alpha/2 \notin \Delta_{\bar 1}\}$ and $\bar\Delta_{\bar 1} = \{\alpha \in \Delta_{\bar 1} \mid 2\alpha \notin \Delta_{\bar 0}\}$. Let $\rho_0$ be the half-sum of the roots of $\Delta_{\bar 0}^+$, $\rho_1$ the half-sum of the roots of $\Delta_{\bar 1}^+$, and $\rho = \rho_0 - \rho_1$. The representation $\pi$ with highest weight $\Lambda$ is called typical if

$$(\Lambda + \rho, \alpha) \neq 0 \quad \text{for all } \alpha \in \bar\Delta_{\bar 1}^+$$

The highest weight $\Lambda$ is then called typical. If there exists some $\alpha \in \bar\Delta_{\bar 1}^+$ such that $(\Lambda + \rho, \alpha) = 0$, the representation $\pi$ and the highest weight $\Lambda$ are called atypical. The number of distinct elements $\alpha \in \bar\Delta_{\bar 1}^+$ for which $\Lambda$ is atypical is the degree of atypicality of the representation $\pi$. If there exists one and only one $\alpha \in \bar\Delta_{\bar 1}^+$ such that $(\Lambda + \rho, \alpha) = 0$, the representation $\pi$ and the highest weight $\Lambda$ are called singly atypical.
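
The typicality test can be tried out on $sl(2|1)$ with a short Python sketch. It rests on assumptions not fixed by the text: roots and weights are coefficient tuples on $(\varepsilon_1, \varepsilon_2, \delta_1)$ with $(\varepsilon_i,\varepsilon_j) = \delta_{ij}$ and $(\delta_1,\delta_1) = -1$, the positive roots are those generated by the distinguished simple roots $\delta_1 - \varepsilon_1$ and $\varepsilon_1 - \varepsilon_2$, and the helper names are hypothetical. For $sl(2|1)$ no odd root doubles to an even root, so every positive odd root belongs to $\bar\Delta_{\bar 1}^+$.

```python
from fractions import Fraction

# Assumed basis (eps1, eps2, delta1) with signature (+, +, -).
SIGNATURE = (1, 1, -1)

def form(a, b):
    """Bilinear form on weights in the assumed basis."""
    return sum(s * x * y for s, x, y in zip(SIGNATURE, a, b))

EVEN_POSITIVE = [(1, -1, 0)]               # eps1 - eps2
ODD_POSITIVE = [(-1, 0, 1), (0, -1, 1)]    # delta1 - eps1, delta1 - eps2

def half_sum(roots):
    """Half-sum of a list of roots, computed coordinate by coordinate."""
    return tuple(Fraction(sum(column), 2) for column in zip(*roots))

RHO_0 = half_sum(EVEN_POSITIVE)
RHO_1 = half_sum(ODD_POSITIVE)
RHO = tuple(a - b for a, b in zip(RHO_0, RHO_1))   # rho = rho_0 - rho_1

def atypical_roots(weight):
    """Positive odd roots alpha with (weight + rho, alpha) = 0."""
    shifted = tuple(w + r for w, r in zip(weight, RHO))
    return [alpha for alpha in ODD_POSITIVE if form(shifted, alpha) == 0]
```

With these conventions `atypical_roots((0, 0, 0))` returns a single root, so the zero weight is singly atypical, while `atypical_roots((0, 0, -1))` returns an empty list, so that weight is typical.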

The Kac module $\bar V(\Lambda)$ is a simple $\mathcal{G}$-module if and only if the highest weight $\Lambda$ is typical. All the finite-dimensional representations of $B(0,n)$ are typical. All the finite-dimensional representations of $C(n+1)$ are either typical or singly atypical.

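The dimension of a typical finite-dimensional representation $V(\Lambda)$ of $\mathcal{G}$ is given by

$$\dim V(\Lambda) = 2^{\dim \Delta_{\bar 1}^+} \prod_{\alpha \in \Delta_{\bar 0}^+} \frac{(\Lambda + \rho, \alpha)}{(\rho_0, \alpha)}$$

where $\dim V_{\bar 0}(\Lambda) = \dim V_{\bar 1}(\Lambda)$ if $\mathcal{G} \neq B(0,n)$, and if $\mathcal{G} = B(0,n)$, ...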

The atypicality conditions are the following: 
e For A(m, 1) with A= (hyo. s avy Ease) 


s 1 An+4 ama n- 1 


o 
n—1 
2 4. 
k=i 


where 1 <i<n<j<m-+n-—1. 
. B(m, n) with A= (41. . ‘s Am+n) (mM * 0) 


j 
`o d, d-d4 =i +]— 2n 


k—n4-1 


4n.1 8m.n-1 Amin 


n / 
N a- > ag=it+j—2n 
g=ş q=n+1 
n j m-4-n-—1 
> 4q- 9, 4q-2 Ý lq- amin 
q=i g=n-+1 q=j+1 
=2m+i-j—1=0 
where $1 \le i \le n \le j \le m+n-1$.
e $C(n+1)$ with $\Lambda = (a_1, \ldots, a_{n+1})$
ay do ân — Ans 1 
ei 
i 
a mala 
n4-1 
Mai ` ü4—28--i—1-—0 
q=i+1 
where 1 € i € n. 
e D(m|n) with A—(a1,...,45,45) 
ay an—1 ap ân+1 Am+n-2 Ans m—1 
O-=-O—- 3 —-O-- 
na m 
n ] 
q—i g=n+1 
where 1 <i<n<j<m+4+n-—-1 
m+n—2 
Ya- — Amyn =M—n+i— 1 
q=n+1 
where 1 «i € n 
n m+n—2 
Ya-Y 413.4 
q=i q=n+ 1 q=f+1 


= Am+n-1 + Ginn + 200 +i — i — 2 
where 1 <i<n<j<m+n-—-2 


See also: Lie Groups: General Theory; Lie, Symplectic,
and Poisson Groupoids and Their Lie Algebroids.
Introduction


Groupoids are mathematical structures able to describe 
symmetry properties more general than those described 
by groups. They were introduced (and named) by 
H Brandt in 1926. Around 1950, Charles Ehresmann 
used groupoids with additional structures (topological 
and differentiable) as essential tools in topology and 
differential geometry. In recent years, Mikhail Karasev,
Alan Weinstein, and Stanisław Zakrzewski independently
discovered that symplectic groupoids can be used
for the construction of noncommutative deformations 
of the algebra of smooth functions on a manifold, with 
potential applications to quantization. Poisson group- 
oids were introduced by Alan Weinstein as general- 
izations of both Poisson Lie groups and symplectic 
groupoids. 

We present here the main definitions and first 
properties relative to groupoids, Lie groupoids, Lie 
algebroids, symplectic and Poisson groupoids and 
their Lie algebroids. 


Groupoids 
What is a Groupoid? 


Before stating the formal definition of a groupoid, let us explain, in an informal way, why it is a very natural concept. The easiest way to understand that concept is to think of two sets, $\Gamma$ and $\Gamma_0$. The first one, $\Gamma$, is called the "set of arrows" or "total space" of the groupoid, and the other one, $\Gamma_0$, the "set of objects" or "set of units" of the groupoid. One may consider an element $x \in \Gamma$ as an arrow going from an object (a point in $\Gamma_0$) to another object (another point in $\Gamma_0$). The word "arrow" is used here in a very general sense: it means a way of going from a point in $\Gamma_0$ to another in $\Gamma_0$. One should not consider an arrow as a line drawn in the set $\Gamma_0$ joining the starting point of the arrow to its endpoint: this happens only for some special groupoids. Rather, one should think of an arrow as living outside $\Gamma_0$, with only its starting point and its endpoint in $\Gamma_0$, as shown in Figure 1.

The following ingredients enter the definition of a 
groupoid. 


1. Two maps $\alpha: \Gamma \to \Gamma_0$ and $\beta: \Gamma \to \Gamma_0$, called the "target map" and the "source map" of the groupoid. If $x \in \Gamma$ is an arrow, $\alpha(x) \in \Gamma_0$ is its endpoint and $\beta(x) \in \Gamma_0$ its starting point.

2. A "composition law" on the set of arrows; we can compose an arrow $y$ with another arrow $x$, and get an arrow $m(x,y)$, by following first the arrow $y$, then the arrow $x$. Of course, $m(x,y)$ is defined if and only if the target of $y$ is equal to the source of $x$. The source of $m(x,y)$ is equal to the source of $y$, and its target is equal to the target of $x$, as illustrated in Figure 1. It is only by convention that we write $m(x,y)$ rather than $m(y,x)$: the arrow which is followed first is on the right, by analogy with the usual notation $f \circ g$ for the composition of two maps $g$ and $f$. When there is no risk of confusion, we write $x \circ y$, or $x.y$, or even simply $xy$ for $m(x,y)$. The composition of arrows is associative.

3. An "embedding" $\varepsilon$ of the set $\Gamma_0$ into the set $\Gamma$, which associates a unit arrow $\varepsilon(u)$ with each $u \in \Gamma_0$. That unit arrow is such that both its source and its target are $u$, and it plays the role of a unit when composed with another arrow, either on the right or on the left: for any arrow $x$, $m(\varepsilon(\alpha(x)), x) = x$, and $m(x, \varepsilon(\beta(x))) = x$.

4. Finally, an "inverse map" $\iota$ from the set of arrows onto itself. If $x \in \Gamma$ is an arrow, one may think of $\iota(x)$ as the arrow $x$ followed in the reverse sense. We often write $x^{-1}$ for $\iota(x)$.


Now we are ready to state the formal definition of 
a groupoid. 


Definition 1 A groupoid is a pair of sets $(\Gamma, \Gamma_0)$ equipped with the structure defined by the following data:


(i) an injective map $\varepsilon: \Gamma_0 \to \Gamma$, called the unit section of the groupoid;

(ii) two maps $\alpha: \Gamma \to \Gamma_0$ and $\beta: \Gamma \to \Gamma_0$, called, respectively, the target map and the source map; they satisfy

$$\alpha \circ \varepsilon = \beta \circ \varepsilon = \mathrm{id}_{\Gamma_0} \qquad [1]$$

(iii) a composition law $m: \Gamma_2 \to \Gamma$, called the product, defined on the subset $\Gamma_2$ of $\Gamma \times \Gamma$, called the set of composable elements,

$$\Gamma_2 = \{ (x,y) \in \Gamma \times \Gamma;\ \beta(x) = \alpha(y) \} \qquad [2]$$

which is associative, in the sense that whenever one side of the equality

$$m(x, m(y,z)) = m(m(x,y), z) \qquad [3]$$

is defined, the other side is defined too, and the equality holds; moreover, the composition law $m$ is such that for each $x \in \Gamma$,

$$m(\varepsilon(\alpha(x)), x) = m(x, \varepsilon(\beta(x))) = x \qquad [4]$$

(iv) a map $\iota: \Gamma \to \Gamma$, called the inverse, such that, for every $x \in \Gamma$, $(x, \iota(x)) \in \Gamma_2$ and $(\iota(x), x) \in \Gamma_2$, and

$$m(x, \iota(x)) = \varepsilon(\alpha(x)), \qquad m(\iota(x), x) = \varepsilon(\beta(x)) \qquad [5]$$


The sets $\Gamma$ and $\Gamma_0$ are called, respectively, the total space and the set of units of the groupoid, which is itself denoted by $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$.


Figure 1 Two arrows $x$ and $y \in \Gamma$, with the target of $y$, $\alpha(y) \in \Gamma_0$, equal to the source of $x$, $\beta(x) \in \Gamma_0$, and the composed arrow $m(x,y)$.


Identification and Notations 


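In what follows, by means of the injective map $\varepsilon$, we will identify the set of units $\Gamma_0$ with the subset $\varepsilon(\Gamma_0)$ of $\Gamma$. Therefore, $\varepsilon$ will be the canonical injection in $\Gamma$ of its subset $\Gamma_0$.

For $x$ and $y \in \Gamma$, we will sometimes write $x.y$, or even simply $xy$ for $m(x,y)$, and $x^{-1}$ for $\iota(x)$. In addition, we will write "the groupoid $\Gamma$" for "the groupoid $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$."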


Properties and Comments 


The above definitions have the following consequences. 


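Involutivity of the inverse map The inverse map $\iota$ is involutive:

$$\iota \circ \iota = \mathrm{id}_\Gamma \qquad [6]$$

We have indeed, for any $x \in \Gamma$,

$$\iota \circ \iota(x) = m(\iota \circ \iota(x), \beta(\iota \circ \iota(x))) = m(\iota \circ \iota(x), \beta(x)) = m(\iota \circ \iota(x), m(\iota(x), x)) = m(m(\iota \circ \iota(x), \iota(x)), x) = m(\alpha(x), x) = x$$


Unicity of the inverse Let $x$ and $y \in \Gamma$ be such that

$$m(x,y) = \alpha(x) \quad \text{and} \quad m(y,x) = \beta(x)$$

Then we have

$$y = m(y, \beta(y)) = m(y, \alpha(x)) = m(y, m(x, \iota(x))) = m(m(y,x), \iota(x)) = m(\beta(x), \iota(x)) = m(\alpha(\iota(x)), \iota(x)) = \iota(x)$$

Therefore for any $x \in \Gamma$, the unique $y \in \Gamma$ such that $m(y,x) = \beta(x)$ and $m(x,y) = \alpha(x)$ is $\iota(x)$.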


The fibers of $\alpha$ and $\beta$ and the isotropy groups The target map $\alpha$ (resp. the source map $\beta$) of a groupoid $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ determines an equivalence relation on $\Gamma$: two elements $x$ and $y \in \Gamma$ are said to be $\alpha$-equivalent (resp. $\beta$-equivalent) if $\alpha(x) = \alpha(y)$ (resp. if $\beta(x) = \beta(y)$). The corresponding equivalence classes are called the $\alpha$-fibers (resp. the $\beta$-fibers) of the groupoid. They are of the form $\alpha^{-1}(u)$ (resp. $\beta^{-1}(u)$), with $u \in \Gamma_0$.
For each unit $u \in \Gamma_0$, the subset

$$\Gamma_u = \alpha^{-1}(u) \cap \beta^{-1}(u) = \{ x \in \Gamma;\ \alpha(x) = \beta(x) = u \} \qquad [7]$$

is called the "isotropy group" of $u$. It is indeed a group, with the restrictions of $m$ and $\iota$ as composition law and inverse map.


A way to visualize groupoids We have seen (Figure 1) a way in which groupoids may be visualized, by using arrows for elements in $\Gamma$ and points for elements in $\Gamma_0$. There is another very useful way to visualize groupoids, shown in Figure 2.

The total space $\Gamma$ of the groupoid is represented as a plane, and the set $\Gamma_0$ of units as a straight line in that plane. The $\alpha$-fibers (resp. the $\beta$-fibers) are represented as parallel straight lines, transverse to $\Gamma_0$.


Examples of Groupoids 


The groupoid of pairs Let $E$ be a set. The "groupoid of pairs" of elements in $E$ has, as its total space, the product space $E \times E$. The diagonal $\Delta_E = \{(x,x);\ x \in E\}$ is its set of units, and the target and source maps are

$$\alpha: (x,y) \mapsto (x,x), \qquad \beta: (x,y) \mapsto (y,y)$$

Its composition law $m$ and inverse map $\iota$ are

$$m((x,y),(y,z)) = (x,z)$$

$$\iota((x,y)) = (x,y)^{-1} = (y,x)$$
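
The groupoid of pairs is small enough to check the axioms of Definition 1 by brute force. The following Python sketch does so for a three-element set; the set `E`, the function names, and the helper `isotropy_group` are illustrative assumptions, and units are identified with diagonal pairs as in the text.

```python
from itertools import product

E = [0, 1, 2]                       # an arbitrary finite set, chosen for illustration
ARROWS = list(product(E, E))        # total space E x E

def alpha(arrow):
    """Target map: (x, y) -> (x, x)."""
    x, _ = arrow
    return (x, x)

def beta(arrow):
    """Source map: (x, y) -> (y, y)."""
    _, y = arrow
    return (y, y)

def compose(a, b):
    """Product m(a, b), defined only when beta(a) == alpha(b)."""
    if beta(a) != alpha(b):
        raise ValueError("arrows are not composable")
    return (a[0], b[1])

def inverse(arrow):
    """Inverse map: (x, y) -> (y, x)."""
    x, y = arrow
    return (y, x)

def check_axioms():
    """Verify the unit, inverse, and associativity axioms on every arrow."""
    for a in ARROWS:
        assert compose(alpha(a), a) == a == compose(a, beta(a))     # axiom [4]
        assert compose(a, inverse(a)) == alpha(a)                   # axiom [5]
        assert compose(inverse(a), a) == beta(a)
    for a, b, c in product(ARROWS, repeat=3):
        if beta(a) == alpha(b) and beta(b) == alpha(c):
            assert compose(a, compose(b, c)) == compose(compose(a, b), c)  # axiom [3]
    return True

def isotropy_group(unit):
    """Arrows whose source and target both equal the given unit."""
    return [a for a in ARROWS if alpha(a) == unit == beta(a)]
```

Calling `check_axioms()` runs through every arrow and every composable triple. For this groupoid each isotropy group contains only the unit itself.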


Groups A group $G$ is a groupoid with set of units $\{e\}$, with only one element $e$, the unit element of the group. The target and source maps are both equal to the constant map $x \mapsto e$.


Figure 2 A way to visualize groupoids.


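Definition 2 A topological groupoid is a groupoid $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ for which $\Gamma$ is a (maybe non-Hausdorff) topological space, $\Gamma_0$ a Hausdorff topological subspace of $\Gamma$, $\alpha$ and $\beta$ surjective continuous maps, $m: \Gamma_2 \to \Gamma$ a continuous map, and $\iota: \Gamma \to \Gamma$ a homeomorphism.

A Lie groupoid is a groupoid $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ for which $\Gamma$ is a smooth (maybe non-Hausdorff) manifold, $\Gamma_0$ a smooth Hausdorff submanifold of $\Gamma$, $\alpha$ and $\beta$ smooth surjective submersions (which implies that $\Gamma_2$ is a smooth submanifold of $\Gamma \times \Gamma$), $m: \Gamma_2 \to \Gamma$ a smooth map, and $\iota: \Gamma \to \Gamma$ a smooth diffeomorphism.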


Properties of Lie Groupoids 


Dimensions Let $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ be a Lie groupoid. Since $\alpha$ and $\beta$ are submersions, for any $x \in \Gamma$, the $\alpha$-fiber $\alpha^{-1}(\alpha(x))$ and the $\beta$-fiber $\beta^{-1}(\beta(x))$ are submanifolds of $\Gamma$, both of dimension $\dim \Gamma - \dim \Gamma_0$. The inverse map $\iota$, restricted to the $\alpha$-fiber through $x$ (resp. the $\beta$-fiber through $x$), is a diffeomorphism of that fiber onto the $\beta$-fiber through $\iota(x)$ (resp. the $\alpha$-fiber through $\iota(x)$). The dimension of the submanifold $\Gamma_2$ of composable pairs in $\Gamma \times \Gamma$ is $2 \dim \Gamma - \dim \Gamma_0$.


The tangent bundle of a Lie groupoid Let $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ be a Lie groupoid. Its tangent bundle $T\Gamma$ is a Lie groupoid, with $T\Gamma_0$ as set of units, $T\alpha: T\Gamma \to T\Gamma_0$ and $T\beta: T\Gamma \to T\Gamma_0$ as target and source maps. Let us denote by $\Gamma_2$ the set of composable pairs in $\Gamma \times \Gamma$, by $m: \Gamma_2 \to \Gamma$ the composition law, and by $\iota: \Gamma \to \Gamma$ the inverse. Then the set of composable pairs in $T\Gamma \times T\Gamma$ is simply $T\Gamma_2$, the composition law on $T\Gamma$ is $Tm: T\Gamma_2 \to T\Gamma$, and the inverse is $T\iota: T\Gamma \to T\Gamma$.
When the groupoid $\Gamma$ is a Lie group $G$, the Lie groupoid $TG$ is a Lie group too. We will see that the cotangent bundle of a Lie groupoid is a Lie groupoid, and more precisely a symplectic groupoid.


Isotropy groups For each unit $u \in \Gamma_0$ of a Lie groupoid, the isotropy group $\Gamma_u$ (defined earlier) is a Lie group.


Examples of Topological and Lie Groupoids 


Topological groups and Lie groups A topological 
group (resp. a Lie group) is a topological groupoid 
(resp. a Lie groupoid) whose set of units has only 
one element e. 


Vector bundles A smooth vector bundle $\pi: E \to M$ on a smooth manifold $M$ is a Lie groupoid, with the base $M$ as set of units (identified with the image of the zero section); the source and target maps both coincide with the projection $\pi$; the product and the inverse maps are the addition $(x,y) \mapsto x + y$ and the opposite map $x \mapsto -x$ in the fibers.


The fundamental groupoid of a topological space Let $M$ be a topological space. A "path" in $M$ is a continuous map $\gamma: [0,1] \to M$. We denote by $[\gamma]$ the homotopy class of a path $\gamma$ and by $\Pi(M)$ the set of homotopy classes of paths in $M$ (with fixed endpoints). For $[\gamma] \in \Pi(M)$, we set $\alpha([\gamma]) = \gamma(1)$, $\beta([\gamma]) = \gamma(0)$, where $\gamma$ is any representative of the class $[\gamma]$. The concatenation of paths determines a well-defined composition law on $\Pi(M)$, for which $\Pi(M) \overset{\alpha}{\underset{\beta}{\rightrightarrows}} M$ is a topological groupoid, called the "fundamental groupoid" of $M$. The inverse map is $[\gamma] \mapsto [\gamma^{-1}]$, where $\gamma$ is any representative of $[\gamma]$ and $\gamma^{-1}$ is the path $t \mapsto \gamma(1-t)$. The set of units is $M$, if we identify a point in $M$ with the homotopy class of the constant path equal to that point.

When $M$ is a smooth manifold, the same construction can be made with piecewise smooth paths, and the fundamental groupoid $\Pi(M) \overset{\alpha}{\underset{\beta}{\rightrightarrows}} M$ is a Lie groupoid.


Symplectic and Poisson Groupoids 
Symplectic and Poisson Geometry 


Let us recall some definitions and results in 
symplectic and Poisson geometry, used in the next 
sections. 


Symplectic manifolds A "symplectic form" on a smooth manifold $M$ is a differential 2-form $\omega$, which is closed, that is, which satisfies

$$d\omega = 0 \qquad [8]$$

and nondegenerate, that is, such that for each point $x \in M$ and each nonzero vector $v \in T_xM$, there exists a vector $w \in T_xM$ such that $\omega(v,w) \neq 0$. Equipped with the symplectic form $\omega$, a smooth manifold $M$ is called a "symplectic manifold" and denoted by $(M,\omega)$.

The dimension of a symplectic manifold is always 
even. 


The Liouville form on a cotangent bundle Let $N$ be a smooth manifold, and $T^*N$ be its cotangent bundle. The Liouville form on $T^*N$ is the 1-form $\theta$ such that, for any $\eta \in T^*N$ and $v \in T_\eta(T^*N)$,

$$\theta(v) = \langle \eta, T\pi_N(v) \rangle \qquad [9]$$

where $\pi_N: T^*N \to N$ is the canonical projection. The 2-form $\omega = d\theta$ is symplectic, and is called the "canonical symplectic form" on the cotangent bundle $T^*N$.
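
A coordinate expression may help make the definition concrete. The following is a standard local form, written under the assumption of local coordinates $(q^1, \ldots, q^n)$ on $N$ and the induced coordinates $(q^i, p_i)$ on $T^*N$; these coordinate names are not part of the text above.

```latex
\theta = \sum_{i=1}^{n} p_i \, dq^i ,
\qquad
\omega = d\theta = \sum_{i=1}^{n} dp_i \wedge dq^i .
```

Indeed, for $\eta = \sum_i p_i\, dq^i$ and a tangent vector $v$ with components $(a^i, b_i)$ along $(\partial/\partial q^i, \partial/\partial p_i)$, the projection $T\pi_N(v)$ keeps only the components $a^i$, so $\theta(v) = \sum_i p_i a^i$.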
Poisson manifolds A Poisson manifold is a smooth manifold $P$ equipped with a bivector field (i.e., a smooth section of $\wedge^2 TP$) $\Pi$ which satisfies

$$[\Pi, \Pi] = 0 \qquad [10]$$

the bracket on the left-hand side being the Schouten bracket. The bivector field $\Pi$ will be called the Poisson structure on $P$. It allows us to define a composition law on the space $C^\infty(P, \mathbb{R})$ of smooth functions on $P$, called the Poisson bracket and denoted by $(f,g) \mapsto \{f,g\}$, by setting, for all $f$ and $g \in C^\infty(P, \mathbb{R})$ and $x \in P$,

$$\{f,g\}(x) = \Pi(df(x), dg(x)) \qquad [11]$$

That composition law is skew-symmetric and satisfies the Jacobi identity, and therefore turns $C^\infty(P, \mathbb{R})$ into a Lie algebra.


Hamiltonian vector fields Let $(P, \Pi)$ be a Poisson manifold. We denote by $\Pi^\sharp: T^*P \to TP$ the vector bundle map defined by

$$\langle \eta, \Pi^\sharp(\zeta) \rangle = \Pi(\zeta, \eta) \qquad [12]$$

where $\zeta$ and $\eta$ are two elements in the same fiber of $T^*P$. Let $f: P \to \mathbb{R}$ be a smooth function on $P$. The vector field $X_f = \Pi^\sharp(df)$ is called the Hamiltonian vector field associated to $f$. If $g: P \to \mathbb{R}$ is another smooth function on $P$, the Poisson bracket $\{f,g\}$ can be written as

$$\{f,g\} = \langle dg, \Pi^\sharp(df) \rangle = -\langle df, \Pi^\sharp(dg) \rangle \qquad [13]$$


The canonical Poisson structure on a symplectic manifold Every symplectic manifold $(M, \omega)$ has a Poisson structure, associated to its symplectic structure, for which the vector bundle map $\Pi^\sharp: T^*M \to TM$ is the inverse of the vector bundle isomorphism $v \mapsto -i(v)\omega$. We will always consider that a symplectic manifold is equipped with that Poisson structure, unless otherwise specified.


The KKS Poisson structure Let $\mathcal{G}$ be a finite-dimensional Lie algebra. Its dual space $\mathcal{G}^*$ has a natural Poisson structure, for which the bracket of two smooth functions $f$ and $g$ is

$$\{f,g\}(\xi) = \langle \xi, [df(\xi), dg(\xi)] \rangle \qquad [14]$$

with $\xi \in \mathcal{G}^*$, the differentials $df(\xi)$ and $dg(\xi)$ being considered as elements in $\mathcal{G}$, identified with its bidual $\mathcal{G}^{**}$. It is called the Kirillov, Kostant, and Souriau (KKS) Poisson structure on $\mathcal{G}^*$.
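
Formula [14] can be evaluated by hand in a small case. The Python sketch below takes the Lie algebra to be $so(3)$, realized as $\mathbb{R}^3$ with the cross product as Lie bracket, and identifies the dual with $\mathbb{R}^3$ through the dot product; both identifications, the hand-written gradients, and all function names are assumptions made for illustration.

```python
def cross(u, v):
    """Cross product on R^3, used here as the Lie bracket of so(3)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Pairing between the dual and the algebra, both identified with R^3."""
    return sum(a * b for a, b in zip(u, v))

def kks_bracket(grad_f, grad_g, xi):
    """{f, g}(xi) = <xi, [df(xi), dg(xi)]>, with differentials given as gradients."""
    return dot(xi, cross(grad_f(xi), grad_g(xi)))

def grad_coordinate(i):
    """Gradient of the i-th coordinate function xi -> xi[i]."""
    return lambda xi: tuple(1 if j == i else 0 for j in range(3))

def grad_norm_squared(xi):
    """Gradient of the function xi -> |xi|^2."""
    return tuple(2 * c for c in xi)

xi = (1, 2, 3)
bracket_12 = kks_bracket(grad_coordinate(0), grad_coordinate(1), xi)
casimir_brackets = [kks_bracket(grad_norm_squared, grad_coordinate(i), xi)
                    for i in range(3)]
```

In this realization the bracket of the first two coordinate functions is the third coordinate, and the squared norm Poisson-commutes with every coordinate function, since its gradient is parallel to $\xi$.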


Poisson maps Let $(P_1, \Pi_1)$ and $(P_2, \Pi_2)$ be two Poisson manifolds. A smooth map $\varphi: P_1 \to P_2$ is called a Poisson map if, for every pair $(f,g)$ of smooth functions on $P_2$,

$$\{\varphi^* f, \varphi^* g\}_1 = \varphi^* \{f,g\}_2 \qquad [15]$$


Product Poisson structures The product $P_1 \times P_2$ of two Poisson manifolds $(P_1, \Pi_1)$ and $(P_2, \Pi_2)$ has a natural Poisson structure: it is the unique Poisson structure for which the bracket of functions of the form $(x_1, x_2) \mapsto f_1(x_1) f_2(x_2)$ and $(x_1, x_2) \mapsto g_1(x_1) g_2(x_2)$ (where $f_1$ and $g_1 \in C^\infty(P_1, \mathbb{R})$, $f_2$ and $g_2 \in C^\infty(P_2, \mathbb{R})$) is

$$(x_1, x_2) \mapsto \{f_1, g_1\}_1(x_1)\, f_2(x_2)\, g_2(x_2) + f_1(x_1)\, g_1(x_1)\, \{f_2, g_2\}_2(x_2)$$

The same property holds for the product of any finite number of Poisson manifolds.


Symplectic orthogonality Let $(V, \omega)$ be a symplectic vector space, that is, a real, finite-dimensional vector space $V$ with a skew-symmetric nondegenerate bilinear form $\omega$. Let $W$ be a vector subspace of $V$. The "symplectic orthogonal" of $W$ is

$$\mathrm{orth}\, W = \{ v \in V;\ \omega(v,w) = 0 \text{ for all } w \in W \} \qquad [16]$$

It is a vector subspace of $V$, which satisfies

$$\dim W + \dim(\mathrm{orth}\, W) = \dim V, \qquad \mathrm{orth}(\mathrm{orth}\, W) = W$$

The vector subspace $W$ is said to be isotropic if $W \subset \mathrm{orth}\, W$, coisotropic if $\mathrm{orth}\, W \subset W$, and Lagrangian if $W = \mathrm{orth}\, W$. In any symplectic vector space, there are many Lagrangian subspaces; therefore, the dimension of a symplectic vector space is always even; if $\dim V = 2n$, the dimension of an isotropic (resp. coisotropic, resp. Lagrangian) vector subspace is $\le n$ (resp. $\ge n$, resp. $= n$).
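
These notions can be tested numerically on $\mathbb{R}^{2n}$. The Python sketch below assumes coordinates $(q_1, \ldots, q_n, p_1, \ldots, p_n)$ and the standard form $\omega(u,v) = \sum_i (u_{q_i} v_{p_i} - u_{p_i} v_{q_i})$, which is one choice of symplectic form among many; the function names are hypothetical. It uses the dimension formula above: an isotropic subspace of dimension $n$ is automatically Lagrangian.

```python
from fractions import Fraction

def omega(u, v):
    """Standard symplectic form on R^(2n), coordinates (q_1..q_n, p_1..p_n)."""
    n = len(u) // 2
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n))

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in vec] for vec in vectors]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def is_isotropic(spanning_vectors):
    """W is isotropic iff omega vanishes on all pairs of spanning vectors."""
    return all(omega(u, v) == 0 for u in spanning_vectors for v in spanning_vectors)

def is_lagrangian(spanning_vectors):
    """Isotropic and of dimension n, hence equal to its symplectic orthogonal."""
    dim_v = len(spanning_vectors[0])
    return is_isotropic(spanning_vectors) and 2 * rank(spanning_vectors) == dim_v
```

For example, in $\mathbb{R}^4$ the span of the two $q$-axes is Lagrangian, a single $q$-axis is isotropic but not Lagrangian, and the plane spanned by $q_1$ and $p_1$ is not isotropic.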


Coisotropic and Lagrangian submanifolds A submanifold $N$ of a Poisson manifold $(P, \Pi)$ is said to be coisotropic if the bracket of two smooth functions, defined on an open subset of $P$ and which vanish on $N$, vanishes on $N$ too. A submanifold $N$ of a symplectic manifold $(M, \omega)$ is coisotropic if and only if for each point $x \in N$, the vector subspace $T_xN$ of the symplectic vector space $(T_xM, \omega(x))$ is coisotropic. Therefore, the dimension of a coisotropic submanifold in a $2n$-dimensional symplectic manifold is $\ge n$; when it is equal to $n$, the submanifold $N$ is said to be Lagrangian.


Poisson quotients Let $\varphi: M \to P$ be a surjective submersion of a symplectic manifold $(M, \omega)$ onto a manifold $P$. The manifold $P$ has a Poisson structure $\Pi$ for which $\varphi$ is a Poisson map if and only if $\mathrm{orth}(\ker T\varphi)$ is integrable. When that condition is satisfied, that Poisson structure on $P$ is unique.


Poisson Lie groups A Poisson Lie group is a Lie group $G$ with a Poisson structure $\Pi$, such that the product $(x,y) \mapsto xy$ is a Poisson map from $G \times G$, endowed with the product Poisson structure, into $(G, \Pi)$. The Poisson structure of a Poisson Lie group $(G, \Pi)$ always vanishes at the unit element $e$ of $G$. Therefore, the Poisson structure of a Poisson Lie group never comes from a symplectic structure on that group.


Definition 3 A symplectic groupoid (resp. a Poisson groupoid) is a Lie groupoid $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ with a symplectic form $\omega$ on $\Gamma$ (resp. with a Poisson structure $\Pi$ on $\Gamma$) such that the graph of the composition law $m$

$$\{ (x,y,z) \in \Gamma \times \Gamma \times \Gamma;\ (x,y) \in \Gamma_2 \text{ and } z = m(x,y) \}$$

is a Lagrangian submanifold (resp. a coisotropic submanifold) of $\Gamma \times \Gamma \times \bar\Gamma$ with the product symplectic form (resp. the product Poisson structure), the first two factors $\Gamma$ being endowed with the symplectic form $\omega$ (resp. with the Poisson structure $\Pi$), and the third factor $\bar\Gamma$ being $\Gamma$ with the symplectic form $-\omega$ (resp. with the Poisson structure $-\Pi$).


The next theorem states important properties of 
symplectic and Poisson groupoids. 


Theorem 4 Let $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ be a symplectic groupoid with symplectic 2-form $\omega$ (resp. a Poisson groupoid with Poisson structure $\Pi$). We have the following properties.


(i) For a symplectic groupoid, given any point $c \in \Gamma$, each one of the two vector subspaces of the symplectic vector space $(T_c\Gamma, \omega(c))$,

$$T_c(\beta^{-1}(\beta(c))) \quad \text{and} \quad T_c(\alpha^{-1}(\alpha(c)))$$

is the symplectic orthogonal of the other one. For a symplectic or Poisson groupoid, if $f$ is a smooth function whose restriction to each $\alpha$-fiber is constant, and $g$ a smooth function whose restriction to each $\beta$-fiber is constant, then the Poisson bracket $\{f,g\}$ vanishes identically.

(ii) The submanifold of units $\Gamma_0$ is a Lagrangian submanifold of the symplectic manifold $(\Gamma, \omega)$ (resp. a coisotropic submanifold of the Poisson manifold $(\Gamma, \Pi)$).

(iii) The inverse map $\iota: \Gamma \to \Gamma$ is an antisymplectomorphism of $(\Gamma, \omega)$, that is, it satisfies $\iota^*\omega = -\omega$ (resp. an anti-Poisson diffeomorphism of $(\Gamma, \Pi)$, i.e., it satisfies $\iota_*\Pi = -\Pi$).


Corollary 5 Let $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ be a symplectic groupoid with symplectic 2-form $\omega$ (resp. a Poisson groupoid with Poisson structure $\Pi$). There exists on $\Gamma_0$ a unique Poisson structure $\Pi_0$ for which $\alpha: \Gamma \to \Gamma_0$ is a Poisson map, and $\beta: \Gamma \to \Gamma_0$ an anti-Poisson map (i.e., $\beta$ is a Poisson map when $\Gamma_0$ is equipped with the Poisson structure $-\Pi_0$).


Examples of Symplectic and Poisson Groupoids 


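The cotangent bundle of a Lie groupoid Let $\Gamma \overset{\alpha}{\underset{\beta}{\rightrightarrows}} \Gamma_0$ be a Lie groupoid.

We have seen above that its tangent bundle $T\Gamma$ has a Lie groupoid structure, determined by that of $\Gamma$. Similarly (but much less obviously), the cotangent bundle $T^*\Gamma$ has a Lie groupoid structure determined by that of $\Gamma$. The set of units is the conormal bundle to the submanifold $\Gamma_0$ of $\Gamma$, denoted by $\mathcal{N}^*\Gamma_0$. We recall that $\mathcal{N}^*\Gamma_0$ is the vector sub-bundle of $T^*_{\Gamma_0}\Gamma$ (the restriction to $\Gamma_0$ of the cotangent bundle $T^*\Gamma$), whose fiber $\mathcal{N}^*_p\Gamma_0$ at a point $p \in \Gamma_0$ is

$$\mathcal{N}^*_p\Gamma_0 = \{ \eta \in T^*_p\Gamma;\ \langle \eta, v \rangle = 0 \text{ for all } v \in T_p\Gamma_0 \}$$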


To define the target and source maps of the Lie groupoid $T^*\Gamma$, we introduce the notion of "bisection" through a point $x \in \Gamma$. A bisection through $x$ is a submanifold $A$ of $\Gamma$, with $x \in A$, transverse both to the $\alpha$-fibers and to the $\beta$-fibers, such that the maps $\alpha$ and $\beta$, when restricted to $A$, are diffeomorphisms of $A$ onto open subsets $\alpha(A)$ and $\beta(A)$ of $\Gamma_0$, respectively. For any point $x \in \Gamma$, there exist bisections through $x$. A bisection $A$ allows us to define two smooth diffeomorphisms between open subsets of $\Gamma$, denoted by $L_A$ and $R_A$ and called the left and right translations by $A$, respectively. They are defined by

$$L_A: \alpha^{-1}(\beta(A)) \to \alpha^{-1}(\alpha(A)), \qquad L_A(y) = m\big(\beta|_A^{-1} \circ \alpha(y),\ y\big)$$

and

$$R_A: \beta^{-1}(\alpha(A)) \to \beta^{-1}(\beta(A)), \qquad R_A(y) = m\big(y,\ \alpha|_A^{-1} \circ \beta(y)\big)$$


The definitions of the target and source maps for
$T^*\Gamma$ rest on the following properties. Let $x$ be a
point in $\Gamma$ and $A$ be a bisection through $x$. The two
vector subspaces, $T_{\alpha(x)}\Gamma_0$ and $\ker T_{\alpha(x)}\beta$, are
complementary in $T_{\alpha(x)}\Gamma$. For any $v \in T_{\alpha(x)}\Gamma$, $v - T\beta(v)$
is in $\ker T_{\alpha(x)}\beta$. Moreover, $R_A$ maps the fiber
$\beta^{-1}(\alpha(x))$ into the fiber $\beta^{-1}(\beta(x))$, and its restriction
to that fiber does not depend on the choice of $A$; it
depends only on $x$. Therefore, $TR_A(v - T\beta(v))$ is in
$\ker T_x\beta$ and does not depend on the choice of $A$. We
can define the map $\hat\alpha$ by setting, for any $\xi \in T^*_x\Gamma$ and
any $v \in T_{\alpha(x)}\Gamma$,

$$\langle \hat\alpha(\xi), v\rangle = \langle \xi,\ TR_A(v - T\beta(v))\rangle$$

Similarly, we define $\hat\beta$ by setting, for any $\xi \in T^*_x\Gamma$
and any $w \in T_{\beta(x)}\Gamma$,

$$\langle \hat\beta(\xi), w\rangle = \langle \xi,\ TL_A(w - T\alpha(w))\rangle$$

We see that $\hat\alpha$ and $\hat\beta$ are unambiguously defined,
smooth, and take their values in the submanifold
$N^*\Gamma_0$ of $T^*\Gamma$. They satisfy


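$$\pi_\Gamma \circ \hat\alpha = \alpha \circ \pi_\Gamma, \qquad \pi_\Gamma \circ \hat\beta = \beta \circ \pi_\Gamma$$

where $\pi_\Gamma : T^*\Gamma \to \Gamma$ is the cotangent bundle
projection.

Let us now define the composition law $\hat m$ on $T^*\Gamma$.
Let $\xi \in T^*_x\Gamma$ and $\eta \in T^*_y\Gamma$ be such that $\hat\beta(\xi) = \hat\alpha(\eta)$.
This implies $\beta(x) = \alpha(y)$. Let $A$ be a bisection
through $x$ and $B$ a bisection through $y$. There exist
a unique $\xi_{h\alpha} \in T^*_{\alpha(x)}\Gamma_0$ and a unique $\eta_{h\beta} \in T^*_{\beta(y)}\Gamma_0$
such that

$$\xi = (L_A^{-1})^*(\hat\beta(\xi)) + \alpha_x^*\,\xi_{h\alpha}$$
$$\eta = (R_B^{-1})^*(\hat\beta(\xi)) + \beta_y^*\,\eta_{h\beta}$$

Then $\hat m(\xi, \eta)$ is given by

$$\hat m(\xi, \eta) = \alpha_{xy}^*\,\xi_{h\alpha} + \beta_{xy}^*\,\eta_{h\beta} + (R_B^{-1})^*(L_A^{-1})^*(\hat\beta(\xi))$$

We observe that in the last term of the above expression
we can replace $\hat\beta(\xi)$ by $\hat\alpha(\eta)$, since these two expressions
are equal, and that $(R_B^{-1})^*(L_A^{-1})^* = (L_A^{-1})^*(R_B^{-1})^*$, since
$R_B$ and $L_A$ commute.

Finally, the inverse $\hat\iota$ in $T^*\Gamma$ is $\iota^*$.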

With its canonical symplectic form, $T^*\Gamma \rightrightarrows N^*\Gamma_0$ is
a symplectic groupoid. When the Lie groupoid $\Gamma$ is a
Lie group $G$, the Lie groupoid $T^*G$ is not a Lie
group, contrary to what happens for $TG$. This shows
that the introduction of Lie groupoids is not at all
artificial: when dealing with Lie groups, Lie groupoids
are already with us! The set of units of the
Lie groupoid $T^*G$ can be identified with $\mathcal{G}^*$ (the
dual of the Lie algebra $\mathcal{G}$ of $G$), identified itself with
$T^*_eG$ (the cotangent space to $G$ at the unit element $e$).
The target map $\hat\alpha : T^*G \to T^*_eG$ (resp. the source
map $\hat\beta : T^*G \to T^*_eG$) associates to each $g \in G$
and $\xi \in T^*_gG$, the value at the unit element $e$ of the
right-invariant 1-form (resp. the left-invariant
1-form) whose value at $g$ is $\xi$.


Poisson Lie groups as Poisson groupoids Poisson 
groupoids were introduced by Alan Weinstein as a 
generalization of both symplectic groupoids and Poisson 
Lie groups. Indeed, a Poisson Lie group is a Poisson 
groupoid with a set of units reduced to a single element. 


Lie Algebroids 


The notion of a Lie algebroid, due to Jean Pradines, is 
related to that of a Lie groupoid in the same way as the 
notion of a Lie algebra is related to that of a Lie group. 


Definition 6 A Lie algebroid over a smooth
manifold $M$ is a smooth vector bundle $\pi : A \to M$
with base $M$, equipped with


(i) a composition law $(s_1, s_2) \mapsto \{s_1, s_2\}$ on the space
$\Gamma^\infty(\pi)$ of smooth sections of $\pi$, called the bracket,
for which that space is a Lie algebra; and

(ii) a vector bundle map $\rho : A \to TM$, over the identity
map of $M$, called the anchor map, such that, for all
$s_1$ and $s_2 \in \Gamma^\infty(\pi)$ and all $f \in C^\infty(M, \mathbb{R})$,


$$\{s_1, f s_2\} = f\{s_1, s_2\} + \bigl((\rho \circ s_1)\cdot f\bigr)\, s_2 \qquad [17]$$


Examples 


Lie algebras A finite-dimensional Lie algebra is a 
Lie algebroid (with a base reduced to a point and the 
zero map as anchor map). 


Tangent bundles and their integrable sub-bundles A
tangent bundle $\tau_M : TM \to M$ to a smooth manifold
$M$ is a Lie algebroid, with the usual bracket of
vector fields on $M$ as composition law, and the
identity map as anchor map. More generally, any
integrable vector sub-bundle $F$ of a tangent bundle
$\tau_M : TM \to M$ is a Lie algebroid, still with the
bracket of vector fields on $M$ with values in $F$ as
composition law and the canonical injection of $F$
into $TM$ as anchor map.
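
For the tangent bundle, the compatibility condition between bracket and anchor required in Definition 6 is just the familiar Leibniz rule for the bracket of vector fields. A short check, with vector fields $X$, $Y$ and functions $f$, $g$ chosen here purely for illustration, and with the anchor equal to the identity:

```latex
\begin{aligned}
[X, fY]\cdot g &= X\cdot\bigl(f\,(Y\cdot g)\bigr) - f\,Y\cdot(X\cdot g) \\
               &= (X\cdot f)\,(Y\cdot g) + f\,\bigl(X\cdot(Y\cdot g) - Y\cdot(X\cdot g)\bigr) \\
               &= \bigl((X\cdot f)\,Y + f\,[X, Y]\bigr)\cdot g ,
\end{aligned}
\qquad\text{so}\qquad
[X, fY] = f\,[X, Y] + (X\cdot f)\,Y .
```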


The cotangent bundle of a Poisson manifold Let
$(P, \Pi)$ be a Poisson manifold. Its cotangent bundle
$\pi_P : T^*P \to P$ has a Lie algebroid structure, with
$\Pi^\sharp : T^*P \to TP$ as anchor map. The composition law
is the bracket of 1-forms. It will be denoted by
$(\eta, \zeta) \mapsto [\eta, \zeta]$ (in order to avoid any confusion with
the Poisson bracket of functions). It is given by the
formula, in which $\eta$ and $\zeta$ are 1-forms and $X$ a
vector field on $P$:


$$\langle [\eta, \zeta], X\rangle = \Pi(\eta, d\langle \zeta, X\rangle) + \Pi(d\langle \eta, X\rangle, \zeta) + (\mathcal{L}(X)\Pi)(\eta, \zeta) \qquad [18]$$


We have denoted by $\mathcal{L}(X)\Pi$ the Lie derivative of
the Poisson structure $\Pi$ with respect to the vector
field $X$. Another equivalent formula for that
composition law is


$$[\eta, \zeta] = \mathcal{L}(\Pi^\sharp\eta)\zeta - \mathcal{L}(\Pi^\sharp\zeta)\eta - d(\Pi(\eta, \zeta)) \qquad [19]$$


The bracket of 1-forms is related to the Poisson
bracket of functions by


$$[df, dg] = d\{f, g\} \quad \text{for all } f \text{ and } g \in C^\infty(P, \mathbb{R}) \qquad [20]$$
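
As a consistency check, relation [20] can be recovered from formula [19], taken in its usual form $[\eta,\zeta] = \mathcal{L}(\Pi^\sharp\eta)\zeta - \mathcal{L}(\Pi^\sharp\zeta)\eta - d(\Pi(\eta,\zeta))$. The sketch below assumes the conventions $\langle \zeta, \Pi^\sharp\eta\rangle = \Pi(\eta,\zeta)$ and $\{f,g\} = \Pi(df,dg)$, which are the common ones but are not restated in this passage, and uses $\mathcal{L}(X)\,dg = d\langle dg, X\rangle$ (Cartan's formula with $d^2 = 0$):

```latex
\begin{aligned}
\mathcal{L}(\Pi^\sharp df)\,dg &= d\langle dg, \Pi^\sharp df\rangle = d\bigl(\Pi(df, dg)\bigr) = d\{f, g\}, \\
\mathcal{L}(\Pi^\sharp dg)\,df &= d\langle df, \Pi^\sharp dg\rangle = d\bigl(\Pi(dg, df)\bigr) = -\,d\{f, g\}, \\
[df, dg] &= d\{f, g\} + d\{f, g\} - d\{f, g\} = d\{f, g\}.
\end{aligned}
```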


Properties of Lie Algebroids 


Let $\pi : A \to M$ be a Lie algebroid with anchor map
$\rho : A \to TM$.


A Lie algebra homomorphism For any pair $(s_1, s_2)$
of smooth sections of $\pi$,


$$\rho \circ \{s_1, s_2\} = [\rho \circ s_1, \rho \circ s_2]$$


which means that the map $s \mapsto \rho \circ s$ is a Lie algebra
homomorphism from the Lie algebra of smooth
sections of $\pi$ into the Lie algebra of smooth vector
fields on $M$.


The generalized Schouten bracket The composition
law $(s_1, s_2) \mapsto \{s_1, s_2\}$ on the space of sections of
$\pi$ extends into a composition law on the space of
sections of exterior powers of $(A, \pi, M)$, which is
called the “generalized Schouten bracket.” Its
properties are the same as those of the usual
Schouten bracket. When the Lie algebroid is a
tangent bundle $\tau_M : TM \to M$, that composition law
reduces to the usual Schouten bracket. When the Lie
algebroid is the cotangent bundle $\pi_P : T^*P \to P$ to a
Poisson manifold $(P, \Pi)$, the generalized Schouten
bracket is the bracket of forms of all degrees on the
Poisson manifold $P$, introduced by J-L Koszul,
which extends the bracket of 1-forms used earlier.


The dual bundle of a Lie algebroid Let $\varpi : A^* \to M$
be the dual bundle of the Lie algebroid $\pi : A \to M$.
There exists on the space of sections of its exterior
powers a graded endomorphism $d_\rho$, of degree 1 (that
means that if $\eta$ is a section of $\bigwedge^k A^*$, $d_\rho(\eta)$ is a section
of $\bigwedge^{k+1} A^*$). That endomorphism satisfies


$$d_\rho \circ d_\rho = 0$$


and its properties are essentially the same as those of
the exterior derivative of differential forms. When
the Lie algebroid is a tangent bundle $\tau_M : TM \to M$,
$d_\rho$ is the usual exterior derivative of differential
forms.

On the spaces of sections of the exterior powers of
a Lie algebroid and of its dual bundle we can
develop a differential calculus very similar to the
usual differential calculus of vector and multivector
fields and differential forms on a manifold. Operators
such as the interior product, the exterior
derivative, and the Lie derivative can still be defined
and have properties similar to those of the corresponding
operators for vector and multivector fields
and differential forms on a manifold.

The total space $A^*$ of the dual bundle of a Lie
algebroid $\pi : A \to M$ has a natural Poisson structure:
a smooth section $s$ of $\pi$ can be considered as a
smooth real-valued function on $A^*$ whose restriction
to each fiber $\varpi^{-1}(x)$ ($x \in M$) is linear; this property
allows us to extend the bracket of sections of $\pi$
(defined by the Lie algebroid structure) to obtain a
Poisson bracket of functions on $A^*$. When the Lie
algebroid $A$ is a finite-dimensional Lie algebra $\mathcal{G}$, the
Poisson structure on its dual space $\mathcal{G}^*$ is the KKS
Poisson structure discussed earlier.
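
For a concrete feel, here is a minimal sketch for the Lie algebra so(3), identified with $\mathbb{R}^3$ with the cross product as bracket (a standard identification, assumed here rather than taken from the text). With one common sign convention the bracket on the dual reads $\{f, g\}(\mu) = \mu \cdot (\nabla f(\mu) \times \nabla g(\mu))$; the sign choice and the helper names are assumptions. On linear functions it reproduces the Lie bracket, which is exactly the extension property described above.

```python
def cross(a, b):
    """Cross product in R^3, i.e. the Lie bracket of so(3) under the usual identification."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lie_poisson(grad_f, grad_g, mu):
    """{f, g}(mu) = <mu, [df(mu), dg(mu)]>, with gradients supplied as functions of mu."""
    return dot(mu, cross(grad_f(mu), grad_g(mu)))

# Linear coordinate functions x1, x2 correspond to the basis vectors e1, e2 of so(3).
grad_x1 = lambda mu: (1.0, 0.0, 0.0)
grad_x2 = lambda mu: (0.0, 1.0, 0.0)
# The squared norm is constant on coadjoint orbits (spheres), hence a Casimir function.
grad_casimir = lambda mu: tuple(2.0 * c for c in mu)

mu = (0.5, -1.0, 2.0)
bracket_x1_x2 = lie_poisson(grad_x1, grad_x2, mu)        # equals x3(mu), mirroring [e1, e2] = e3
bracket_casimir = lie_poisson(grad_casimir, grad_x1, mu)  # vanishes for a Casimir function
```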


The Lie Algebroid of a Lie Groupoid 


Let $\Gamma \rightrightarrows \Gamma_0$ be a Lie groupoid. Let $A(\Gamma)$ be the
intersection of $\ker T\alpha$ and $T_{\Gamma_0}\Gamma$ (the tangent bundle
$T\Gamma$ restricted to the submanifold $\Gamma_0$). We see that $A(\Gamma)$
is the total space of a vector bundle $\pi : A(\Gamma) \to \Gamma_0$,
with base $\Gamma_0$, the canonical projection $\pi$ being the map
which associates a point $u \in \Gamma_0$ to every vector in
$\ker T_u\alpha$. In this section, we define a composition law
on the set of smooth sections of that bundle, and a
vector bundle map $\rho : A(\Gamma) \to T\Gamma_0$, for which
$\pi : A(\Gamma) \to \Gamma_0$ is a Lie algebroid, called the Lie
algebroid of the Lie groupoid $\Gamma \rightrightarrows \Gamma_0$.

We observe first that for any point $u \in \Gamma_0$ and any
point $x \in \beta^{-1}(u)$, the map $L_x : y \mapsto L_x y = m(x, y)$ is
defined on the $\alpha$-fiber $\alpha^{-1}(u)$, and maps that fiber
into the $\alpha$-fiber $\alpha^{-1}(\alpha(x))$. Therefore, $T_uL_x$ maps the
vector space $A_u = \ker T_u\alpha$ onto the vector space
$\ker T_x\alpha$, tangent at $x$ to the $\alpha$-fiber $\alpha^{-1}(\alpha(x))$. Any
vector $w \in A_u$ can therefore be extended into the
vector field along $\beta^{-1}(u)$, $x \mapsto \hat w(x) = T_uL_x(w)$. More
generally, let $w : U \to A(\Gamma)$ be a smooth section of
the vector bundle $\pi : A(\Gamma) \to \Gamma_0$, defined on an open
subset $U$ of $\Gamma_0$. By using the above-described
construction for every point $u \in U$, we can extend
the section $w$ into a smooth vector field $\hat w$, defined
on the open subset $\beta^{-1}(U)$ of $\Gamma$, by setting, for all
$u \in U$ and $x \in \beta^{-1}(u)$:


$$\hat w(x) = T_uL_x(w(u))$$


We have defined an injective map $w \mapsto \hat w$ from the
space of smooth local sections of $\pi : A(\Gamma) \to \Gamma_0$, into
a subspace of the space of smooth vector fields
defined on open subsets of $\Gamma$. The image of that map
is the space of smooth vector fields $\hat w$, defined on
open subsets $\hat U$ of $\Gamma$ of the form $\hat U = \beta^{-1}(U)$, where
$U$ is an open subset of $\Gamma_0$, which satisfy the two
properties:


1. $T\alpha \circ \hat w = 0$,
2. for every $x$ and $y \in \hat U$ such that $\beta(x) = \alpha(y)$,
$T_yL_x(\hat w(y)) = \hat w(xy)$.


These vector fields are called “left-invariant vector
fields” on $\Gamma$.

The space of left-invariant vector fields on $\Gamma$ is
closed under the bracket operation. We can therefore
define a composition law $(w_1, w_2) \mapsto \{w_1, w_2\}$ on the
space of smooth sections of the bundle $\pi : A(\Gamma) \to \Gamma_0$
by defining $\{w_1, w_2\}$ as the unique section such that


$$\widehat{\{w_1, w_2\}} = [\hat w_1, \hat w_2]$$


Finally, we define the anchor map $\rho$ as the map $T\beta$
restricted to $A(\Gamma)$. With that composition law and
that anchor map, the vector bundle $\pi : A(\Gamma) \to \Gamma_0$ is
a Lie algebroid, called the Lie algebroid of the
Lie groupoid $\Gamma \rightrightarrows \Gamma_0$.

We could exchange the roles of $\alpha$ and $\beta$ and use
right-invariant vector fields instead of left-invariant
vector fields. The Lie algebroid obtained remains the
same, up to an isomorphism.

When the Lie groupoid $\Gamma$ is a Lie group, its Lie
algebroid is simply its Lie algebra.


The Lie Algebroid of a Symplectic Groupoid 


Let $\Gamma \rightrightarrows \Gamma_0$ be a symplectic groupoid, with symplectic
form $\omega$. As we have seen above, its Lie algebroid
$\pi : A \to \Gamma_0$ is the vector bundle whose fiber, over
each point $u \in \Gamma_0$, is $\ker T_u\alpha$. We define a linear
map $\omega^\flat_u : \ker T_u\alpha \to T^*_u\Gamma_0$ by setting, for each
$w \in \ker T_u\alpha$ and $v \in T_u\Gamma_0$,


$$\langle \omega^\flat_u(w), v\rangle = \omega_u(v, w)$$


Since $T_u\Gamma_0$ is Lagrangian and $\ker T_u\alpha$ complementary
to $T_u\Gamma_0$ in the symplectic vector space
$(T_u\Gamma, \omega(u))$, the map $\omega^\flat_u$ is an isomorphism from
$\ker T_u\alpha$ onto $T^*_u\Gamma_0$. By using that isomorphism for
each $u \in \Gamma_0$, we obtain a vector bundle isomorphism
of the Lie algebroid $\pi : A \to \Gamma_0$ onto the cotangent
bundle $\pi_{\Gamma_0} : T^*\Gamma_0 \to \Gamma_0$.

As seen in Corollary 5, the submanifold of units $\Gamma_0$
has a unique Poisson structure $\Pi$ for which $\alpha : \Gamma \to \Gamma_0$
is a Poisson map. Therefore, the cotangent bundle
$\pi_{\Gamma_0} : T^*\Gamma_0 \to \Gamma_0$ to the Poisson manifold $(\Gamma_0, \Pi)$ has a
Lie algebroid structure, with the bracket of 1-forms as
composition law. That structure is the same as the
structure obtained as a direct image of the Lie
algebroid structure of $\pi : A(\Gamma) \to \Gamma_0$, by the above-defined
vector bundle isomorphism of $\pi : A \to \Gamma_0$
onto the cotangent bundle $\pi_{\Gamma_0} : T^*\Gamma_0 \to \Gamma_0$. The Lie
algebroid of the symplectic groupoid $\Gamma \rightrightarrows \Gamma_0$ can
therefore be identified with the Lie algebroid
$\pi_{\Gamma_0} : T^*\Gamma_0 \to \Gamma_0$, with its Lie algebroid structure of
cotangent bundle to the Poisson manifold $(\Gamma_0, \Pi)$.


The Lie Algebroid of a Poisson Groupoid 


The Lie algebroid $\pi : A(\Gamma) \to \Gamma_0$ of a Poisson groupoid
has an additional structure: its dual bundle
$\varpi : A(\Gamma)^* \to \Gamma_0$ also has a Lie algebroid structure,
compatible in a certain sense (indicated below) with
that of $\pi : A(\Gamma) \to \Gamma_0$.

The compatibility condition between the two Lie
algebroid structures on the two vector bundles in
duality $\pi : A \to M$ and $\varpi : A^* \to M$ can be written as
follows:


$$d_*[X, Y] = \mathcal{L}(X)\,d_*Y - \mathcal{L}(Y)\,d_*X \qquad [21]$$


where $X$ and $Y$ are two sections of $\pi$, or, using the
generalized Schouten bracket of sections of exterior
powers of the Lie algebroid $\pi : A \to M$,


$$d_*[X, Y] = [d_*X, Y] + [X, d_*Y] \qquad [22]$$


In these formulas $d_*$ is the generalized exterior
derivative, which acts on the space of sections of
exterior powers of the bundle $\pi : A \to M$, considered
as the dual bundle of the Lie algebroid $\varpi : A^* \to M$.

These conditions are equivalent to the similar 
conditions obtained by exchange of the roles of A 
and A*. 

When the Poisson groupoid $\Gamma \rightrightarrows \Gamma_0$ is a symplectic
groupoid, we have seen that its Lie algebroid is
the cotangent bundle $\pi_{\Gamma_0} : T^*\Gamma_0 \to \Gamma_0$ to the Poisson
manifold $\Gamma_0$ (equipped with the Poisson structure for
which $\alpha$ is a Poisson map). The dual bundle is the
tangent bundle $\tau_{\Gamma_0} : T\Gamma_0 \to \Gamma_0$, with its natural Lie
algebroid structure defined earlier.

When the Poisson groupoid is a Poisson Lie group
$(G, \Pi)$, its Lie algebroid is its Lie algebra $\mathcal{G}$. Its dual
space $\mathcal{G}^*$ has a Lie algebra structure, compatible with
that of $\mathcal{G}$ in the above-defined sense, and the pair
$(\mathcal{G}, \mathcal{G}^*)$ is called a Lie bialgebra.

Conversely, if the Lie algebroid of a Lie groupoid 
is a Lie bialgebroid (i.e., if there exists on the dual 
vector bundle of that Lie algebroid a compatible 
structure of Lie algebroid, in the above-defined 
sense), that Lie groupoid has a Poisson structure 
for which it is a Poisson groupoid. 


Integration of Lie Algebroids 


According to Lie's third theorem, for any given
finite-dimensional Lie algebra, there exists a Lie
group whose Lie algebra is isomorphic to that
Lie algebra. The same property is not true for Lie
algebroids and Lie groupoids. The problem of
finding necessary and sufficient conditions under
which a given Lie algebroid is isomorphic to the Lie
algebroid of a Lie groupoid remained open for more
than 30 years, although partial results were
obtained. A complete solution of that problem was
recently obtained by M Crainic and R L Fernandes.
Let us briefly sketch their results.

Let $\pi : A \to M$ be a Lie algebroid and $\rho : A \to TM$ its
anchor map. A smooth path $a : I = [0, 1] \to A$ is said to
be admissible if, for all $t \in I$, $\rho \circ a(t) = (d/dt)(\pi \circ a)(t)$.
When the Lie algebroid $A$ is the Lie algebroid of a Lie
groupoid $\Gamma$, it can be shown that each admissible path
in $A$ is, in a natural way, associated to a smooth path in
$\Gamma$ starting from a unit and contained in an $\alpha$-fiber.
When we do not know whether $A$ is the Lie algebroid
of a Lie groupoid or not, the space of admissible paths
in $A$ still can be used to define a topological groupoid
$\mathcal{G}(A)$ with connected and simply connected $\alpha$-fibers,
called the Weinstein groupoid of $A$. When $\mathcal{G}(A)$ is a Lie
groupoid, its Lie algebroid is isomorphic to $A$, and
when $A$ is the Lie algebroid of a Lie groupoid $\Gamma$, $\mathcal{G}(A)$ is
a Lie groupoid and is the unique (up to an isomorphism)
Lie groupoid with connected and simply connected
$\alpha$-fibers with $A$ as Lie algebroid; moreover, $\mathcal{G}(A)$
is a covering groupoid of an open sub-groupoid of $\Gamma$.
Crainic and Fernandes have obtained computable
necessary and sufficient conditions under which the
topological groupoid $\mathcal{G}(A)$ is a Lie groupoid, that is,
necessary and sufficient conditions under which $A$ is
the Lie algebroid of a Lie groupoid.
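
To make the admissibility condition concrete, it may help to spell it out in the two simplest examples of Lie algebroids listed earlier. The notation $\gamma$ for the base path is introduced here only for illustration.

```latex
% Tangent bundle A = TM, anchor \rho = \mathrm{id}_{TM}, base path \gamma = \pi \circ a:
\rho \circ a(t) = \frac{d}{dt}(\pi \circ a)(t)
\iff a(t) = \dot\gamma(t),
\quad\text{i.e. } a \text{ is the velocity curve of its own base path.}

% Finite-dimensional Lie algebra A = \mathcal{G}, base reduced to a point, anchor \rho = 0:
\rho \circ a(t) = 0 = \frac{d}{dt}(\pi \circ a)(t)
\quad\text{for every smooth path } a : [0, 1] \to \mathcal{G},
\quad\text{so every smooth path is admissible.}
```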

Liquid crystals represent an important state of matter,
intermediate between regular solids with long-range 
positional order of atoms or molecules (often accom- 
panied by the orientational order, as in the case of 
molecular crystals) and isotropic fluids with neither 
positional nor orientational long-range order. The 
basic feature of liquid crystals is orientational order of 
building units, which might be individual molecules or 
their aggregates, and complete or partial absence of the 
long-range positional order. Molecular interactions
responsible for orientational order in liquid crystals are
relatively weak (most liquid crystals melt into the
isotropic phase at around 100-150°C). As a result,
the structural organization of liquid crystals, most 
importantly, the direction of molecular orientation, 
is very sensitive to the external factors, such as 
electromagnetic field and boundary conditions. This 
sensitivity opened the doors for applications of 
liquid crystals, including in information displays 
and flat-panel TVs. 

Liquid crystals, discovered more than 100 years 
ago, represent nowadays one of the best studied 
classes of soft matter, along with colloids, polymer 
solutions and melts, gels and foams. There is 
an extensive literature on physical phenomena in 
liquid crystals, their chemical structure and material 
parameters, display applications, etc. 


Thermotropic and Lyotropic Systems 


Depending on the way the liquid crystalline state 
(also known as “mesophase”) is produced, one 
distinguishes thermotropic and lyotropic liquid
crystals. The thermotropic liquid crystalline state can
exist in a certain temperature range for materials
made of strongly anisometric molecules, either 
elongated (calamitic molecules) or disk-like (discotic 
molecules). Upon heating, many substances of this 
type yield the following phase sequence: solid 
crystal-liquid crystal-isotropic fluid. 

Lyotropic liquid crystals form only in the presence of 
a solvent, such as water or oil. Most commonly, 
lyotropic mesophases are formed by solutions of 
anisometric amphiphilic molecules (such as soaps, 
phospholipids, and surfactants). Amphiphilic molecules 
have two distinct parts: a (polar) hydrophilic head and a 
(nonpolar) hydrophobic tail (generally, an aliphatic 
chain). This feature gives rise to a special “self- 
organization" of amphiphilic molecules in solvents. 
Mesomorphic states also might be formed in the 
solutions of certain polymers; polymers might also 
form thermotropic (solvent-free) liquid crystals. 

There are four basic types of liquid crystalline phases, 
classified according to the dimensionality of the trans- 
lational correlations of building units: nematic (no 
translational correlations), smectic (1D correlations), 
columnar (2D correlations), and various 3D-correlated 
structures, such as cubic phases and blue phases. 

“Uniaxial nematic,” denoted UN, is an optically
uniaxial fluid phase. The unit vector along the optic
axis is called the director $\mathbf{n}$, $\mathbf{n}^2 = 1$; it indicates the
average orientation of the molecular axes (see
Figure 1). Even when the molecules are polar,
head-to-head overlapping and flip-flops establish
centrosymmetric arrangement in the nematic bulk.
Thus, $\mathbf{n}$ and $-\mathbf{n}$ are equivalent notations. It is
important to realize that $\mathbf{n}$ specifies only the
direction of orientation but not the degree of
orientational order. In biaxial nematics (BN), the
symmetry point group is that of a prism. A BN
phase is characterized by three directors, $\mathbf{n}$, $\mathbf{l}$, and
$\mathbf{m} = \mathbf{n} \times \mathbf{l}$, such that $\mathbf{n} \equiv -\mathbf{n}$, $\mathbf{l} \equiv -\mathbf{l}$, and $\mathbf{m} \equiv -\mathbf{m}$.

Figure 1 (a) Nematic (uniaxial) type of ordering in thermotropic
liquid crystals; the molecular long axes are on average aligned
along the director $\mathbf{n}$; (b) a molecule of octylcyanobiphenyl, a
typical thermotropic liquid crystalline material capable of both
nematic and SmA types of ordering.

When the building unit (molecule or aggregate) is 
chiral, that is, not equal to its mirror image, UN 
might show a helicoidal structure. It is then called a 
cholesteric phase denoted Ch or N*. Note that UN, 
BN, and N* phases are liquid phases (no long-range 
correlations in molecular positions). 

"Smectics" are layered phases with a quasi-long- 
range 1D translational order of centers of molecules 
in a direction normal to the layers (see Figure 2). 
This positional order is not exactly the long-range 
order as in regular 3D crystals: as shown by Landau 
and Peierls, the fluctuative displacements of layers in 
1D lattice diverge logarithmically with the size of 
the sample. However, for regular materials with 
smectic period of the order of 1 nm, the effect is 
noticeable only on scales of 1 mm and larger. In
smectic A (SmA), the molecules within the layers 
show fluid-like arrangement, with no long-range 
in-plane positional order; it is a uniaxial medium 
with the optic axis perpendicular to the layers (see 
Figure 2). Some materials, such as octylcyanobiphenyl
(see Figure 1b), show both UN and SmA phases
(the latter at somewhat lower temperatures). In the lyotropic
version of SmA, the so-called lamellar $L_\alpha$ phase, the
amphiphilic molecules arrange into bilayers. If the
solvent is water, the exterior surfaces of the bilayer
are formed by polar heads; the hydrophobic tails are
hidden in the middle of the bilayer (note that
membranes of many biological cells are organized
in a similar way). The periodic structure of
alternating surfactant and water layers gives rise to
the $L_\alpha$ phase (see Figure 2). Interestingly, the
structure might retain its smectic ordering even
when strongly diluted, being stabilized by thermal
fluctuations of bilayers.

Figure 2 SmA type of ordering in the thermotropic SmA liquid
crystal (left) and the lyotropic analog, the $L_\alpha$ phase (right), formed by
an equidistant arrangement of amphiphilic bilayers in water.

Other types of smectics show in-plane order, 
caused, for example, by a collective tilt of the rod- 
like molecules with respect to the normals to the 
layers (the so-called SmC). In chiral materials, the 
tilt of the molecules might lead to the helicoidal 
structure; we do not consider them here, although 
the chiral SmC phase is of considerable interest for 
applications in fast-switching optical devices. 

“Columnar phases” are most frequently formed
by hexagonal packing of cylindrical aggregates, as in 
the case of thermotropic materials formed by disc- 
like molecules. The positional order is 2D only, as 
the intermolecular distances along the axes of the 
aggregates are not regular. 

“3D-correlated structures” demonstrate a periodic
structure along all three coordinates, but they are 
still different from the 3D crystals, as the periodicity 
is caused by the repetition of molecular orientations 
rather than by regular repetition of the molecular 
centers of mass. For example, in cubic lyotropic 
phases, the 3D network is formed by periodically 
curved layers of amphiphilic molecules; the mol- 
ecules are free to move within the layers. 


Order Parameter 


The concept of an order parameter (OP) has 
emerged in its modern form in the Landau model 
of phase transitions and has been later expanded to 
describe other features such as topologically stable 
defects in the ordered media. The OP of the liquid 
crystal can be related to the anisotropy of macro- 
scopic properties such as diamagnetic or dielectric 
susceptibility. Measuring these anisotropies allows 
one to determine the degree of orientational order. 
The magnetic measurements are especially conveni- 
ent compared with their electric counterparts, as in 
this case the local field acting on the molecules 
differs very little from the external field. In UN, the
components of the (symmetric) magnetic susceptibility
tensor $\chi$, in the frame in which the z-axis
is parallel to the director $\mathbf{n}$, read


$$\chi = \begin{pmatrix} \chi_\perp & 0 & 0 \\ 0 & \chi_\perp & 0 \\ 0 & 0 & \chi_\parallel \end{pmatrix} \qquad [1]$$

The quantity $\chi_a = \chi_\parallel - \chi_\perp$ is called the anisotropy
of the magnetic susceptibility. In most thermotropic
UNs, $\chi_\parallel < 0$ and $\chi_\perp < 0$ (diamagnetism), and $\chi_a > 0$,
so that $\mathbf{n}$ orients along the applied magnetic field. In
the isotropic phase, $\chi_a = 0$; in UN, $\chi_a$ is determined by
(1) molecular susceptibilities of individual molecules
and (2) degree of molecular order. For the latter, one
can choose the temperature-dependent quantity
$s(T) = (1/2)\langle 3\cos^2\theta - 1\rangle$, where $\theta$ is the angle
between the axis of an individual molecule and the
director $\mathbf{n}$ and $\langle\ldots\rangle$ means an average over molecular
orientations. The OP is thus the traceless symmetric
tensor $\mathbf{Q}$ with the components that vanish in the
isotropic phase, and are proportional to $\chi_a$ in the UN
phase:


$$\mathbf{Q} = Q \begin{pmatrix} -\chi_a/3 & 0 & 0 \\ 0 & -\chi_a/3 & 0 \\ 0 & 0 & 2\chi_a/3 \end{pmatrix} \qquad [2]$$

One can choose the constant $Q$ in such a way
that in an arbitrary coordinate system, where
$\chi_{ij} = \chi_\perp \delta_{ij} + \chi_a n_i n_j$, one has


$$Q_{ij} = s(T)\left(n_i n_j - \tfrac{1}{3}\delta_{ij}\right) \qquad [3]$$


The tensor OP allows one to describe the biaxial
nematic phase as well:


$$Q_{ij} = s(T)\left(n_i n_j - \tfrac{1}{3}\delta_{ij}\right) + b(T)\left(l_i l_j - m_i m_j\right) \qquad [4]$$


where $\mathbf{n}$, $\mathbf{l}$, and $\mathbf{m}$ are three orthogonal directors and
$b$ is the “biaxiality parameter”; $b = 0$ in UN.
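
The definitions of the scalar and tensor order parameters can be tried out numerically. The following is a minimal sketch, assuming unit, mutually orthogonal directors; the function names `scalar_order` and `q_tensor` and the sample values of s and b are hypothetical. It evaluates $s = (1/2)\langle 3\cos^2\theta - 1\rangle$ for an isotropic and a perfectly aligned distribution, builds the tensor of the form [3] and [4], and lets one confirm that it is traceless.

```python
import math

def scalar_order(thetas):
    """s = (1/2) <3 cos^2(theta) - 1>, averaged over the given molecular angles."""
    return sum(0.5 * (3.0 * math.cos(t) ** 2 - 1.0) for t in thetas) / len(thetas)

def q_tensor(n, s, l=None, m=None, b=0.0):
    """Q_ij = s (n_i n_j - delta_ij / 3) + b (l_i l_j - m_i m_j); uniaxial if b = 0."""
    Q = [[s * (n[i] * n[j] - (1.0 / 3.0 if i == j else 0.0)) for j in range(3)]
         for i in range(3)]
    if l is not None and m is not None:
        for i in range(3):
            for j in range(3):
                Q[i][j] += b * (l[i] * l[j] - m[i] * m[j])
    return Q

# Isotropic distribution: cos(theta) uniform on [-1, 1], sampled on a midpoint grid.
N = 1000
isotropic = [math.acos(-1.0 + (2 * k + 1) / N) for k in range(N)]
s_isotropic = scalar_order(isotropic)      # close to 0
s_aligned = scalar_order([0.0] * 10)       # all molecules along the director

n, l, m = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
Q_uni = q_tensor(n, s=0.6)                 # illustrative value of s
Q_bi = q_tensor(n, s=0.6, l=l, m=m, b=0.1) # illustrative value of b
trace_uni = sum(Q_uni[i][i] for i in range(3))
trace_bi = sum(Q_bi[i][i] for i in range(3))
```

In the uniaxial case the director is an eigenvector of the tensor with eigenvalue 2s/3, which matches the diagonal form written in the frame where the z-axis is along the director.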


Elasticity of the Nematic Phase 


In real samples of liquid crystals, the average 
molecular orientation changes from point to point 
because of the external fields, boundary conditions, 
presence of foreign particles, etc. The OP becomes 
spatially nonuniform, $Q_{ij}(\mathbf{r})$. In most problems of
practical interest, the typical scale of distortions is 
much larger than the molecular scale; the deforma- 
tions are weak in the sense that the scalar part of the 
OP, s(T), remains constant despite the spatial 
gradients of the director field n(r). 

Figure 3 Basic types of director distortions in the bulk of the
uniaxial nematic.

The free-energy density associated with the (small)
deformations of the UN, classified as splay, twist,
and bend of the director (see Figure 3), is written in
terms of the director gradients $n_{i,j} = \partial n_i/\partial x_j$ as


$$f_{FO} = \tfrac{1}{2}K_1(\operatorname{div}\mathbf{n})^2 + \tfrac{1}{2}K_2(\mathbf{n}\cdot\operatorname{curl}\mathbf{n})^2 + \tfrac{1}{2}K_3(\mathbf{n}\times\operatorname{curl}\mathbf{n})^2 \qquad [5]$$


and is known as the Frank-Oseen energy density with
Frank elastic constants of splay ($K_1$), twist ($K_2$), and
bend ($K_3$); all three are necessarily positive and have
the dimension of a force. The elastic constants
can be estimated as the typical energy of molecular
interactions responsible for the orientational order
divided by the characteristic length (a molecular size):
$K \sim U/l \sim k_BT/l \sim 4 \times 10^{-21}\,\mathrm{J}/10^{-9}\,\mathrm{m} = 4$ pN, which
yields a good estimate for many thermotropic UNs,
as the experimental values are between 1 and 10 pN.
The energy density [5] is often supplemented with the
so-called divergence terms:


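$$f_{13} + f_{24} = K_{13}\operatorname{div}(\mathbf{n}\operatorname{div}\mathbf{n}) - K_{24}\operatorname{div}(\mathbf{n}\operatorname{div}\mathbf{n} + \mathbf{n}\times\operatorname{curl}\mathbf{n}) \qquad [6]$$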


The $K_{24}$ term can be re-expressed as a quadratic
form of the first derivatives whereas the $K_{13}$ term is
proportional to the second derivatives $n_{i,jk}$ and thus
might in principle be comparable to $f_{FO} \sim n_{i,j}n_{k,l}$.
The volume integrals of these terms can be
re-expressed as surface integrals by virtue of
the Gauss theorem (but only when the elastic moduli
$K_{13}$ and $K_{24}$ are constant, which might not be the
case at certain interfaces and at the core of defects).
Therefore, when one seeks equilibrium director
configurations by minimizing the total free-energy
functional $\int (f_{FO} + f_{13} + f_{24})\,dV$, the $K_{13}$ and $K_{24}$
terms do not enter the Euler-Lagrange variational
derivative for the bulk. However, they can
contribute to the energy and influence the equilibrium
director through boundary conditions at the
surface. Usually, the $K_{24}$ term is retained when the
system experiences a topological change of the
director field. The $K_{13}$ term is often neglected;
very little is known about the value of $K_{13}$.

In the presence of an external field, the free-energy
density acquires additional terms. For example, for
the magnetic field $\mathbf{B}$, the energy density [5], [6] should
be supplemented by the term $-(1/2)\mu_0^{-1}\chi_a(\mathbf{B}\cdot\mathbf{n})^2$,
where $\mu_0 = 4\pi \times 10^{-7}$ H m$^{-1}$ is the magnetic permeability
of free space (magnetic constant).
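
The splay, twist, and bend contributions to the Frank-Oseen density [5] can be checked on simple director fields. The sketch below is only illustrative: it estimates the director gradients by central finite differences, and the elastic constants, the twist wavenumber q, the sample point, and all function names are assumptions chosen in arbitrary units. It evaluates the density for a uniformly twisted field, a planar radial field (pure splay), and a planar circular field (pure bend).

```python
import math

def gradient(n, r, h=1e-5):
    """g[i][j] = d n_i / d x_j at point r, estimated by central differences."""
    g = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(r), list(r)
        plus[j] += h
        minus[j] -= h
        n_plus, n_minus = n(plus), n(minus)
        for i in range(3):
            g[i][j] = (n_plus[i] - n_minus[i]) / (2.0 * h)
    return g

def frank_oseen(n, r, K1, K2, K3):
    """Frank-Oseen density: splay (div n), twist (n . curl n), bend (n x curl n)."""
    g = gradient(n, r)
    nx, ny, nz = n(r)
    div = g[0][0] + g[1][1] + g[2][2]
    cx, cy, cz = g[2][1] - g[1][2], g[0][2] - g[2][0], g[1][0] - g[0][1]
    twist = nx * cx + ny * cy + nz * cz
    bx, by, bz = ny * cz - nz * cy, nz * cx - nx * cz, nx * cy - ny * cx
    return 0.5 * K1 * div ** 2 + 0.5 * K2 * twist ** 2 + 0.5 * K3 * (bx * bx + by * by + bz * bz)

q = 2.0  # illustrative twist wavenumber

def twisted(r):   # director rotating about the z-axis: pure twist
    return (math.cos(q * r[2]), math.sin(q * r[2]), 0.0)

def radial(r):    # planar radial field: pure splay
    rho = math.hypot(r[0], r[1])
    return (r[0] / rho, r[1] / rho, 0.0)

def circular(r):  # planar circular field: pure bend
    rho = math.hypot(r[0], r[1])
    return (-r[1] / rho, r[0] / rho, 0.0)

K1, K2, K3 = 1.0, 2.0, 3.0      # arbitrary units, for illustration only
point = (1.0, 2.0, 0.7)         # distance from the z-axis squared: rho^2 = 5
f_twist = frank_oseen(twisted, point, K1, K2, K3)   # analytic value: K2 q^2 / 2
f_splay = frank_oseen(radial, point, K1, K2, K3)    # analytic value: K1 / (2 rho^2)
f_bend = frank_oseen(circular, point, K1, K2, K3)   # analytic value: K3 / (2 rho^2)
```

Each test field excites exactly one term of the density, so each result isolates one Frank constant.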

The possibility to orient the director by an applied 
electric or magnetic field leads to numerous practical 
applications. Any actual liquid crystal cell is 
confined; say, by a pair of parallel glass plates. The 
molecular interactions between the liquid crystal 
and the boundary substrates are anisotropic. This 
anisotropy establishes one (sometimes more) preferred
orientation of $\mathbf{n}$ at the boundary, the so-called
“easy axis.” The phenomenon is called “surface
anchoring.” Orienting action of the substrates 
usually keeps the director uniform if the external 
field is absent. However, the external field can 
overcome both the “anchoring” at the surfaces and
the elasticity of the nematic bulk and reorient the 
director. This is the “Frederiks effect," first dis- 
covered for the magnetic case. When the field is 
removed, the surface anchoring restores the original 
director structure. Thus, one can use the external 
field and surface anchoring to switch the liquid 
crystal orientation back and forth. The dielectric
version of the effect is used in electrooptic devices, 
including displays. The liquid crystal is usually 
sandwiched between two transparent electroconduc- 
tive plates (e.g., glass covered with indium tin oxide) 
coated with a suitable alignment layer. The voltage 
across the cell controls the director configuration 
and thus the optical properties of the cell. 


Elasticity of the Smectic A Phase 


For the SmA phase, the elastic free-energy density
should be modified to take into account (1)
restrictions that the layered structure imposes onto
the director twist and bend, and (2) elastic cost of
changes in the thickness of the layers:


$$f = \tfrac{1}{2}K_1(\operatorname{div}\mathbf{n})^2 + \tfrac{1}{2}B\gamma^2 \qquad [7]$$


where $B$ is the Young modulus (layer compressibility
modulus) and $\gamma = (d - d_0)/d_0$ is the relative
difference between the equilibrium period $d_0$ and
the actual layer thickness $d$ measured along the
director $\mathbf{n}$. The ratio of $K_1$ to $B$ defines an important
length scale


$$\lambda = \sqrt{K_1/B} \qquad [8]$$


called “the penetration length”; $\lambda$ is of the order
of the layer separation but diverges when the
system approaches the SmA-nematic transition.
The splay constant $K_1$ in the SmA phase is of the
same order as in a nematic phase stable at higher
temperatures. With $\lambda \approx d_0 \approx (1 \div 3)$ nm, one finds
$B \sim 10^6 \div 10^7$ N/m$^2$, a value that is $10^3$ to $10^4$ times
smaller than the compressibility modulus in a solid.
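
The order-of-magnitude estimates quoted for the Frank constants and for the layer compressibility modulus are quick to reproduce. The sketch below assumes room temperature (300 K) and a splay constant of 10 pN, the upper end of the experimental range mentioned for nematics; both values are assumptions made for illustration, not figures given for a specific material.

```python
k_B = 1.380649e-23       # Boltzmann constant, J/K
T = 300.0                # assumed room temperature, K
l = 1e-9                 # molecular length scale of about 1 nm, in m

# Frank constant estimate K ~ k_B T / l, a force in newtons.
K_estimate = k_B * T / l

# Compressibility modulus from the penetration length: B = K1 / lambda^2.
K1 = 10e-12              # assumed splay constant, 10 pN
B_values = {lam: K1 / lam ** 2 for lam in (1e-9, 3e-9)}   # N/m^2

print(f"K ~ {K_estimate * 1e12:.1f} pN")
for lam, B in B_values.items():
    print(f"lambda = {lam * 1e9:.0f} nm  ->  B ~ {B:.1e} N/m^2")
```

A smaller assumed splay constant would lower the resulting modulus proportionally, so the result should be read as an order-of-magnitude statement only.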

The SmA elastic free-energy density is often
written in terms of the mean curvature
$H = (1/2)(\sigma_1 + \sigma_2)$ and the Gaussian curvature
$G = \sigma_1\sigma_2$ of the layers:


$$f = \tfrac{1}{2}K_1(\sigma_1 + \sigma_2)^2 + \bar{K}\sigma_1\sigma_2 + \tfrac{1}{2}B\gamma^2 \qquad [9]$$


As compared with eqn [7], it is supplemented by the
divergence saddle-splay term with modulus $\bar{K}$; $-2K_1 \le \bar{K} \le 0$ (for
the system of flat layers to be energetically stable);
$\sigma_1 = 1/R_1$ and $\sigma_2 = 1/R_2$ are the local values of the
principal curvatures of the smectic layers.


Dynamics 


Liquid crystals are fluids; they can flow preserving 
the orientational order. Flow imposes an orienta- 
tional torque on the liquid crystals. Most often, the 
director tends to realign along the direction of flow. 
There is also an inverse effect: director distortions 
can cause the flow. This “backflow” effect is of 
importance in liquid crystal displays. In the approxi- 
mation of a constant scalar OP, the hydrodynamics 
of liquid crystals is described in terms of seven 
unknown variables: (1) mass density $\rho(\mathbf{r}, t)$, (2) three
components of the velocity field $\mathbf{v}(\mathbf{r}, t)$, (3) energy
density, and (4) two components of the director field
$\mathbf{n}(\mathbf{r}, t)$. These variables are found from seven
equations:


1. conservation of mass, 

2. three equations for the conserved components of 
the linear momentum, 

3. entropy balance equation, and 

4. two director dynamics equations. 


In contrast to an isotropic fluid, the stress tensor 
depends not only on the gradients of the velocity, 
but also on the director components. UN phase 
should be characterized by five different viscosity 
constants. The number of viscosities reduces to 
three, when the director distortions are small. 
These three can be chosen as the effective viscosities 
for three idealized geometries of flow, also known as 
Miesowicz geometries, in which one assumes that
the director is fixed (e.g., by a strong magnetic field) 
(see Figure 4): 

When $\mathbf{n} = (1, 0, 0)$ is perpendicular to both the
flow direction and the velocity gradient, the UN
behaves as an isotropic fluid with a viscosity $\eta_a$;
however, director fluctuations coupled with
certain values of the viscosity coefficients might
destabilize the initial director orientation (see
Figure 4a). When $\mathbf{n}$ is parallel to the flow
(Figure 4b) or parallel to the velocity gradient
(Figure 4c), the corresponding viscosities $\eta_b$ and $\eta_c$
are generally different from $\eta_a$ and from each other;
$\eta_b < \eta_a < \eta_c$ for a typical thermotropic UN material
composed of rod-like elongated molecules. The
result $\eta_b < \eta_c$ can be explained by assuming that
the friction correlates with the cross section of the
molecules seen by the flow.

Figure 4 Miesowicz geometries for effective viscosities of the
uniaxial nematic.


Topological Defects 
Experimental Observations 


When a thick UN sample (say, 100 μm thick) with
no special aligning layers is viewed under the 
microscope, one usually observes a number of 
mobile flexible lines, the so-called disclinations. 
The disclinations are seen as thin and thick threads 
(see Figure 5). Thin threads strongly scatter light and 
show up as sharp lines. These are truly topologically 
stable defect lines, along which the nematic sym- 
metry of rotation is broken. The disclinations are 
topologically stable in the sense that no continuous 
deformation can transform them into a uniform 
state, n(r) = const. Thin disclinations are singular in
the sense that the director is not defined along the 
core of the defect line. Thick threads are line 
defects only in appearance; they are not singular 
disclinations. The director is smoothly curved and 
well defined everywhere, except, perhaps, at a 
number of point defects, the so-called hedgehogs 
(see Figure 5). 

Figure 5 (a) Thin singular disclinations and thick nonsingular
threads in the nematic (n-pentylcyanobiphenyl (5CB)) bulk.
Crossed polarizers; (b, c) typical director configurations
associated with thin and thick lines; thick lines are often
associated with point defects in the nematic bulk (hedgehogs).


In thin UN samples (1-50 μm) with the director
tangential to the bounding plates, the disclinations
are often perpendicular to the plates. Under
a microscope with two crossed polarizers, one
can see the ends of the disclinations as centers
with emanating pairs of dark brushes (see Figure 6),
giving rise to the so-called "Schlieren texture." The
dark brushes display the areas where n is either in
the plane of polarization of light or in the perpendicular
plane. The director rotates by an angle ±π
when one goes around the end of the disclination at
the surface. Centers with four emanating brushes are
also observed; they correspond to point defects
located at the surface, the so-called boojums (see
Figure 6). The director undergoes a ±2π rotation
around these four-brush centers. The principal
difference between the centers with two brushes
(ends of singular lines) and centers with four brushes
(surface point defects) can be seen after a gentle shift
of one of the bounding plates with respect to the
other. Upon shear-induced separation in the plane of
observation, the centers with two brushes are clearly
seen as connected by a singular trace, the disclination,
while the centers with four brushes separate without
a visible singularity between them.

Figure 6 Schlieren texture of a thin (13 μm) slab of 5CB.
Centers with two and four brushes are the ends of singular
disclinations and point defects (boojums), respectively.
Tangential director orientation. Crossed polarizers.


The intensity of linearly polarized light coming
through a uniform UN slab depends on the angle $\beta$
between the polarization direction and the projection
of the director n onto the slab's plane:


\[ I = I_0 \sin^2 2\beta \, \sin^2\!\left[\frac{\pi d}{\lambda}\left(n_{e,\mathrm{eff}} - n_o\right)\right] \qquad [10] \]


where $I_0$ is the intensity of incident light, $d$ is the
slab thickness, $\lambda$ is the wavelength of the light, and
$n_{e,\mathrm{eff}}$ is the effective refractive index that depends
on the ordinary index $n_o$, the extraordinary index $n_e$,
and the director orientation. Equation [10] allows one
to relate the number $|k|$ of director rotations by $\pm 2\pi$
around the defect core to the number B of brushes:


\[ |k| = B/4 \qquad [11] \]


Taken with a sign that specifies the direction of
rotation, k is called the "strength of disclination,"
and is related to a more general concept of a
topological charge (but does not coincide with it).
Note that I = 0 when n is perpendicular to the plates
(the so-called homeotropic state), as $n_{e,\mathrm{eff}} = n_o$. The
homeotropic state is used as one of the ground
states in modern flat-panel TV sets. By applying an
electric field, one tilts the director so that $n_{e,\mathrm{eff}} \neq n_o$
and the cell (or the corresponding pixel in the liquid
crystal panel) becomes transparent.
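
As a rough numerical illustration of eqns [10] and [11], the following Python sketch evaluates the transmitted intensity and converts a brush count into |k|. It assumes that the phase factor in [10] is πd/λ, with d the slab thickness; the function names and all numerical values (thickness, wavelength, refractive indices) are illustrative assumptions, not data from the text.

```python
import math
from fractions import Fraction


def transmitted_intensity(beta, d, wavelength, n_e_eff, n_o, i0=1.0):
    """Intensity between crossed polarizers, assuming the form
    I = I0 * sin^2(2*beta) * sin^2(pi * d * (n_e_eff - n_o) / wavelength)."""
    phase = math.pi * d * (n_e_eff - n_o) / wavelength
    return i0 * math.sin(2.0 * beta) ** 2 * math.sin(phase) ** 2


def strength_magnitude(brushes):
    """|k| = B/4: number of 2*pi director rotations from the brush count B."""
    return Fraction(brushes, 4)


# Hypothetical sample: 13 um slab, green light, assumed refractive indices.
d, wavelength, n_o, n_e_eff = 13e-6, 550e-9, 1.5, 1.6

# Director at 45 degrees to the polarizer: the angular factor is maximal.
bright = transmitted_intensity(math.pi / 4, d, wavelength, n_e_eff, n_o)
# Director along the polarizer: a dark brush.
dark = transmitted_intensity(0.0, d, wavelength, n_e_eff, n_o)
# Homeotropic state: n_e_eff equals n_o, so the slab is dark for any beta.
homeotropic = transmitted_intensity(math.pi / 4, d, wavelength, n_o, n_o)

print(bright, dark, homeotropic)
print(strength_magnitude(2), strength_magnitude(4))  # two- and four-brush centers
```

The two vanishing cases correspond to the dark brushes of the Schlieren texture and to the homeotropic ground state, respectively.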


Nematic Droplets 


When left intact, textures with defects in flat samples
relax into a more or less uniform state. Disclinations
with positive and negative k find each other and
annihilate. There are, however, situations when the
equilibrium state requires topological defects.
Nematic droplets suspended in an isotropic matrix
such as glycerin, water, polymer, etc. (see Figure 7),
and inverted systems, such as water droplets in a
nematic matrix, are the most evident examples.
Consider a spherical nematic droplet of
radius R and the balance of the surface anchoring
energy $\sim W_a R^2$ ($W_a$ is the surface anchoring
coefficient) and the elastic energy $\sim KR$; K is
some averaged Frank constant. Small droplets
with $R \ll K/W_a$ avoid spatial variations of n at
the expense of violated boundary conditions. In
contrast, large droplets, $R \gg K/W_a$, satisfy
boundary conditions by aligning n along the
preferred direction(s) at the surface. Since the
surface is a sphere, the result is the distorted
director in the bulk, for example, a radial hedgehog
when the surface orientation is normal (see Figure 7).
The characteristic radius $K/W_a$ is macroscopic (microns),
as $K \sim 10$ pN and $W_a \sim 10^{-5}$ to $10^{-6}$ J m$^{-2}$. Point
defects in large nematic droplets must satisfy
restrictions on their topological characteristics that have
their roots in the Poincaré and Gauss theorems of
differential geometry.


Figure 7 Polarizing-microscope texture of spherical nematic
droplets suspended in glycerin. (a) The director configuration is
radial and normal to the spherical surface; the inset shows the
point-defect hedgehog in the center of the droplet. (b) Tangential
director orientation at the interface results in the bipolar structure
with two defects (boojums) at the poles. The director is twisted
because of the smallness of the twist elastic constant as
compared to the splay and bend constants.
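
The order-of-magnitude estimate of the crossover radius can be checked with a few lines of Python. This is only a sketch: it takes K of about 10 pN and an anchoring coefficient between 10^-6 and 10^-5 J/m^2 (our reading of the quoted orders of magnitude), and the sharp threshold used in `droplet_regime` is a simplifying assumption, since the text only compares the two limiting cases.

```python
def anchoring_length(K, W_a):
    """Crossover radius K / W_a, in metres, where the elastic energy ~K*R
    and the anchoring energy ~W_a*R^2 are comparable."""
    return K / W_a


def droplet_regime(R, K, W_a):
    """Crude classification; the sharp threshold at R = K/W_a is an
    assumption (the text only contrasts R << K/W_a with R >> K/W_a)."""
    if R < anchoring_length(K, W_a):
        return "nearly uniform director, anchoring violated"
    return "anchoring satisfied, distorted director"


K = 10e-12  # averaged Frank constant, 10 pN
for W_a in (1e-5, 1e-6):  # anchoring coefficient, J/m^2
    length_um = anchoring_length(K, W_a) * 1e6
    print(f"W_a = {W_a:.0e} J/m^2 -> K/W_a = {length_um:.0f} um")
```

Both values fall in the micron range, which is why droplets of optical-microscopy size typically sit in the regime where defects are required.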


Topological Classification 
of Defects in UN 


The language of topology, or, more precisely, of
homotopy theory, allows one to relate the
character of ordering of a medium to the types of
defects arising in it, to find the laws of decay,
merger and crossing of defects, to trace out their
behavior during phase transitions, etc. The key role
is played by the concept of a "topological invariant,"
also called a "topological charge," which is
inherent in every defect. The stability of the defect is
guaranteed by the conservation of its charge.
Homotopy classification of defects includes three
steps.

First, one defines the OP of the system. In a 
nonuniform state, the OP is a function of 
coordinates. 

Second, one determines the OP (or degeneracy)
space R, that is, the manifold of all possible values
of the OP that do not alter the thermodynamic
potentials of the system. In the UN, R is a unit
sphere with pairs of diametrically opposite points
identified, denoted $S^2/Z_2$ (also called the projective
plane $RP^2$). Every point of $S^2/Z_2$
represents a particular orientation of n. Since
n ≡ −n, any two diametrically opposite points of
the sphere describe the same state.

The function n(r) maps the points of the nematic
volume into $S^2/Z_2$. The mappings of interest are
those of i-dimensional "spheres" enclosing defects.
A line defect is enclosed by a linear contour, i = 1; a
point defect is enclosed by a sphere, i = 2, etc.

Third, one defines the homotopy groups $\pi_i(R)$.
The elements of these groups are homotopy classes of
mappings of i-dimensional spheres enclosing the defect
in real space into the OP space. To classify the defects of
dimensionality t′ in a t-dimensional medium, one
has to know the homotopy group $\pi_i(R)$ with
$i = t - t' - 1$.

Each element of $\pi_i(R)$ corresponds to a class of
topologically stable defects; all these defects are
equivalent to one another under continuous
deformations. The elements of homotopy groups
are topological charges of the defects. For UN,
the homotopy group $\pi_1(S^2/Z_2) = Z_2 = \{0, 1/2\}$ is
composed of two elements; there is thus only one
class of topologically stable defects (that appear
as thin singular lines under the microscope, see
Figure 5) with the addition rules $1/2 + 1/2 = 0$
and $1/2 + 0 = 1/2$ describing the interaction of
disclinations. The topological point defects in the
bulk (hedgehogs) are described by the second
homotopy group, $\pi_2(S^2/Z_2) = Z = \{0, 1, 2, \ldots\}$, and
can be labeled by integer topological charges. The
simplest point defect is a "radial" hedgehog, seen
in the center of the radial droplet (see Figure 7a).
Boojums are special point defects that, in contrast
to hedgehogs, can exist only at the boundary of
the medium (see Figure 7b).
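
The two-element addition rule for line defects is simply arithmetic modulo 1 on the charges 0 and 1/2. The short Python sketch below makes this explicit; the function names are hypothetical, and the mapping from a signed strength k to a charge (k modulo 1) is our reading of the rule, consistent with the statement that only half-integer lines are topologically stable.

```python
from fractions import Fraction


def merge_disclinations(*charges):
    """Combine Z_2 charges (0 or 1/2) of line defects in a uniaxial nematic.

    Charges add modulo 1, so 1/2 + 1/2 = 0 and 1/2 + 0 = 1/2.
    """
    total = sum((Fraction(c) for c in charges), Fraction(0))
    return total % 1


def line_charge_from_strength(k):
    """Map a disclination strength k (a multiple of 1/2, either sign)
    to its Z_2 charge: half-integer k gives 1/2, integer k gives 0."""
    return Fraction(k) % 1


half = Fraction(1, 2)
print(merge_disclinations(half, half))   # two stable lines can annihilate
print(merge_disclinations(half, 0))      # a single stable line survives
print(line_charge_from_strength(-half))  # sign of k does not matter
print(line_charge_from_strength(1))      # integer lines are trivial
```

In this picture a line of integer strength carries the trivial charge, which is the topological counterpart of its ability to relax into a nonsingular configuration.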

The relative stability of topologically stable disclinations
depends on the Frank elastic constants of splay
($K_{11}$), twist ($K_{22}$), bend ($K_{33}$), and saddle-splay
($K_{24}$) in the Frank-Oseen elastic free-energy
density functional; the role of the elastic constant
$K_{13}$ in the structure of defects is not clarified yet.

Consider the simplest case of "planar" disclinations
with n perpendicular to the line. In this case,
the $K_{24}$-term in the line's energy is zero. Assuming
$K_{11} = K_{22} = K_{33} = K$, by minimizing the bulk integral
of [5], one finds the equilibrium director configuration
around the line of strength k:


\[ \mathbf{n} = \{\cos(k\varphi + c),\ \sin(k\varphi + c),\ 0\} \qquad [12] \]


where $\varphi = \arctan(y/x)$, x and y are Cartesian coordinates
normal to the line, and c is a constant. The energy
per unit length of a straight planar disclination is


\[ F_{1l} = \pi K k^2 \ln\frac{L}{r_c} + F_c \qquad [13] \]


where L is the characteristic size of the system, and $r_c$
and $F_c$ are, respectively, the radius and the energy of
the disclination core, a region in which the distortions
are too strong to be described by a phenomenological
theory.
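
To make the planar solution concrete, the following Python sketch samples the director angle kφ + c on a circle around the core and recovers the strength k from the accumulated rotation, then evaluates the logarithmic line energy. It assumes the forms n = (cos(kφ + c), sin(kφ + c), 0) and πKk² ln(L/r_c) + F_c for eqns [12] and [13]; the function names and the sample values of K, L, and r_c are illustrative assumptions.

```python
import math


def director_angle(x, y, k, c=0.0):
    """Orientation angle of n = (cos(k*phi + c), sin(k*phi + c), 0)."""
    return k * math.atan2(y, x) + c


def measured_strength(k, c=0.0, samples=720):
    """Recover k by summing director rotations along a circle around the core.

    Increments are reduced modulo pi because n and -n are equivalent.
    """
    total = 0.0
    previous = director_angle(1.0, 0.0, k, c)
    for i in range(1, samples + 1):
        phi = 2.0 * math.pi * i / samples
        current = director_angle(math.cos(phi), math.sin(phi), k, c)
        step = current - previous
        step -= math.pi * round(step / math.pi)  # fold into [-pi/2, pi/2]
        total += step
        previous = current
    return total / (2.0 * math.pi)


def line_energy(K, k, L, r_c, F_c=0.0):
    """Energy per unit length: pi * K * k^2 * ln(L / r_c) + F_c."""
    return math.pi * K * k * k * math.log(L / r_c) + F_c


print(measured_strength(0.5), measured_strength(-0.5), measured_strength(1.0))

# Assumed values: K = 10 pN, L = 100 um, core radius 10 nm, core energy ignored.
e_half = line_energy(1e-11, 0.5, 1e-4, 1e-8)
e_one = line_energy(1e-11, 1.0, 1e-4, 1e-8)
print(e_half, e_one, e_one / e_half)
```

Because the elastic part scales as k², a planar line of strength 1 costs four times as much as a line of strength 1/2 when the core energy is neglected.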

The restriction to planar director distortions does
not allow the model to grasp the crucial difference
between half-integer and integer k's. The lines of
integer k, as already discussed, are fundamentally
unstable, as the director can be reoriented along the
axis. This "escape in the third dimension" is usually
energetically favorable, since the singular core is
eliminated. When opposite directions of the
"escape" meet, a point defect hedgehog is formed,
as illustrated in Figure 5c.


Unlike point defects such as vacancies in
solids, topological point defects in nematics
cause disturbances over the whole volume.
The curvature energy of the point defect is
proportional to the size R of the system. For
example, for the radial hedgehog with
$\mathbf{n} = (x, y, z)/\sqrt{x^2 + y^2 + z^2}$ and the hyperbolic
hedgehog with $\mathbf{n} = (-x, -y, z)/\sqrt{x^2 + y^2 + z^2}$,
one finds, respectively,


\[ F_{rh} = 8\pi R\,(K_{11} - K_{24}) + F_{c,rh} \quad \text{and} \quad F_{hh} = 8\pi R \left( \frac{K_{11}}{5} + \frac{2K_{33}}{15} + \frac{K_{24}}{3} \right) + F_{c,hh} \qquad [14] \]


Defects in Smectics 


The layered structure of smectics leads to linear
defects of positional order, dislocations, in addition
to disclinations. There is also a special class
of distortions known as focal conic domains
(FCDs) that are associated with large-scale curvatures
of layers. Imagine that because of the
boundary conditions, flow, or external fields,
the smectic layers are curved over a scale much
larger than the thickness of the layers. It is easy
to see from eqn [9] that the curved layers will
prefer to maintain their equidistance, as the
curvature energy is much smaller than the layer
dilation energy at large scales of deformation.
Generally, the family of equidistant curved
surfaces is associated with the focal surfaces at
which the principal curvatures diverge. These
focal surfaces are thus energetically very costly.
A radical way to reduce the elastic energy would
be to decrease the dimensionality of the focal
surfaces, say, by transforming them into lines and
points. The latter case corresponds simply to a
system of concentric spherical layers. The former
is more complicated and corresponds to FCDs in
which the focal surfaces are represented by pairs
of confocal lines: an ellipse and a hyperbola (limiting
case: a circle and a straight line), or a pair of
confocal parabolae. Experiments confirm that the
FCDs are the most frequent type of structural
deformations in smectic materials (see Figure 8).


Figure 8 SmA phase with FCDs based on the confocal pairs
of ellipses and hyperbolas; the scheme on the right shows
the arrangement of the elliptic bases and smectic layers
wrapped around the confocal pairs of defects. Reproduced
from Lavrentovich OD (2003) In: Arodz et al. (eds.) Patterns of
Symmetry Breaking. Dordrecht: Kluwer Academic Publishers,
with kind permission of Springer Science and Business Media.


Conclusion 


To summarize, over the last few decades, liquid
crystals have been transformed from a mysterious and
curious form of condensed matter into a key
technological material, thanks to progress in
the understanding of their elastic, optical, and
viscous properties. However, the intrinsic complexity
of these materials still leaves plenty of
room for further studies, not only of an applied
nature, but also fundamental. In the field of
thermotropic liquid crystals, researchers continue
to discover new types of structural organization,
such as the phases formed by "banana-shaped"
molecules that are dramatically different from the
phases formed by "regular" rod-like and disk-like
molecules. Work continues to sharpen
our understanding of even the "old" problems, such
as mechanisms of surface alignment and the nature and
quantitative values of the elastic constants $K_{13}$, $K_{24}$,
and K. Even in the case of the electric Frederiks
effect that is at the heart of modern applications, the
search continues, as the corresponding process of
director reorientation is generally very complex. In
addition to the dielectric torque, it is controlled by
various factors, for example, a nonlocal character of
the electric field in the anisotropic medium, finite
electric conductivity, the flexoelectric effect (i.e., electric
polarization brought about by the director deformations),
surface electric polarization at the bounding
plates, dependence of the dielectric and other
material properties on the frequency of the applied
field, which might be comparable with the
characteristic frequency of dielectric relaxation,
coupling of the director reorientation and the material's
flows, appearance of topological defects, etc. Many
research efforts nowadays are focused on composite
systems, such as liquid crystal colloids and
polymer-liquid crystal composites. Over the next decade or so,
one would expect that the emphasis in fundamental
studies will gradually shift from the thermotropic
liquid crystals to their lyotropic counterparts, as the
lyotropic type of orientational order is featured by
many systems of biological significance, such as
solutions of DNA, F-actin, etc.


See also: Non-Newtonian Fluids; Topological Defects 
and Their Homotopy Classification. 
Introduction


Using Lagrange multipliers, the smallest and
the largest eigenvalue of a symmetric quadratic form


\[ Q(u) = \sum_{j,k=1}^{n} a_{jk} u_j u_k \qquad (a_{jk} = a_{kj}) \]


can be obtained by minimizing and maximizing Q
on the unit sphere $S^{n-1} = \{u \in \mathbb{R}^n : \|u\| = 1\}$. If the
corresponding extremum is reached at $u^*$, then $u^*$ is
an associated eigenvector.
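
The variational characterization can be tried numerically. The sketch below is a minimal illustration, not the historical method: it moves along the gradient of Q and renormalizes onto the unit sphere at every step (a projected gradient iteration, swapped in here for the Lagrange multiplier argument). The matrix, the starting vector, the step size, and the function names are all assumptions chosen for the example.

```python
import math


def quadratic_form(a, u):
    """Q(u) = sum over j, k of a[j][k] * u[j] * u[k] for a symmetric matrix a."""
    n = len(u)
    return sum(a[j][k] * u[j] * u[k] for j in range(n) for k in range(n))


def normalize(u):
    """Project a nonzero vector back onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u]


def extremize_on_sphere(a, u0, sign=-1.0, step=0.1, iterations=2000):
    """Projected gradient iteration for Q on the unit sphere.

    sign=-1 descends toward the smallest eigenvalue, sign=+1 ascends
    toward the largest one. Returns (Q(u), u).
    """
    u = normalize(u0)
    n = len(u)
    for _ in range(iterations):
        grad = [2.0 * sum(a[j][k] * u[k] for k in range(n)) for j in range(n)]
        u = normalize([u[j] + sign * step * grad[j] for j in range(n)])
    return quadratic_form(a, u), u


A = [[2.0, 1.0], [1.0, 2.0]]  # symmetric; its eigenvalues are 1 and 3
lam_min, u_min = extremize_on_sphere(A, [1.0, 0.3], sign=-1.0)
lam_max, u_max = extremize_on_sphere(A, [1.0, 0.3], sign=+1.0)
print(lam_min, u_min)
print(lam_max, u_max)
```

At convergence the extremal value of Q is the eigenvalue and the point where it is attained is the eigenvector, as stated above; for this matrix the minimizer is proportional to (1, −1) and the maximizer to (1, 1).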

In the setting of integral or partial differential
equations, a "recursive variational method" has
been proposed to determine all the eigenvalues
$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ and corresponding eigenvectors
$u^1, \ldots, u^n$ of Q or, in modern terms, of the
associated symmetric matrix $A = (a_{jk})$:


\[ \lambda_1 = \min_{\|u\| = 1} Q(u) \ \ (= Q(u^1)), \qquad \lambda_j = \min_{\|u\| = 1,\ (u, u^k) = 0,\ 1 \le k \le j-1} Q(u) \ \ (= Q(u^j)) \quad (j = 2, \ldots, n) \]
Further considerations have led to a nonrecursive
minimum-maximum principle:


\[ \lambda_j = \min_{\{X^j \subset \mathbb{R}^n \,:\, \dim X^j = j\}} \ \max_{\{u \in X^j \,:\, \|u\| = 1\}} Q(u) \qquad (1 \le j \le n) \]


and to a dual maximum-minimum principle
(Weyl):


\[ \lambda_j = \max_{\{v^1, \ldots, v^{j-1}\} \subset \mathbb{R}^n} \ \min_{\{\|u\| = 1,\ (u, v^k) = 0,\ 1 \le k \le j-1\}} Q(u) \qquad (1 \le j \le n) \]
These principles have been widely used in various 
existence and approximation questions of mathema- 
tical physics, and extensions have been made to the 
abstract setting of symmetric bilinear forms in 
Hilbert spaces. 

Around 1930, Ljusternik and Schnirelman
extended this theory beyond the frame of quadratic
forms, replacing Q by a differentiable real-valued
function f and the unit sphere by a finite-dimensional
compact differentiable manifold M.
Their aim was to obtain the "critical points"
of f on M, that is, the points $u \in M$ where the
differential $f'(u)$ of f at u (as a linear functional on
the tangent space $T_u M$ to M) is equal to zero, and
the corresponding critical values, that is, the values
of f at critical points. When M is a sphere, the
critical points are nontrivial solutions of the
equation


\[ f'(u) = \lambda u \qquad [1] \]


for some $\lambda \in \mathbb{R}$ (nonlinear eigenvalue problem).
Ljusternik and Schnirelman replaced the
dimension of the vector spaces occurring in
the minimum-maximum principle for eigenvalues
by the concept of "category" of a closed set A in a
topological space X. An early success of their
approach was the existence of three geometrically
distinct closed geodesics without self-intersections
on any compact surface of genus zero. In 1960,
their theory was extended to infinite-dimensional
manifolds and to measures of the "size" of a set
other than the category, allowing many
theoretical developments as well as various
applications to nonlinear differential equations.


Ljusternik-Schnirelman Category 


Let X be a topological space (e.g., a normed vector
space, or a differentiable manifold, or a metric
space), and A a closed subset of X. The category of
A in X, $\mathrm{cat}_X(A)$, is the least integer k such that A
can be written as $\bigcup_{j=1}^{k} A_j$ with each $A_j$ closed and
contractible in X, that is, continuously deformable
in X into a single point. If no such k exists, one sets
$\mathrm{cat}_X(A) = +\infty$. We write cat(X) for $\mathrm{cat}_X(X)$. For
example, if X is contractible (in itself), cat(X) = 1.
This is the case for any normed space X. For the
hypersphere, $\mathrm{cat}_{\mathbb{R}^n}(S^{n-1}) = 1$, but $\mathrm{cat}(S^{n-1}) = 2$.

The Ljusternik-Schnirelman category satisfies the
following properties, which are not too difficult to
prove. If $A, B \subset X$ are closed,


1. $\mathrm{cat}_X(A) = 0$ if and only if $A = \emptyset$;

2. if $A \subset B$, $\mathrm{cat}_X(A) \le \mathrm{cat}_X(B)$;

3. $\mathrm{cat}_X(A \cup B) \le \mathrm{cat}_X(A) + \mathrm{cat}_X(B)$;

4. if $\eta : [0, 1] \times X \to X$ is a continuous deformation
of X ($\eta(0, A) = A$), $\mathrm{cat}_X(A) \le \mathrm{cat}_X(\eta(1, A))$; and

5. if X is a finite-dimensional manifold and $A \subset X$
is compact, there is a neighborhood B of A such
that $\mathrm{cat}_X(B) = \mathrm{cat}_X(A)$.


Computing or even estimating the category of a
given set is in general difficult, requiring techniques
of algebraic topology. In particular, one can show
that, for the n-torus $T^n = S^1 \times S^1 \times \cdots \times S^1$ (n times),
$\mathrm{cat}(T^n) = n + 1$, and for the n-dimensional projective
space $P^n = S^n/\mathbb{Z}_2$ obtained by identifying the antipodal
points of $S^n$, $\mathrm{cat}(P^n) = n + 1$. It is clear that a
set of category p must contain at least p points. If X
is connected, any compact subset of category p + 1
has (topological) dimension larger than or equal to p.


Ljusternik-Schnirelman Minimax Method 


The Ljusternik-Schnirelman category of M provides a
lower bound for the number of critical points of a
smooth function f on suitable finite-dimensional
manifolds M. Namely, if M is a compact Riemannian
$C^2$-manifold without boundary, any $f \in C^2(M, \mathbb{R})$
has at least cat(M) distinct critical points, with
critical values


\[ c_k = \inf_{A \in \mathcal{A}_k} \sup_{u \in A} f(u) \qquad [2] \]


where


\[ \mathcal{A}_k = \{A \subset M : A \text{ closed},\ \mathrm{cat}_M(A) \ge k\} \qquad (1 \le k \le \mathrm{cat}(M)) \qquad [3] \]


A fundamental technique in the proof is a deformation
lemma along the trajectories of the gradient system
associated to f (method of steepest descent). If $\nabla f$
denotes the gradient of f in the Riemannian structure
of M, the Cauchy problem for the gradient system


\[ \frac{d\eta}{dt} = -\nabla f(\eta), \qquad \eta(0) = u \qquad [4] \]


has a unique globally defined continuous solution
$\eta(t, u)$, which is such that, for $t_1 \le t_2$,


\[ f(\eta(t_1, u)) - f(\eta(t_2, u)) = \int_{t_1}^{t_2} \|\nabla f(\eta(t, u))\|^2 \, dt \qquad [5] \]
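
The energy identity along the steepest-descent flow can be observed numerically. The sketch below is only an illustration under stated assumptions: it takes M to be the circle with angle coordinate θ, picks the sample function f(θ) = cos 2θ, and replaces the exact flow by an explicit Euler scheme, so the two sides of the identity agree only up to the discretization error.

```python
import math


def f(theta):
    """Sample smooth function on the circle M = S^1 (angle coordinate)."""
    return math.cos(2.0 * theta)


def grad_f(theta):
    """Riemannian gradient on S^1 in the angle coordinate: df/dtheta."""
    return -2.0 * math.sin(2.0 * theta)


def steepest_descent(theta0, dt=1e-4, steps=30000):
    """Explicit Euler scheme for d(eta)/dt = -grad f(eta), eta(0) = theta0.

    Returns the final angle, the f-values along the path, and the
    accumulated integral of |grad f|^2 dt (left Riemann sum).
    """
    theta, values, dissipation = theta0, [f(theta0)], 0.0
    for _ in range(steps):
        g = grad_f(theta)
        dissipation += g * g * dt
        theta -= dt * g
        values.append(f(theta))
    return theta, values, dissipation


theta_end, values, dissipation = steepest_descent(0.3)
drop = values[0] - values[-1]
print(drop, dissipation)  # the two numbers should nearly agree
print(theta_end)          # close to pi/2, a minimum of cos(2*theta)
```

The values of f never increase along the discrete trajectory, and the trajectory settles near a critical point, which is the mechanism the deformation lemma exploits: away from critical points, sublevel sets can be pushed strictly downward.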


Notice that, by property (4) of the category, each
deformation by $\eta$ of a set in $\mathcal{A}_k$ remains in $\mathcal{A}_k$. For
$c \in \mathbb{R}$, define


\[ f^c := \{u \in M : f(u) \le c\}, \qquad K_c := \{u \in M : \nabla f(u) = 0,\ f(u) = c\} \]

From [5] it follows that given $c \in \mathbb{R}$ and an open
neighborhood $U_c$ of $K_c$, one has $\eta(1, f^{c+\varepsilon} \setminus U_c) \subset f^{c-\varepsilon}$
for all sufficiently small $\varepsilon > 0$. This implies that if
$c := c_j = c_{j+1} = \cdots = c_{j+q}$ for some $q \ge 0$, then
$\mathrm{cat}_M(K_c) \ge q + 1$. Assume, by contradiction, that
$\mathrm{cat}_M(K_c) \le q$, let $U_c$ be an open neighborhood of $K_c$
such that $\mathrm{cat}_M(\overline{U}_c) = \mathrm{cat}_M(K_c)$ ($U_c = \emptyset$ if $q = 0$), $\varepsilon > 0$
such that $\eta(1, f^{c+\varepsilon} \setminus U_c) \subset f^{c-\varepsilon}$, and $A \in \mathcal{A}_{j+q}$ such
that $\sup_A f \le c + \varepsilon$, that is, $A \subset f^{c+\varepsilon}$. Then


\[ \mathrm{cat}_M(\eta(1, A \setminus U_c)) \ge \mathrm{cat}_M(A \setminus U_c) \ge \mathrm{cat}_M(A) - \mathrm{cat}_M(\overline{U}_c) \ge j \qquad [6] \]


giving the contradiction $c \le \sup_{\eta(1, A \setminus U_c)} f \le c - \varepsilon$.
Notice that, for each j, $c_j = \inf\{c \in \mathbb{R} : \mathrm{cat}_M(f^c) \ge j\}$,
which shows that the $c_j$ are precisely those levels of f
where $\mathrm{cat}_M(f^c)$ changes. The presence of critical
values is detected by changes in the topology of the
sublevel sets $f^c$ when c varies, a common feature of
many techniques for finding critical points of
functions.

A direct consequence is that for each even
$f \in C^2(\mathbb{R}^n, \mathbb{R})$, system [1] has at least n pairs of
solutions $(u, -u)$ with $\|u\| = 1$. Indeed, the solutions
of [1] are the critical points of f on $S^{n-1}$. As f
takes the same values at antipodal points, it is well
defined on the projective space $P^{n-1}$, and
$\mathrm{cat}(P^{n-1}) = n$.
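
For n = 2 the lower bound on antipodal pairs can be checked by brute force. The sketch below restricts a sample even function to the unit circle and counts the sign changes of its angular derivative; the sample function, the grid size, and the finite-difference step are illustrative assumptions, and the scan is a numerical check on one example, not a proof.

```python
import math


def count_critical_points_on_circle(f, samples=3600, h=1e-6):
    """Count sign changes of d/dtheta f(cos theta, sin theta) over one turn."""

    def restricted(theta):
        return f(math.cos(theta), math.sin(theta))

    def derivative(theta):
        # Central finite difference of the restricted function.
        return (restricted(theta + h) - restricted(theta - h)) / (2.0 * h)

    # Half-step offset keeps grid points away from the symmetric angles.
    signs = [derivative(2.0 * math.pi * (i + 0.5) / samples) > 0.0
             for i in range(samples)]
    # signs[i - 1] wraps around at i = 0, so the count is cyclic.
    return sum(signs[i] != signs[i - 1] for i in range(samples))


def even_function(u1, u2):
    """A hypothetical even function on R^2: f(-u) = f(u)."""
    return u1 ** 2 + 2.0 * u2 ** 2 + 0.3 * u1 ** 4


points = count_critical_points_on_circle(even_function)
pairs = points // 2
print(points, pairs)
```

For this sample function the restricted derivative is sin θ cos θ (2 − 1.2 cos² θ), which vanishes only at θ = 0, π/2, π, and 3π/2, so the scan should report two antipodal pairs, matching the bound n = 2.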

The Ljusternik-Schnirelman theorem can be extended
to the $C^1$-situation. The category of M gives a lower
bound for the number of critical points of f on the closed
manifold M. If Crit(M) denotes the minimum of
the number of critical points of all $C^1$-functions on M,
so that $\mathrm{Crit}(M) \ge \mathrm{cat}(M)$, an interesting question is
to estimate the gap $\mathrm{Crit}(M) - \mathrm{cat}(M)$. For M closed and
connected, $\mathrm{Crit}(M) \le \dim(M) + 1$ (Takens). If
Crit(M) = 2, M is homeomorphic to a sphere, so that
the equality Crit(S) = cat(S) for homotopy spheres is
equivalent to Poincaré's conjecture! Manifolds with
Crit(M) = cat(M) + 1 are known, but not with
Crit(M) > cat(M) + 1.


Ljusternik-Schnirelman Theory 
in Infinite-Dimensional Manifolds 


The main difficulty in extending the results of the
previous section to functions defined on
infinite-dimensional manifolds lies in the lack of
compactness. J T Schwartz and Palais have shown that such
an extension is possible for functions f satisfying on
M a compactness property (allowing an
infinite-dimensional deformation lemma), now referred to as
the Palais-Smale condition: each sequence $(u_k)$ with
$(f(u_k))$ bounded and $\lim_{k \to \infty} \nabla f(u_k) = 0$ has a
convergent subsequence. Such a condition can be
localized at level c by replacing the boundedness of
$(f(u_k))$ by $\lim_{k \to \infty} f(u_k) = c$. The infinite-dimensional
extension of Ljusternik-Schnirelman's theorem goes
as follows: Let M be an infinite-dimensional Riemannian
(or even Finsler) connected complete manifold
of class $C^1$ without boundary. Any $f \in C^1(M, \mathbb{R})$
bounded from below and satisfying the Palais-Smale
condition has at least cat(M) distinct critical points.

A simple application can be given to the periodic
solutions of period T (T-periodic solutions) of
Lagrangian systems


\[ u'' + \nabla V(u) = h(t) \qquad [7] \]


where $V \in C^1(\mathbb{R}^n, \mathbb{R})$ is $2\pi$-periodic in each component
$u_j$ $(1 \le j \le n)$, and h is continuous, T-periodic, and
has mean value $\bar{h}$ equal to zero. By the least action
principle, the T-periodic solutions of [7] are the
critical points of the action functional


\[ f(u) = \int_0^T \left[ \frac{\|u'(t)\|^2}{2} - V(u(t)) + h(t) \cdot u(t) \right] dt \]


on the Hilbert space $H^1_T$ obtained by completion of
the space of T-periodic $C^1$ functions for the norm
associated with the inner product


\[ \langle u, v \rangle = \int_0^T u(t) \cdot v(t) \, dt + \int_0^T u'(t) \cdot v'(t) \, dt \]

It follows easily from the condition $\bar{h} = 0$ that f is bounded
from below and that $f(u + 2\pi e^j) = f(u)$ for all $u \in H^1_T$,
with $e^j$ the jth unit vector in $\mathbb{R}^n$ $(1 \le j \le n)$.
Consequently, we can see f as defined on the Riemannian
manifold $T^n \times \tilde{H}^1_T$, where $\tilde{H}^1_T = \{u \in H^1_T : \bar{u} = 0\}$. It is
easy to show that $\mathrm{cat}(T^n \times \tilde{H}^1_T) = \mathrm{cat}(T^n) = n + 1$ and
that f satisfies the Palais-Smale condition on $T^n \times \tilde{H}^1_T$.
Consequently, system [7] has at least n + 1 geometrically
distinct T-periodic solutions. The same result
holds for the more general systems


\[ Mu'' + Au + \nabla F(u) = h(t) \]


occurring in the theory of multipoint Josephson
junctions or in space discretizations of the
sine-Gordon equation. In particular, the classical
forced pendulum equation


\[ u'' + a \sin u = h(t) \]


has at least two geometrically distinct T-periodic
solutions when h is T-periodic and $\bar{h} = 0$, a result
first proved, in a different way, by Mawhin and
Willem.

Another way to study nonlinear eigenvalue problems
of the form


\[ f'(u) = \lambda g'(u) \]


in a Hilbert or a suitable reflexive Banach space X
is based upon a Rayleigh-Ritz approximation
through a sequence of finite-dimensional problems,
where the classical theory is applied. Conditions
upon $f, g \in C^1(X, \mathbb{R})$ are given, generalizing
those of Ljusternik and Schnirelman, which ensure the
existence of infinitely many solutions. Again, some
compactness is needed to justify the limit process;
it is expressed by assumptions upon f and g
too lengthy to be reproduced here. The following
application is exemplary. Let $\Omega \subset \mathbb{R}^N$ be a bounded
domain and $X = W_0^{1,p}(\Omega)$, $p > 1$, be the Sobolev
space of functions $u : \Omega \to \mathbb{R}$ obtained as the completion
of the smooth functions with compact support
in $\Omega$ for the norm $\|u\| = \left( \int_\Omega |\nabla u(x)|^p \, dx \right)^{1/p}$.
Define the functionals f and g on $W_0^{1,p}(\Omega)$ by


\[ f(u) = \int_\Omega |\nabla u(x)|^p \, dx, \qquad g(u) = \int_\Omega |u(x)|^p \, dx \]


The critical points of f on $\{u \in X : g(u) = 1\}$ correspond
to the nontrivial solutions of the Dirichlet
eigenvalue problem


\[ -\Delta_p u = \lambda |u|^{p-2} u \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega \qquad [8] \]


for the p-Laplacian operator $\Delta_p$ defined by


\[ \Delta_p u(x) := \nabla \cdot \left( |\nabla u(x)|^{p-2} \, \nabla u(x) \right) \]


which occurs in the modeling of various
problems in a porous medium. An eigenvalue is
any $\lambda \in \mathbb{R}$ such that problem [8] has a nontrivial
solution. The Ljusternik-Schnirelman technique
implies the existence of a sequence of eigenvalues
going to infinity, with the usual minimax characterization.
When N = 1, direct computations show that
this sequence gives all eigenvalues, but the problem
remains open for $N \ge 2$. The corresponding forced
problem


\[ -\Delta_p u - \lambda |u|^{p-2} u = h(x) \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega \]


is always solvable (although not uniquely) when $\lambda$ is
not an eigenvalue, but solvability conditions at the
higher eigenvalues (Fredholm alternative) remain
almost terra incognita.


Index Theories and Critical Points 
of Symmetric Functionals on 
a Banach Space 


Closely related to the Ljusternik-Schnirelman category
is the concept of index associated to the action of a
compact topological group G on a normed space X,
that is, to a continuous map $G \times X \to X$, $[g, u] \mapsto gu$,
such that $1 \cdot u = u$, $(gh)u = g(hu)$, and $u \mapsto gu$ is linear.
The action is isometric if $\|gu\| = \|u\|$, $A \subset X$
is invariant if $gA = A$ for all $g \in G$, $f : X \to \mathbb{R}$ is
invariant if $f \circ g = f$ for all $g \in G$, and $h : X \to X$
is equivariant if $g \circ h = h \circ g$ for each $g \in G$. Let
$\mathrm{Fix}\, G = \{u \in X : gu = u \text{ for all } g \in G\}$. The aim of an
index is to measure the size of invariant sets.
Explicitly, an index theory associates to each closed
invariant subset A of X a non-negative (possibly
infinite) integer G-ind(A), its G-index, such that

1. G-ind(A) = 0 if and only if $A = \emptyset$;

2. if $R : A \to B$ is equivariant and continuous,
G-ind(A) ≤ G-ind(B);

3. G-ind(A ∪ B) ≤ G-ind(A) + G-ind(B); and

4. if A is compact, there is a closed invariant
neighborhood U of A such that G-ind(U) =
G-ind(A).


A first example of index is Krasnosel'skii's genus
or $\mathbb{Z}_2$-index, which corresponds to the action
$0 \cdot u = u$, $1 \cdot u = -u$ of $G = \mathbb{Z}_2$. The invariant sets
are the ones symmetric with respect to the origin,
and $\mathbb{Z}_2$-ind(A) is defined by $\mathbb{Z}_2$-ind($\emptyset$) = 0 and, for
$A \ne \emptyset$, as the smallest integer k such that there
exists an odd $h \in C(A, \mathbb{R}^k \setminus \{0\})$. A consequence of
the Borsuk-Ulam theorem in algebraic topology is
that the boundary of any symmetric bounded open
neighborhood of the origin in $\mathbb{R}^n$ has $\mathbb{Z}_2$-index equal
to n. Furthermore, for a compact $A \subset \mathbb{R}^n \setminus \{0\}$ symmetric
with respect to the origin, and $\tilde{A} = A/\mathbb{Z}_2$ (A with antipodal
points identified), one has
$\mathbb{Z}_2$-ind(A) = $\mathrm{cat}_{(\mathbb{R}^n \setminus \{0\})/\mathbb{Z}_2}(\tilde{A})$.

A second example, the $S^1$-index, is important in
the study of periodic solutions of autonomous
Hamiltonian systems. $S^1$-ind($\emptyset$) = 0 and, for a nonempty
closed invariant $A \subset X$, $S^1$-ind(A) is defined
as the smallest integer k such that there exists a
positive integer n and $h \in C(A, \mathbb{C}^k \setminus \{0\})$
with $h \circ g = g^n \circ h$ for all $g \in S^1$. A Borsuk-Ulam-type
theorem for $S^1$-equivariant mappings
implies that if Z is a finite-dimensional invariant
subspace of X such that $\mathrm{Fix}\, S^1 \cap Z = \{0\}$ and D is
an open bounded invariant neighborhood of 0 in Z,
then $S^1$-ind($\partial D$) = (1/2) dim Z.

As the category of a Banach space X is equal to 1, the
classical Ljusternik-Schnirelman approach does not
provide any information about the multiplicity of
the unconstrained critical points of $f \in C^1(X, \mathbb{R})$. If f
is invariant under the action on X of a compact
group G and satisfies the Palais-Smale condition, a
Ljusternik-Schnirelman minimax method associated
to a G-index provides multiplicity results for
unconstrained critical points. Letting


\[ \mathcal{A}_j = \{A \subset X : A \text{ is compact, invariant, and } G\text{-ind}(A) \ge j\} \]


\[ c_j = \inf_{A \in \mathcal{A}_j} \sup_{A} f \qquad (j = 1, 2, \ldots) \]


one shows as in the classical Ljusternik-Schnirelman
theory that if $c := c_j = c_{j+1} = \cdots = c_{j+q}$ for some j
and some $q \ge 0$, then $G\text{-ind}(K_c) \ge q + 1$. The proof
uses an equivariant deformation lemma.


$\mathbb{Z}_2$- and $S^1$-Invariant Functionals


In the case of the $\mathbb{Z}_2$-action, the following multiplicity
result holds for possibly unbounded even $f \in C^1(X, \mathbb{R})$
satisfying the Palais-Smale condition and having the
mountain pass geometry: if $Y \cap \{u \in X : f(u) \ge 0\}$ is
bounded for each finite-dimensional subspace Y of X,
f(0) = 0, and $f(u) \ge \alpha > 0$ on $\partial B(r)$, then f has
infinitely many couples of critical points. As an
application, the semilinear Dirichlet problem


\[ \Delta u + \lambda u + |u|^{p-1} u = 0 \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega \qquad [9] \]


has infinitely many solutions when Q C R^ is 
bounded, 1 < p < (N + 2)/(N — 2), and A < A), the 
smallest eigenvalue of —A with Dirichlet boundary 
conditions. The corresponding energy functional, 
defined on Wa (Q) by 


fa) = f 


satisfies the Palais-Smale condition. This condition 
fails in the critical case where p — (N 4- 2)/(N — 2), at 
least at some levels c, and this lack of compactness 
creates both difficulties and interesting phenomena. 
This situation, which occurs in many important 
problems of geometry and physics (harmonic maps, 
Yang-Mills connections, Yamabe problem, equations 
of constant mean curvature, closed geodesics pro- 
blems, etc.), reveals indeed, in physical terms, “phase 
transitions" or “particle creations" at the levels where 
the Palais-Smale condition fails. In the special case of 
eqn [9] with p — (N + 2)/(N — 2), if N > 4, a positive 
solution exists when A € [0, 4i], and, if N=3, the 
same is true for A € [A*, 41] and some A* € [0,4], 
with the optimal value A* = A; /4 when €? is a ball. For 
N > 4, [9] has at least cat(Q) nontrivial solutions 
when A € [0, A**] for some A** < A4. Such a lack of 
compactness, which can also occur for eqn [9] in R^ 
(nonlinear Schródinger equation), is associated to the 
invariance of f with respect to the action of some 
noncompact group, coming, for example, from scale or 
gauge invariance. P L Lions’ concentration-compact- 
ness method is useful to analyze those problems. 

The following multiplicity theorem holds for an
$S^1$-invariant $f \in C^1(X, \mathbb{R})$ satisfying the Palais-Smale
condition. Let Fix($S^1$) = $\{0\}$ and $Z$ be a closed
invariant vector subspace of $X$ of positive finite
dimension. If $f$ is bounded from below, $f(u) \leq c < 0$
whenever $u \in Z$ and $\|u\| = r$, and $f(u) \geq 0$ for $u \in$
Fix($S^1$) $\cap\, (f')^{-1}(0)$, then $f$ has at least $(\dim Z)/2$
distinct $S^1$-orbits of critical points with critical
values less than or equal to $c$. This abstract theorem
provides multiplicity results for the periodic solutions
(closed orbits) of autonomous Hamiltonian
systems in $\mathbb{R}^{2n}$

$$J\dot{u} + \nabla H(u) = 0 \qquad [10]$$


where $J$ is the symplectic matrix, $H \in C^1(\mathbb{R}^{2n}, \mathbb{R})$,
and $c \in \mathbb{R}$ is such that $\nabla H(u) \neq 0$ for $u \in H^{-1}(c)$. If
$H^{-1}(c)$ bounds a strictly convex compact set $C$ such
that $B[r] \subset C \subset B[R]$ for some $0 < r < R < \sqrt{2}\,r$,
then [10] has at least $n$ closed orbits on $H^{-1}(c)$. The
problem is reduced to finding the critical points of a
suitable dual action functional acting on some space
$X$ of $2\pi$-periodic functions having mean value zero.
The $S^1$-action on $X$ is defined by time translations
$(\gamma, u) \mapsto u_\tau = u(\cdot + \tau)$ for all $\gamma = e^{i\tau} \in S^1$. One takes,
in the abstract result above, $Z = \{(\cos t)\,e + (\sin t)\,Je : e \in \mathbb{R}^{2n}\}$,
so that $\dim Z = 2n$. The complete
proof is quite involved, and, although some
improvements of the Ekeland-Lasry conditions have
been obtained, it remains an open problem whether
some pinching condition of the energy surface
between spheres or ellipsoids is necessary.


Some Extensions 


When dealing with unbounded functionals, it may
be convenient to replace the Ljusternik-Schnirelman
category cat$_X(A)$ by a relative category cat$_{X,Y}(A)$
with respect to a closed subset $Y$ where, in the
covering of $A$ occurring in the classical definition, a
set $A_0 \supset Y$ is added, which is continuously deformable
in $X$ into a subset of $Y$ in such a way that
points of $Y$ remain in $Y$ during the deformation.
Clearly cat$_{X,\emptyset}(A)$ = cat$_X(A)$. This allows us to prove,
under some restrictions on the coefficients and the
period, the existence of at least four periodic
solutions for the double pendulum with periodic
forcing of mean value zero. The classical Ljusternik-Schnirelman
category gives at least three periodic
solutions without restrictions, and whether those
restrictions are necessary to obtain four solutions is open.

The relative category also gives a simpler proof of
Conley-Zehnder's version of the Arnol'd conjecture
(the existence of at least $2n + 1$ geometrically distinct
1-periodic solutions for the Hamiltonian system

$$J\dot{u} + \nabla H(t, u) = 0$$

with $H$ 1-periodic in each variable), under minimal
regularity assumptions upon $H$. The general conjecture,
namely that the minimum number of fixed
points of all Hamiltonian symplectomorphisms of a
closed symplectic manifold $M$ is at least the
minimum number of critical points of smooth
functions $f$ on $M$, remains open.

In another direction, a Ljusternik-Schnirelman 
theory for functionals defined on closed convex sets 
of a Banach space has been developed, which is 
especially well suited for the study of the Plateau
problem for minimal surfaces, for surfaces of 
constant mean curvature, as well as for variational 
inequalities. 


See also: Bifurcations of Periodic Orbits; Compact 
Groups and Their Representations; Floer Homology; 
Ginzburg-Landau Equation; Inequalities in Sobolev Spaces; 
Minimal Submanifolds; Minimax Principle in the Calculus 
of Variations; Saddle Point Problems; Sine-Gordon 
Equation; Spectral Theory for Linear Operators.
Introduction


Discrete Schrödinger operators with quasiperiodic
potentials are operators acting on $\ell^2(\mathbb{Z}^d)$ and defined
by

$$H_\lambda = \Delta + \lambda V \qquad [1]$$

where $\Delta$ is the lattice tight-binding Laplacian

$$\Delta(n, m) = \begin{cases} 1, & \mathrm{dist}(n, m) = 1 \\ 0, & \text{otherwise} \end{cases}$$

and $V(n, m) = V_n\,\delta(n, m)$ is a potential given by
$V_n = f(T_1^{n_1} \cdots T_d^{n_d}\theta)$, $\theta \in \mathbb{T}^b$, where $T_i\theta = \theta + \omega_i$, and
$\omega$ is an incommensurate vector. In certain cases $\Delta$
may also be replaced by a long-range Laplacian
$L(n, m) = L(n - m)$ with $L(n) \to 0$ sufficiently fast.
The questions of interest in the study of quasiperiodic
and other ergodic operators are the nature and
structure of the spectrum, behavior of the eigenfunctions,
and the quantum dynamics: properties of the
time evolution $\Psi_t = e^{itH}\Psi_0$ of an initially localized
wave packet $\Psi_0$.

Of particular importance is the phenomenon of
Anderson localization, which usually refers to
the property of having pure point spectrum with
exponentially decaying eigenfunctions. A stronger
property of dynamical localization (see the section
"Dynamical localization") indicates insulator
behavior, while ballistic transport, which for $d = 1$
follows from absolutely continuous spectrum,
indicates metallic behavior.

Operators with ergodic potentials always have
spectra (and pure point spectra, understood as closures
of the set of eigenvalues) constant for a.e. realization of
the potential. The individual eigenvalues, however,
depend very sensitively on the phase. Moreover, the
pure point spectrum of operators with ergodic
potentials never contains isolated eigenvalues, so pure
point spectrum in such models is dense in a certain
closed set. An easy example of an operator with dense
pure point spectrum is $H_\infty$, which is operator [1] with
$\lambda^{-1} = 0$, or pure diagonal. It has a complete set of
eigenfunctions, characteristic functions of lattice
points, with eigenvalues $V_j$. $H_\lambda$ may be viewed as a
perturbation of $H_\infty$ for small $\lambda^{-1}$. However, since the $V_j$
are dense, small denominators $(V_i - V_j)^{-1}$ make any
perturbation theory difficult, for example, requiring
intricate KAM-type schemes.
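The small-denominator problem can be made concrete with a short computation. The sketch below assumes $d = b = 1$, the sampling function $f(\theta) = \cos(2\pi\theta)$, and the golden-mean frequency; these choices and the name `smallest_gap` are illustrative assumptions, not taken from the text. It finds the smallest difference $|V_i - V_j|$ among the first $N$ diagonal entries.

```python
import math

GOLDEN_MEAN = (math.sqrt(5.0) - 1.0) / 2.0  # an irrational frequency (assumed choice)

def smallest_gap(n_sites, omega, theta=0.0):
    """Smallest |V_i - V_j|, i != j, among V_j = cos(2 pi (theta + j omega)), 0 <= j < n_sites."""
    values = sorted(
        math.cos(2.0 * math.pi * (theta + j * omega)) for j in range(n_sites)
    )
    # After sorting, the closest pair is always a pair of neighbours.
    return min(b - a for a, b in zip(values, values[1:]))

if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(n, smallest_gap(n, GOLDEN_MEAN))
```

As $N$ grows the smallest gap shrinks, so the factors $(V_i - V_j)^{-1}$ that appear in naive perturbation theory become arbitrarily large.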

Various methods developed for the Anderson
model (where the $V_n$ are i.i.d. random variables), such as
Fröhlich-Spencer multiscale analysis and its enhancements, or
the Aizenman-Molchanov method, do not work for
quasiperiodic potentials as, among other reasons,
quasiperiodicity does not allow for nice perturbations.
The situation here is more difficult and the
theory is far less developed than for the random
case. With a few exceptions, the results are confined
to the one-dimensional setting, and also the case of
one frequency ($b = 1$) has been much better understood
than that of higher frequencies.

One might expect that $H_\lambda$ with $\lambda$ small can be
treated as a perturbation of $H_0 = \Delta$, and therefore
has absolutely continuous spectrum. This is not the
case, though, for random potentials in $d = 1$, where
Anderson localization holds for all $\lambda$. The same is
expected for random potentials in $d = 2$ (but not
higher). Moreover, in the one-dimensional case, there
is strong evidence (numerical, analytical, as well as
rigorous) that even models with very mild stochasticity
in the underlying dynamics (and sufficiently
nice sampling functions) have point spectrum for
all values of $\lambda$, as in the random case (e.g.,
$V_n = \lambda f(n^\sigma \alpha + \theta)$, for any $\sigma > 1$). At the same time,
for quasiperiodic potentials, one can in many cases
show absolutely continuous spectrum for $\lambda$ small
as well as pure point spectrum for $\lambda$ large (see
below), and therefore there is a metal-insulator
transition in the coupling constant. It is an
interesting question whether quasiperiodic potentials
are the only ones with a metal-insulator
transition in 1D.


Perturbative and Nonperturbative 
Approaches 


It is probably fair to say that much of the theory of
quasiperiodic operators was first developed
around the almost-Mathieu operator, which is

$$H_{\lambda,\omega,\theta} = \Delta + \lambda f(\theta + n\omega) \qquad [2]$$

acting on $\ell^2(\mathbb{Z})$, with $f: \mathbb{T} \to \mathbb{R}$, $f(\theta) = \cos(2\pi\theta)$.
Several KAM-type approaches, starting with the
pioneering work of Dinaburg-Sinai in 1975, were
developed in the 1980s and 1990s for this or similar
models in both large and small coupling regimes. Of
those, the most robust and detailed is the reducibility
result of Eliasson (1998) that settled the case
of small couplings for sufficiently regular potentials.

The common feature of those perturbative 
approaches is that, besides all of them being rather 


intricate multistep procedures, they rely extensively 
on eigenvalue and eigenfunction parametrization 
and perturbation arguments. 

The common feature of the perturbative results in
the quasiperiodic setting is that they typically provide
no explicit estimates on how large (or small) the
parameter $\lambda$ should be, and, more importantly, $\lambda$
clearly depends on $\omega$ at least through the constants in
the Diophantine characterization of $\omega$.

In contrast, the nonperturbative results allow
effective (in many cases even optimal) and, most
importantly, independent of $\omega$, estimates on $\lambda$. The
latter property (uniform in $\omega$ estimates on $\lambda$) has
often been taken as a definition of a nonperturbative result.

Recently developed nonperturbative methods are 
also quite different from the perturbative ones in that 
they do not employ multiscale schemes: usually only 
a few (from one to three) sufficiently large scales are 
involved, do not use the eigenvalue parametrization, 
and rely instead on direct estimates of the Green's 
function. They are also significantly less involved, 
technically. One may think that in these latter 
respects they resemble the Aizenman-Molchanov 
method for random localization. It is, however, a 
superficial similarity, as, on the technical side, they 
are still closer to and do borrow certain ideas from 
the multiscale analysis proofs of localization. 


Lyapunov Exponents 


Here for simplicity we consider the quasiperiodic 
case, although the definition of the Lyapunov 
exponents and some of the mentioned facts apply 
more generally to the one-dimensional ergodic case. 

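Let $d = 1$. For an energy $E \in \mathbb{R}$ the Lyapunov
exponent $\gamma(E)$ is defined as

$$\gamma(E) = \lim_{n \to \infty} \frac{\int_{\mathbb{T}^b} \ln \|M_n(\theta, E)\|\, d\theta}{n} \qquad [3]$$

where

$$M_k(\theta, E) = \prod_{n=k-1}^{0} \begin{pmatrix} E - \lambda f(\theta + n\omega) & -1 \\ 1 & 0 \end{pmatrix}$$

is the $k$-step transfer matrix for the eigenvalue
equation $H\Psi = E\Psi$.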
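The definition can be explored numerically. The following is a minimal sketch, assuming the almost-Mathieu sampling function $f(\theta) = \cos(2\pi\theta)$, a single phase $\theta$ in place of the average over phases in [3], and a finite number of steps. The name `lyapunov_estimate` and all parameter values are illustrative. The running product is renormalized at every step to avoid overflow.

```python
import math

GOLDEN_MEAN = (math.sqrt(5.0) - 1.0) / 2.0  # assumed irrational frequency

def lyapunov_estimate(lam, omega, theta, energy, n_steps):
    """Finite-n estimate (1/n) ln ||M_n(theta, E)|| for f = cos(2 pi .).

    Uses the Frobenius norm; any matrix norm gives the same limit.
    """
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # current (renormalized) product, row-major
    log_norm = 0.0
    for n in range(n_steps):
        v = energy - lam * math.cos(2.0 * math.pi * (theta + n * omega))
        # Left-multiply by the one-step matrix [[v, -1], [1, 0]].
        a, b, c, d = v * a - c, v * b - d, a, b
        scale = math.sqrt(a * a + b * b + c * c + d * d)
        a, b, c, d = a / scale, b / scale, c / scale, d / scale
        log_norm += math.log(scale)
    return log_norm / n_steps

if __name__ == "__main__":
    for lam in (1.0, 4.0):
        print(lam, lyapunov_estimate(lam, GOLDEN_MEAN, 0.1, 0.0, 20000))
```

For the almost-Mathieu operator the exponent on the spectrum is known to equal $\max(0, \ln(\lambda/2))$, so the estimate at $\lambda = 4$, $E = 0$ should be close to $\ln 2$, while at $\lambda = 1$ it should be close to zero.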

In the physics literature, positivity of the Lyapunov
exponent is often taken as an implicit definition of
localization, as the Lyapunov exponent is often called
the inverse localization length. Thus, we will be
interested in the regime where Lyapunov exponents
are positive for all energies in a certain interval
intersecting the spectrum. If this condition holds for
all $E \in \mathbb{R}$, there is no absolutely continuous
component in the spectrum for all $\theta$. Positivity of
Lyapunov exponents, however, does not imply
localization or exponential decay of eigenfunctions
(in particular, neither for Liouville $\omega$ nor for
resonant $\theta \in \mathbb{T}^b$).

Nonperturbative methods, at least in their original 
form, stem to a large extent from estimates invol- 
ving the Lyapunov exponents and exploiting their 
positivity. 

The general theme of the results on positivity of
$\gamma(E)$, as suggested by perturbation arguments, is that
the Lyapunov exponents are positive for large $\lambda$.
This subject has had a rich history. The strongest
result in this general context to date is the
following theorem (Bourgain 2003):


Theorem 1 Let $f$ be a nonconstant real-analytic
function on $\mathbb{T}^b$, and $H$ given by [1]. Then, for
$\lambda > \lambda(f)$, we have $\gamma(E) > (1/2) \ln \lambda$ for all $E$ and all
incommensurate vectors $\omega$.


Corollaries of Positive Lyapunov Exponents 


The almost-Mathieu operator. On the one hand, the
almost-Mathieu operator, while simple looking,
seems to represent most of the nontrivial properties
expected to be encountered in the more general case.
On the other hand, it has a very special feature: the
duality (essentially a Fourier) transform maps $H_\lambda$ to
$H_{4/\lambda}$; hence $\lambda = 2$ is the self-dual point. Aubry and
André conjectured in 1980 that for this model, for
irrational $\omega$, a sharp metal-insulator transition in the
coupling constant $\lambda$ occurs at the critical value of
coupling $\lambda = 2$: the spectrum is pure point for $\lambda > 2$
and purely absolutely continuous for $\lambda < 2$. This
conjecture was modified based on later discoveries
of singular-continuous spectrum in this context for
frequencies or phases with certain arithmetic properties.
The modified conjecture stated pure point
spectrum for Diophantine $\omega$ and a.e. $\theta$ for $\lambda > 2$
and pure absolutely continuous spectrum for $\lambda < 2$
for all $\omega, \theta$. The spectrum at $\lambda = 2$ is singular
continuous for all $\omega$ and a.e. $\theta$ (this follows from a
combination of works by Gordon, Jitomirskaya,
Last, Simon, Avila, and Krikorian).

As with the KAM methods, the almost-Mathieu
operator was the first model where the positivity of
Lyapunov exponents was effectively exploited
(Jitomirskaya 1999):


Theorem 2 Suppose $\omega$ is Diophantine and $\gamma(E, \omega) > 0$
for all $E \in [E_1, E_2]$. Then the almost-Mathieu operator
has Anderson localization in $[E_1, E_2]$ for a.e. $\theta$.


The condition on $\theta$ can be made explicit (arithmetic)
and close to optimal. This, combined with the
mentioned results on the Lyapunov exponents,
critical value $\lambda = 2$, and duality, gives the following
description in the Diophantine case:


Corollary 3 The almost-Mathieu operator $H_{\lambda,\omega,\theta}$
has

1° for $\lambda > 2$, Diophantine $\omega \in \mathbb{R}$ and almost every
$\theta \in \mathbb{R}$, only pure point spectrum with exponentially
decaying eigenfunctions.

2° for $\lambda = 2$, all $\omega \notin \mathbb{Q}$, and a.e. $\theta \in \mathbb{R}$, purely
singular-continuous spectrum.

3° for $\lambda < 2$, Diophantine $\omega \in \mathbb{R}$ and a.e. $\theta \in \mathbb{R}$,
purely absolutely continuous spectrum.


Precise arithmetic descriptions of $\omega, \theta$ are available.
Thus, the Aubry-André conjecture is settled at
least for almost all $\omega, \theta$. One should mention,
however, that while 1° can be made optimal by
existing methods, both 2° and 3° are expected to
hold for all $\theta$ and all $\omega \notin \mathbb{Q}$, and such an extension
remains a challenging problem (see Simon (2000)).
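The phase diagram of Corollary 3 and the duality $\lambda \mapsto 4/\lambda$ can be summarized in a few lines. This is only a bookkeeping sketch: it presumes the hypotheses of the corollary (Diophantine $\omega$, a.e. $\theta$), and the helper names `dual_coupling` and `spectral_type` are hypothetical.

```python
def dual_coupling(lam):
    """Coupling of the dual almost-Mathieu operator: H_lambda maps to H_{4/lambda}."""
    if lam <= 0:
        raise ValueError("coupling must be positive")
    return 4.0 / lam

def spectral_type(lam):
    """Spectral type stated in Corollary 3 (Diophantine frequency, a.e. phase assumed)."""
    if lam <= 0:
        raise ValueError("coupling must be positive")
    if lam > 2:
        return "pure point"
    if lam == 2:
        return "singular continuous"
    return "absolutely continuous"

if __name__ == "__main__":
    for lam in (0.5, 2.0, 3.0):
        print(lam, spectral_type(lam), "| dual:", spectral_type(dual_coupling(lam)))
```

Duality exchanges the localized regime $\lambda > 2$ with the absolutely continuous regime $\lambda < 2$ and fixes the critical coupling $\lambda = 2$.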

The method in the above work, while so far the 
only nonperturbative method available allowing 
precise arithmetic conditions, uses some specific 
properties of the cosine. It extends to certain other 
situations, for example, quasiperiodic operators 
arising from Bloch electrons in a perpendicular 
magnetic field, where the lattice is triangular or 
has next-nearest-neighbor interactions. However, it 
does not extend easily to the multifrequency or even 
general analytic potentials. A much more robust 
method was developed by Bourgain-Goldstein
(2000), which allowed them to extend (a measure- 
theoretic version of) the above localization result to 
the general real analytic as well as the multi- 
frequency case. Note that essentially no results 
were previously available for the multifrequency 
case, even perturbative. 


Theorem 4 Let $f$ be nonconstant real analytic on
$\mathbb{T}^b$ and $H$ given by [2]. Suppose $\gamma(E, \omega) > 0$ for
all $E \in [E_1, E_2]$ and a.e. $\omega \in \mathbb{T}^b$. Then for any $\theta$,
$H$ has Anderson localization in $[E_1, E_2]$ for a.e. $\omega$.


Combining this with Theorem 1, Bourgain (2003)
obtained that for $\lambda > \lambda(f)$, $H$ as above satisfies
Anderson localization for a.e. $\omega$. Those results were
recently extended by S Klein to potentials belonging
to certain Gevrey classes. One very important
ingredient of this method is the theory of semialgebraic
sets that allows one to obtain polynomial
algebraic complexity bounds for certain "exceptional"
sets. Combined with measure estimates
coming from the large deviation analysis of
$(1/n) \ln \|M_n(\theta)\|$ (using subharmonic function theory
and involving approximate Lyapunov exponents),
this theory provides necessary information on the
geometric structure of those exceptional sets. Such
algebraic complexity bounds also exist for the
almost-Mathieu operator and are actually sharp,
albeit trivial, in this case due to the specific nature
of the cosine.

Further corollaries of positive Lyapunov exponents
for analytic sampling functions $f$ and $b = 1$
include Hölder regularity of the integrated density
of states, zero-dimensionality of spectral measures
for all $\omega, \theta$, almost Lipschitz continuity of spectral
gaps, continuity of measure of the spectrum (in
frequency), and vanishing of lower transport
exponents for all $\omega, \theta$. Some weaker statements are
available for $b > 1$ or $f$ belonging to certain Gevrey
classes.


Without Lyapunov Exponents 


While having led to significant advances, Lyapunov 
exponents have obvious limitations, as any method 
based on them is restricted to one-dimensional 
nearest-neighbor Laplacians. It turns out that the 
above methods can be extended to obtain nonper- 
turbative results in certain quasi-one-dimensional 
situations where Lyapunov exponents do not exist. 
For example, nonperturbative localization results 
extend to the strip (of arbitrary dimension). 

The following nonperturbative theorem deals with 
the case of small coupling: 


Theorem 5 Let $H$ be an operator [2], where $f$ is
real analytic on $\mathbb{T}$ and $\omega$ is Diophantine. Then, for
$\lambda < \lambda(f)$, $H$ has purely absolutely continuous spectrum
for a.e. $\theta$.


We note that an analog of this theorem does not
hold in the multifrequency case (see next section).
Results of this type are obtained by a method
(developed by Bourgain and Jitomirskaya in
2000-02) that studies large deviations for
quantities of the form $(1/n) \ln |\det(H - E)_\Lambda|$ and
path-determinant expansion for the matrix elements
of the resolvent. Those techniques apply also to
certain other situations with long-range Laplacians,
for example, the kicked-rotor model. Theorem 5 is a
result on nonperturbative localization in disguise, as
it was obtained using duality from a localization
theorem for a dual model which has in general a
long-range Laplacian and a cosine potential, and
was in turn obtained by an extension of the method
of Jitomirskaya (1999). A certain measure-theoretic
version of it allowing nonlocal Laplacians but
leading only to continuous spectrum is also available
(see Bourgain (2004)).


Multidimensional Case: d > 1 


As mentioned above, there are very few results in
the multidimensional lattice case ($d > 1$). Essentially,
the only result that existed before the recent
developments was a perturbative theorem, an
extension by Chulaevsky-Dinaburg of Sinai's
method to the case of operator [1] on $\ell^2(\mathbb{Z}^d)$ with
$V_n = \lambda f(n \cdot \omega)$, $\omega \in \mathbb{R}^d$, where $f$ is a cos-type function
on $\mathbb{T}$. This also holds nonperturbatively for any real-analytic
$f$ (see Bourgain (2004)). Note that since
$b = 1$, this avoids most serious difficulties and is
therefore significantly simpler than the general
multidimensional case. We therefore have:


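Theorem 6 For any $\epsilon > 0$ there is $\lambda(f, \epsilon)$, and, for
$\lambda > \lambda(f, \epsilon)$, $\Omega(\lambda, f) \subset \mathbb{T}^d$ with mes($\Omega$) $< \epsilon$, so that for
$\omega \notin \Omega$, operator [1] with $V_n$ as above has Anderson
localization.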


This should be contrasted with the following
theorem of Bourgain:


Theorem 7 Let $d = 2$ and $f(\theta) = \cos 2\pi\theta$ in $H = H_\omega$
defined as above. Then for any $\lambda$ the measure of $\omega$ s.t.
$H$ has some continuous spectrum is positive.


Therefore, for large $\lambda$ there will be both $\omega$ with
complete localization as well as those with at least
some continuous spectrum. This shows that nonperturbative
results do not hold in general in the
multidimensional case! Perturbative results, however,
have been obtained; see next section.

A similar (in fact, dual) situation is observed for the
one-dimensional multifrequency ($d = 1$; $b > 1$) case
at small disorder. One has, by duality:


Theorem 8 Let $H$ be given by [2] with $\theta, \omega \in \mathbb{T}^b$
and $f$ real analytic on $\mathbb{T}^b$. Then for any $\epsilon > 0$ there is
$\lambda(f, \epsilon)$ s.t. for $\lambda < \lambda(f, \epsilon)$ there is $\Omega(\lambda, f) \subset \mathbb{T}^b$ with
mes($\Omega$) $< \epsilon$ so that for $\omega \notin \Omega$, $H$ has purely absolutely
continuous spectrum.


And also


Theorem 9 Let $d = 1$, $b = 2$ and $f$ be a trigonometric
polynomial on $\mathbb{T}^2$ with a nondegenerate maximum.
Then for any $\lambda$, the measure of $\omega$ s.t. $H_\omega$ has some point
spectrum, dense in a set of positive measure, is positive.


Therefore, unlike the $b = 1$ case (see Theorem 5),
nonperturbative results do not hold for absolutely
continuous spectrum at small disorder.


Perturbative Localization by 
Nonperturbative Methods 


While the above demonstrates the limitations of 
the nonperturbative results, the nonperturbative 


methods have been applied to significantly simplify 
the proofs and obtain new perturbative results that 
previously had been completely beyond reach. 

Many such applications, that are outside the scope of 
this article, are described in Bourgain (2004). In 
particular, new results on the construction of quasiper- 
iodic solutions in Melnikov problems and nonlinear 
PDEs, obtained by using certain ideas developed for 
nonperturbative quasiperiodic localization (e.g., the 
theory of semialgebraic sets), are presented there. 
Other results in this group contain localization for the 
skew-shift model by Bourgain-Goldstein-Schlag, almost 
periodicity for the quantum kicked-rotor model by 
Bourgain and Bourgain-Jitomirskaya, and localization
for potentials in higher Gevrey classes by S Klein. 

The main goal in a nonperturbative method is to 
obtain exponential off-diagonal decay for the matrix 
elements of the Green's function of box-restricted 
operators along with subexponential bounds on the 
distance from the spectrum of such box restrictions 
to a given energy. From that result one can obtain 
localization through elimination of energy via an 
argument involving complexity bounds on semialge- 
braic sets (see Bourgain (2004)). 

A nonperturbative way to achieve the desired 
Green's function estimates uses Cramer's rule to 
represent the matrix elements of the resolvent. Then, 
in the one-dimensional (in space) case it is often 
possible to obtain the estimates from the positivity of 
Lyapunov exponents: uniformly for the numerator, 
and from large deviation bounds for the subharmonic 
functions for the denominator. This is done in one 
step for a sufficiently large scale (see the subsection
"Corollaries of positive Lyapunov exponents").

A perturbative way consists of establishing the 
desired estimates in a multiscale scheme: namely, the 
estimates are proved outside a set of parameters of 
(subexponentially) decaying (in the size of the box) 
measure. Moreover, this set should be shown to have 
a semialgebraic description, in order to make possible 
sublinear upper bounds on the number of times a 
trajectory of a given phase (under the underlying 
rotation or other ergodic transformation of the torus) 
hits the "forbidden" set. This, plus certain subhar- 
monic function arguments, allows passage to a larger 
scale through a repeated use of the resolvent identity. 
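The Cramer's rule representation used in the nonperturbative approach can be illustrated for a one-dimensional box. The sketch below assumes a box $\{1, \ldots, n\}$ with Dirichlet truncation, a slightly complex energy (so the box operator is certainly invertible), and the almost-Mathieu potential; all function names are illustrative. For a tridiagonal matrix with unit off-diagonal entries, the corner element of the resolvent is $(H - E)^{-1}(1, n) = (-1)^{n+1} / \det(H - E)$, and the determinant obeys the same three-term recurrence as the transfer matrices.

```python
import math

def almost_mathieu_potential(lam, omega, theta, n_sites):
    """V_k = lam * cos(2 pi (theta + k omega)) for k = 1..n_sites."""
    return [lam * math.cos(2.0 * math.pi * (theta + k * omega))
            for k in range(1, n_sites + 1)]

def box_determinant(potential, energy):
    """det(H - E) on the box, via P_k = (V_k - E) P_{k-1} - P_{k-2}."""
    previous, current = 0.0 + 0.0j, 1.0 + 0.0j  # P_{-1} = 0, P_0 = 1
    for v in potential:
        previous, current = current, (v - energy) * current - previous
    return current

def green_corner_cramer(potential, energy):
    """(H - E)^{-1}(1, n) from Cramer's rule: the relevant minor equals 1."""
    n = len(potential)
    return (-1) ** (n + 1) / box_determinant(potential, energy)

def green_corner_direct(potential, energy):
    """Same matrix element, by solving (H - E) x = e_n with the Thomas algorithm."""
    n = len(potential)
    upper, rhs = [], []  # modified superdiagonal and right-hand side
    for k, v in enumerate(potential):
        b = 1.0 if k == n - 1 else 0.0
        denom = (v - energy) if k == 0 else (v - energy) - upper[-1]
        r = b / denom if k == 0 else (b - rhs[-1]) / denom
        upper.append(1.0 / denom)
        rhs.append(r)
    x = rhs[-1]
    for k in range(n - 2, -1, -1):
        x = rhs[k] - upper[k] * x
    return x  # first component of the solution

if __name__ == "__main__":
    omega = (math.sqrt(5.0) - 1.0) / 2.0
    pot = almost_mathieu_potential(4.0, omega, 0.1, 20)
    print(green_corner_cramer(pot, 0.3 + 0.1j), green_corner_direct(pot, 0.3 + 0.1j))
```

When the Lyapunov exponent is positive the determinant is expected to grow exponentially in the box size, which is the mechanism behind the off-diagonal decay of the Green's function.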

An application that is most relevant to the current 
article is localization for a “true” d> 1 situation. 
The best currently available result is the following 
very recent theorem (Bourgain 2005): 


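Theorem 10 Let $d = b$ and let $f$ be real analytic on
$\mathbb{T}^d$ such that for all $i = 1, \ldots, d$ and
$(\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_d) \in \mathbb{T}^{d-1}$ the map
$\theta_i \mapsto f(\theta_1, \ldots, \theta_d)$
is a nonconstant function of $\theta_i \in \mathbb{T}$. Then for any
$\epsilon > 0$ there is $\lambda(f, \epsilon)$ s.t. for $\lambda > \lambda(f, \epsilon)$ there is
$\Omega(\lambda, f) \subset \mathbb{T}^d$ with mes($\Omega$) $< \epsilon$ so that for $\omega \notin \Omega$
operator [1] with $V_n = \lambda f(n_1\omega_1, \ldots, n_d\omega_d)$ has Anderson
(and dynamical) localization.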


This result was obtained previously, for $d = 2$
only, by Bourgain, Goldstein, and Schlag. There
were some serious purely arithmetic difficulties that
prevented an extension of this result to higher
dimensions. In the previous results on localization
there were two major steps: estimates on the
Green's function for fixed energy and elimination of
energy. The main difficulty in the multidimensional
case lies in establishing the sublinear bound
described above, which enters in the first step. It is
for this bound that an arithmetic condition on $\omega$ was
needed. The condition used was to guarantee that
the number of $(n_1, n_2) \in [1, N]^2$ such that
$(n_1\omega_1, n_2\omega_2) \pmod{\mathbb{Z}^2} \in S$ is bounded from above by $N^\alpha$ for
some $\alpha < 1$, uniformly for all semialgebraic sets $S$ of
degree $D$, with $D'/D = o(1/N)$ and with the
measure of all horizontal and vertical sections $S_x$
satisfying $\log \mathrm{mes}\, S_x = o(\log 1/N)$. This condition
roughly means that too many points close to an
algebraic curve of a bounded degree would force it
to oscillate more than it should. Such a statement is
essentially two dimensional and not extendable to
$d \geq 3$. In Theorem 10, Bourgain circumvents it by
using from the beginning the theory of semialgebraic
sets to eliminate energy and the translation variable
to get conditions on $\omega$ (that depend on the potential)
already in the first step.


Dynamical Localization 


Anderson localization does not in itself guarantee 
absence of quantum transport, or nonspread of an 
initially localized wave packet, as characterized, for 
example, by boundedness in time of moments of the 
position operator. This was first observed in del Rio 
et al. (1996), where a rather artificial example of 
coexistence of exponential localization and quantum 
transport was constructed. However, such phenom- 
ena also happen in models of interest to physicists 
such as the random dimer model. Considering for
simplicity the second moment

$$\langle x^2 \rangle_T = \frac{1}{T} \int_0^T \sum_n |\Psi_t(n)|^2\, n^2 \, dt,$$

we will say that $H$ exhibits dynamical localization
if $\langle x^2 \rangle_T <$ const. We will say that the family
$\{H_\theta\}_{\theta \in \mathbb{T}^b}$ exhibits strong dynamical localization if
$\int_{\mathbb{T}^b} d\theta\, \sup_T \langle x^2 \rangle_T <$ const. We note that the results
mentioned below will hold with more restrictive
definitions of dynamical localization (involving the
higher moments of the position operator) as well.
Dynamical localization implies pure point spectrum
by the RAGE theorem, so it is a strictly stronger notion.
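The contrast between ballistic transport and dynamical localization can be seen in a small simulation. The sketch below makes several assumptions: it uses the almost-Mathieu operator on a finite chain with a wave packet started at site 0, integrates the Schrödinger equation with a fourth-order Runge-Kutta step, and returns the instantaneous moment $\sum_n n^2 |\Psi_t(n)|^2$ rather than its time average. The name `second_moment` and all parameter values are illustrative.

```python
import math

GOLDEN_MEAN = (math.sqrt(5.0) - 1.0) / 2.0  # assumed irrational frequency

def second_moment(lam, omega, theta, time, half_width=40, dt=0.01):
    """Sum over n of n^2 |psi_t(n)|^2 for psi_0 = delta_0 on sites -half_width..half_width."""
    sites = list(range(-half_width, half_width + 1))
    potential = [lam * math.cos(2.0 * math.pi * (theta + n * omega)) for n in sites]
    size = len(sites)
    psi = [0j] * size
    psi[half_width] = 1.0 + 0j  # wave packet initially at site 0

    def rhs(state):
        # d(psi)/dt = -i H psi with (H psi)(n) = psi(n+1) + psi(n-1) + V_n psi(n).
        # The opposite sign convention gives the same |psi_t(n)|^2 since H is real.
        out = []
        for k in range(size):
            left = state[k - 1] if k > 0 else 0j
            right = state[k + 1] if k < size - 1 else 0j
            out.append(-1j * (left + right + potential[k] * state[k]))
        return out

    for _ in range(int(round(time / dt))):
        k1 = rhs(psi)
        k2 = rhs([p + 0.5 * dt * a for p, a in zip(psi, k1)])
        k3 = rhs([p + 0.5 * dt * a for p, a in zip(psi, k2)])
        k4 = rhs([p + dt * a for p, a in zip(psi, k3)])
        psi = [p + dt * (a + 2 * b + 2 * c + d) / 6.0
               for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]
    return sum(n * n * abs(p) ** 2 for n, p in zip(sites, psi))

if __name__ == "__main__":
    print("free:     ", second_moment(0.0, GOLDEN_MEAN, 0.0, 5.0))
    print("localized:", second_moment(6.0, GOLDEN_MEAN, 0.0, 5.0))
```

For $\lambda = 0$ the moment grows ballistically as $2t^2$, while for a large coupling such as $\lambda = 6$ it stays small over the simulated time, in line with the insulator picture.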

It turns out that nonperturbative methods allow 
for such dynamical upgrades as well. For the almost- 
Mathieu operator, strong dynamical localization 
holds throughout the regime of localization. It was 
shown by Bourgain and Jitomirskaya that in 
Theorems 4 and 6 as well as some other localization 
results, dynamical localization also holds (see 
Bourgain (2004)). However, methods that require 
elimination of certain frequencies based on implicit 
conditions currently do not provide sufficient infor- 
mation to obtain strong (i.e., averaged) dynamical 
localization, like what was done in the almost- 
Mathieu case. 


Quasiperiodic Localization and Cantor 
Spectrum 


A remarkable feature of quasiperiodic operators 
with b=d=1 is their tendency to have Cantor 
spectrum. In particular, it was conjectured that all 
almost-Mathieu operators (for all nonzero couplings 
and all irrational frequencies) have Cantor spec- 
trum. This conjecture became known as the Ten 
Martini problem. In a significant recent development
(Puig 2004), it was shown that for Diophantine
frequencies Cantor structure of the spectrum
follows from localization for phase $\theta = 0$, with
corresponding eigenvalues being the boundaries of
noncollapsed gaps. The key idea here is that for
energies dual to eigenvalues of $H_\theta$, corresponding to
localized eigenfunctions, the rotation number of the
transfer-matrix cocycle is of the form $k\omega \pmod{\mathbb{Z}}$,
thus they are the ends of the gaps (possibly
collapsed). However, a collapsed gap in this case
would correspond to reducibility of the system to
the identity, which can be shown to contradict the
simplicity of pure point spectrum for the dual
model. Since those energies form a dense subset of
the spectrum, the result follows. The same idea
works, thus establishing Cantor spectrum, for
potentials that are generic in a certain sense. Localization
also played an important role in the final proof


of the Ten Martini conjecture, for all irrationals 
(Avila and Jitomirskaya 2005). It can be shown that 
proving localization for a large set of phases allows 
one to conclude reducibility of the transfer-matrix 
cocycle for the dual model, for a large set of 
energies, and this in turn can be shown to contradict 
the presence of an interval in the spectrum.
Introduction


Loop quantum gravity (LQG) is a mathematical 
formalism that defines a tentative quantum theory 
of spacetime. Equally, the formalism provides a 
description of the gravitational field in regimes in 
which its quantum properties cannot be neglected. 
The distinctive feature of LQG is to be a quantum 
field theory consistent with general relativity. 

According to general relativity, the physical fields 
that form the world do not live on a background 
spacetime. Rather, these fields make up spacetime 
themselves (“background independence"). Accord- 
ingly, the quanta of a quantum field theory compatible 
with this principle — the s-knots described below — do 
not live on a background spacetime: rather, they 
themselves form physical spacetime. 

This physical idea is realized in the formalism by 
the gauge invariance under active diffeomorphisms 
of the manifold on which the fields are originally 
defined (“diffeomorphism invariance”). Such gauge 
invariance renders the localization of the field’s 
excitations on the manifold physically irrelevant. 

LQG implements these physical motivations by 
merging two traditional lines of thinking in theoretical 
physics. The first is the long-standing idea that gauge 
fields are naturally understood in terms of variables 
associated to lines (holonomies of the gauge connec- 
tion, Wilson loops, Faraday lines, . . .). This idea can be 
traced to Faraday’s initial intuition that gave birth to 
modern field theory: physical fields are real entities 
formed by lines. The second is the background- 
independent canonical or covariant quantization of 
general relativity developed by following the ideas of 
Wheeler, DeWitt, and Hawking. Each of these two 
lines of research has encountered serious obstructions, 
but the two turn out to solve each other's difficulties:
the formulation in terms of holonomies renders the old 
ill-defined background-independent quantum gravity 
well defined; conversely, background independence 
cures the divergences associated to the Wilson loop 
basis. 

The formalism of LQG can be separated into two
parts: a kinematics, describing the quantum properties
of space, and a dynamics, describing its
evolution. Here we outline the LQG kinematics,
and we give only the main result of the LQG
dynamics.
LQG can be extended to include standard matter
couplings such as fermions and Yang-Mills fields. It 
finds numerous applications, for instance, in early 
cosmology, astrophysics and black hole thermo- 
dynamics (see Black Hole Mechanics, Quantum 
Cosmology). 

So far no empirical evidence supports the physical 
correctness of this — nor of any other — tentative 
theory of quantum gravity. 


General Relativity in Canonical Form 


Classical general relativity is the field theory 
describing the gravitational field and the structure 
of physical spacetime. It is a well-established 
physical theory, strongly supported empirically. 

In its Riemannian version, the theory can be
written in canonical form in terms of two fields on a
three-dimensional (3D) manifold $X$ with coordinates
$x^a$ ($a = 1, 2, 3$): a 2-form $E = E^a \epsilon_{abc}\, dx^b dx^c$, called the
"triad field", and a 1-form $A = A_a dx^a$, called the
"gravitational connection" ($\epsilon_{abc}$ is the totally antisymmetric
tensor density). Both take values in the
su(2) algebra, and they satisfy the three "constraint"
equations

$$G = D_a E^a = 0 \qquad [1]$$
$$C_a = \mathrm{tr}[F_{ab} E^b] = 0 \qquad [2]$$
$$C = \mathrm{tr}[F_{ab} E^a E^b] = 0 \qquad [3]$$

$D_a$ is the SU(2) covariant derivative defined by the
connection $A$, $F_{ab}$ is the SU(2) curvature of $A$, and
the trace is on su(2).

$E$ and $A$ are canonically conjugate: their Poisson
brackets are $\{E^a(x), A_b(y)\} = 8\pi G c^{-3}\, \delta^a_b\, \delta^3(x, y)$, where
$G$ is the Newton constant, $c$ is the speed of light, $\delta^a_b$ is
the Kronecker delta, and $\delta^3(x, y)$ is the Dirac delta on
$X$, which is a scalar density in $x$. The Poisson brackets
of $G$ with the fields define their SU(2) gauge
transformations: $E$ transforms in the adjoint representation
and $A$ transforms as a connection. The
Poisson brackets of $C_a$ (more precisely, of an
appropriate linear combination of $C_a$ and $G$) with
the fields determine their transformation under a
diffeomorphism of $X$: $E$ transforms as a 2-form and $A$
as a 1-form. The Poisson brackets of $C$ with the fields
generate their coordinate time evolution. If the $t$
derivatives of the fields $E(x^a, t)$ and $A(x^a, t)$ are
given by their Poisson brackets with (the 3D integral
of) $C$, then (assuming that the determinant
$E = \sqrt{\det \mathrm{tr}[E^a E^b]}$ does not vanish) the metric field
$g^{tt} = 1$, $g^{ta} = 0$, $g^{ab} = \mathrm{tr}[E^a E^b]/E$ is a general solution
of the Riemannian Einstein equations in a fixed gauge.

The physical Lorentzian theory can be obtained in
this formalism in two ways: either by adding an
appropriate term to eqn [3], or by taking A in
sl(2,C), subject to a suitable reality condition.
(For more details, see Canonical General Relativity.) 


Spin Network and s-Knot States 


LQG can be defined as a Schrödinger quantization
of the canonical formalism described above. The
space of the quantum states is defined as a Hilbert
space K of Schrödinger wave functionals $\Psi[A]$ of the
gravitational connection. The nontrivial aspect of
this construction is the definition of a scalar product
invariant under the two kinematical gauge invariances
of the theory: the local SU(2) and the
diffeomorphism transformations generated by the
constraints [1] and [2]. The state space K is defined
as follows (see Quantum Geometry and its Applications
for an essentially equivalent construction).
Given an su(2) connection A and an oriented path
$\gamma: s \in [0,1] \mapsto x^a(s) \in X$, recall that the “holonomy”
$U[A, \gamma]$ of A along $\gamma$ is the element of SU(2) defined by


$$\frac{d}{ds} U[A, \gamma](s) + \dot\gamma^a(s)\, A_a(\gamma(s))\, U[A, \gamma](s) = 0$$    [4]
$$U[A, \gamma](0) = 1, \qquad U[A, \gamma] = U[A, \gamma](1)$$    [5]


where $\dot\gamma^a(s) = dx^a(s)/ds$ is the tangent to the path.
The solution of this equation is usually written in
the form


$$U[A, \gamma] = \mathcal{P}\, e^{-\int_\gamma A}$$    [6]


where the path ordering $\mathcal{P}$ is understood as acting on
the power series expansion of the exponential.
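
As a numerical illustration of eqn [4], the following sketch approximates the holonomy along a discretized path by composing short-step SU(2) exponentials, with later steps multiplying from the left. It assumes, purely for illustration, the parametrization $A_a = -(i/2)\, a_a^i \sigma_i$ with real components $a_a^i$ and Pauli matrices $\sigma_i$; the function names (`su2_exp`, `holonomy`) and the sample connection in the usage lines are hypothetical and are not taken from the article.

```python
import math

IDENTITY = ((1 + 0j, 0j), (0j, 1 + 0j))


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def su2_exp(w):
    """Closed form of exp(-(i/2) w.sigma) for a real 3-vector w."""
    theta = math.sqrt(sum(c * c for c in w))
    if theta == 0.0:
        return IDENTITY
    nx, ny, nz = (c / theta for c in w)
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c - 1j * s * nz, (-1j * nx - ny) * s),
            ((-1j * nx + ny) * s, c + 1j * s * nz))


def holonomy(components, path, steps=2000):
    """Approximate U[A, gamma] from dU/ds + gamma_dot^a A_a U = 0, U(0) = 1.

    components(x) -> 3x3 nested list a[a][i], assuming A_a = -(i/2) a_a^i sigma_i
    path(s)       -> point x(s) in R^3 for s in [0, 1]
    """
    u = IDENTITY
    ds = 1.0 / steps
    for k in range(steps):
        x0, x1 = path(k * ds), path((k + 1) * ds)
        dx = [x1[a] - x0[a] for a in range(3)]      # gamma_dot^a ds
        a_mid = components(path((k + 0.5) * ds))    # connection at the midpoint
        # A_a dx^a = -(i/2) w.sigma with w^i = a_a^i dx^a
        w = [sum(a_mid[a][i] * dx[a] for a in range(3)) for i in range(3)]
        # One step of eqn [4]: U(s + ds) ~ exp(-A_a dx^a) U(s)
        u = matmul(su2_exp([-c for c in w]), u)
    return u


# Usage: a constant (hypothetical) connection along a straight path on the x axis.
u = holonomy(lambda x: [[0.3, 0, 0], [0, 0.5, 0], [0, 0, 0.7]],
             lambda s: (2.0 * s, 0.0, 0.0))
```

For a constant connection along a straight line all steps commute, so the result reduces to a single exponential; for a generic connection the order of the factors matters, which is what the path ordering encodes.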

Let $\mathcal{A}$ be the space of the smooth connections A on
X. (For technical reasons, it is convenient to consider
smooth fields A defined everywhere in X except at
most at a finite number of points, and the group
Diff* of the “extended diffeomorphisms” defined by
the continuous invertible maps $\phi: X \to X$ that are
smooth everywhere in X except at most at a finite
number of points.) A graph $\Gamma$ is an ordered collection
of smooth oriented paths $\gamma_l$, denoted as links, with
$l = 1, \ldots, L$, where the links overlap only at their
endpoints, called nodes. Given a graph $\Gamma$ and a
smooth, Haar-integrable complex function
$f: U \in (SU(2))^L \mapsto f(U) \in \mathbb{C}$, the couple $(\Gamma, f)$ defines the
(“cylindrical”) functional of A


$$\Psi_{\Gamma,f}[A] = f(U[A, \Gamma])$$    [7]


$$U[A, \Gamma] = (U[A, \gamma_1], \ldots, U[A, \gamma_L])$$    [8]


Let $\mathcal{L}$ be the linear space of all functionals $\Psi_{\Gamma,f}[A]$,
for all $\Gamma$ and f. $\mathcal{L}$ is dense (in an appropriate sense) in
the space of all continuous functionals on $\mathcal{A}$.

An SU(2) and Diff* invariant scalar product can
be defined in $\mathcal{L}$ as follows. If two functionals
$\Psi_{\Gamma,f}[A]$ and $\Psi_{\Gamma,g}[A]$ are defined by the same graph
$\Gamma$, define


$$\langle \Psi_{\Gamma,f} | \Psi_{\Gamma,g} \rangle = \int dU\, \overline{f(U)}\, g(U)$$    [9]


where dU is the Haar measure on $(SU(2))^L$. The
extension to functionals defined on different graphs
is obtained by observing that $(\Gamma, f)$ and $(\Gamma', f')$ define
the same functional if $\Gamma$ contains $\Gamma'$ and f is
independent of the variables in $\Gamma$ but not in $\Gamma'$. It
follows that any two given functionals $\Psi_{\Gamma',f'}$ and
$\Psi_{\Gamma'',g''}$ can be written as functionals $\Psi_{\Gamma,f}$ and $\Psi_{\Gamma,g}$
with the same graph $\Gamma$, where $\Gamma$ is obtained from the
union of $\Gamma'$ and $\Gamma''$. Using this, the scalar product [9]
is defined for any two functionals in $\mathcal{L}$:


$$\langle \Psi_{\Gamma',f'} | \Psi_{\Gamma'',g''} \rangle = \langle \Psi_{\Gamma,f} | \Psi_{\Gamma,g} \rangle$$    [10]


Standard completion in the Hilbert norm defines the
kinematical Hilbert space K of LQG. $\mathcal{L}$ is dense in K
and defines the Gelfand triple $\mathcal{L} \subset K \subset \mathcal{L}^*$. K carries
a natural unitary representation of the group of local
SU(2) transformations and a natural unitary representation
$U_\phi$ of the group of the extended diffeomorphisms
of X. These two properties are nontrivial;
they represent the main physical motivation for the
definition of the scalar product. The SU(2)-invariant
subspace of K is a proper subspace $K_0$.

An orthonormal basis in $K_0$ can be defined using
the Peter-Weyl theorem. The basis states are labeled
by a graph $\Gamma$, by the assignment of a nonvanishing
spin $j_l$ to each link $\gamma_l \in \Gamma$ and by the assignment of a
basis element $i_n$ in the space of the intertwiners
(invariant tensors in the tensor product of the
representation spaces of the adjacent links) at each
node n of $\Gamma$. The triple $S = (\Gamma, j_l, i_n)$ is called an
imbedded spin network. The quantum state
$\Psi_S[A] = \langle A | S \rangle$ in $K_0$ labeled by the spin network
$S = (\Gamma, j_l, i_n)$ is the cylindrical function obtained by
contracting the representation matrices of the
holonomies $U[A, \gamma_l]$, in the representations $j_l$, with
the invariant tensors at the nodes.

The diffeomorphism-invariant state space $K_{\rm diff}$ is
the SU(2) and diffeomorphism invariant subspace of
$\mathcal{L}^*$. It is the (closure of the) image of the map
$P_{\rm diff}: \mathcal{L} \to \mathcal{L}^*$ defined by


$$(P_{\rm diff} \Psi)(\Psi') = \sum_{\Psi'' = U_\phi \Psi} \langle \Psi'', \Psi' \rangle, \qquad \Psi, \Psi' \in K$$    [11]


The sum is over all states $\Psi''$ in $\mathcal{L}$ for which there
exists a diffeomorphism $\phi$ such that $\Psi'' = U_\phi \Psi$; this
is a finite sum. The scalar product on this image is
naturally defined by


$$\langle P_{\rm diff} \Psi_S, P_{\rm diff} \Psi_{S'} \rangle_{\rm diff} = (P_{\rm diff} \Psi_S)(\Psi_{S'})$$    [12]


The space $K_{\rm diff}$ obtained in this manner is separable.
The images $|s\rangle = P_{\rm diff} |S\rangle$ of the spin network states
are called s-knot states. They span $K_{\rm diff}$. They are
determined only by the diffeomorphism equivalence
class s of the spin network S, namely, by an abstract
(non-imbedded) knotted graph, colored with spins
and intertwiners. These colored knots are called
s-knots or abstract spin networks. The s-knot states
have a straightforward physical interpretation as
quantum excitations of space, discussed below.


Operators and Quanta of Space 


The state space defined above carries a quantum
representation of classical observables of general
relativity. The classical quantity $U[A, \gamma]$, a function
of the field variable A, acts naturally as a multiplicative
operator on K. Thus, K provides a
Schrödinger functional representation $\Psi[A]$ of quantum
gravity, which diagonalizes the (holonomy of
the) gravitational connection. The two constraints
[1] and [2] generate SU(2) gauge and diffeomorphism
transformations on A. The corresponding
transformations on the Schrödinger functional states
$\Psi[A]$ are given by the unitary representations
mentioned above. The quantum implementation of
the two constraint equations [1] and [2], following
Dirac's theory of constrained quantum systems, is
the requirement of invariance under these transformations.
The space $K_{\rm diff}$ is the solution to these
requirements.

The triad field operator E can be defined only if
suitably smeared. Since E is a 2-form, its geometrically
natural smearing is with a 2D surface. (The
1-form field A is smeared over a line in $U[A, \gamma]$.)
Given a finite 2D surface $S: \sigma = (\sigma^1, \sigma^2) \mapsto x^a(\sigma) \in X$,
the smeared field


$$E[S] = \int_S d\sigma^1 d\sigma^2\, \epsilon_{abc} \frac{\partial x^a}{\partial \sigma^1} \frac{\partial x^b}{\partial \sigma^2}\, E^c(x(\sigma))$$    [13]


is quantized by the functional derivative operator


$$E[S] = -i\hbar\, \frac{8\pi G}{c^3} \int_S d\sigma^1 d\sigma^2\, \epsilon_{abc} \frac{\partial x^a}{\partial \sigma^1} \frac{\partial x^b}{\partial \sigma^2}\, \frac{\delta}{\delta A_c(x(\sigma))}$$    [14]


This operator is well defined on K and the quantum
operators $E[S]$ and $U[A, \gamma]$ define a linear representation
of the Poisson algebra of the corresponding
classical quantities. Thus, they define a quantization
of the kinematics of general relativity. Notice that in
a general covariant quantum field theory, field
operators can be well defined even if smeared on
low-dimensional regions, while in conventional
quantum field theory, these operators need to be
smeared over 3D or 4D regions.

A simple calculation shows that if S and $\gamma$
intersect once,


$$E_v[S]\, U[A, \gamma] = \pm i\hbar\, \frac{8\pi G}{c^3}\, U[A, \gamma_1]\, v\, U[A, \gamma_2]$$    [15]


where $v \in su(2)$, we have written $E_v = \mathrm{tr}[vE]$, $\gamma_{1,2}$
are the two paths into which $\gamma$ is partitioned by the
surface, and the sign is determined by the relative
orientation of S and $\gamma$. More generally, $E[S]\, U[A, \gamma]$
is a sum of one such term per intersection between S
and $\gamma$.

Composite operators can be constructed in terms
of these operators. In particular, using standard
formulas in classical general relativity, the area of
the surface S can be written as a Riemann sum


$$A[S] = \lim_{N \to \infty} \sum_n \sqrt{\mathrm{tr}[E(S_n) E(S_n)]}$$    [16]


where $S_n$, $n = 1, \ldots, N$, is a Riemann partition of
the surface. A straightforward calculation based on
eqn [15] shows that, if S cuts n links of a spin
network carrying spins $(j_1, \ldots, j_n) = \vec{j}$, then the spin
network state $|S\rangle$ is an eigenstate of $A[S]$ with
eigenvalue


$$A = \frac{8\pi\hbar G}{c^3} \sum_i \sqrt{j_i(j_i + 1)}$$


where $j_i = 1/2, 1, 3/2, 2, \ldots$ These are therefore
discrete eigenvalues of the area. All eigenvalues of
the area operator $A[S]$ are real and discrete and
$A[S]$ is a self-adjoint operator. Similar results are
obtained for the volume operator, which receives a
discrete contribution from each node of a spin
network.
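
The area eigenvalue above is straightforward to evaluate for a given set of spins. The following is a minimal sketch that returns the sum of $\sqrt{j_i(j_i+1)}$ over the links crossing the surface, in units of $8\pi\hbar G/c^3$; the function name `area_eigenvalue` and the choice to work in these units (rather than inserting numerical values for the constants) are assumptions made for illustration.

```python
import math
from fractions import Fraction


def area_eigenvalue(spins):
    """Sum of sqrt(j(j+1)) over the given link spins, in units of 8*pi*hbar*G/c^3."""
    total = 0.0
    for j in spins:
        j = Fraction(j)
        # Spins labelling links are nonvanishing half-integers: 1/2, 1, 3/2, ...
        if j <= 0 or (2 * j).denominator != 1:
            raise ValueError(f"spin must be a positive half-integer, got {j}")
        total += math.sqrt(j * (j + 1))
    return total


# Usage: a surface crossed by a single spin-1/2 link, then by two spin-1 links.
smallest = area_eigenvalue([Fraction(1, 2)])
two_links = area_eigenvalue([1, 1])
```

The smallest nonzero value corresponds to a single link with j = 1/2, which makes the discreteness of the spectrum explicit.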

These spectral properties of the area and volume
operators determine the physical interpretation of
the spin network states: the nodes of the spin
network represent quanta of space with quantized
volume; the nodes are connected by links representing
quanta of surface with quantized area. The
graph $\Gamma$ determines the adjacency relations between
the individual quanta of space; the intertwiners $i_n$
are volume quantum numbers; the spins $j_l$ are area
quantum numbers.

The interpretation carries over to the s-knots, which
represent the same quantum excitations of space, up to
its manifold coordinatization, which is physically
irrelevant because of the gauge invariance under
diffeomorphisms of X. An s-knot state $|s\rangle$ with N
nodes represents a quantum excitation of space with N
quanta of space adjacent to one another according to
the connectivity of $\Gamma$ (see Figure 1).

Figure 1 The graph of an s-knot, namely an abstract spin network,
and the set of quanta of space it represents. Each node n of the
graph defines a quantum of space. The associated intertwiner $i_n$ is
the corresponding volume quantum number. Two quanta of space
are adjacent if the corresponding nodes are linked. A link $\gamma_l$ cuts
the elementary surface separating the two quanta and its spin $j_l$ is
the area quantum number of this surface.

Notice that the quantum states $|s\rangle$ do not
represent quantum excitations living in the physical
space: they represent quantum excitations of the
physical space. For instance, the state $|0\rangle$ defined by
the empty graph does not represent an “empty”
physical space, but the absence of any physical
space. A generic quantum state of the physical space
is represented by a normalizable linear superposition
of these discrete quantized spacetimes (see Knot
Invariants and Quantum Gravity).

In a nongeneral covariant context, the kinematical 
quantization predictions of quantum theory (such as 
the quantization of the angular momentum) are 
obtained from the spectral properties of operators 
that represent measurements at a given time. In the 
general covariant Hamiltonian formalism, the corre- 
sponding kinematical quantization predictions are 
given by spectral properties of “partial observables” 
operators, which in general are not gauge invariant in 
the sense of Dirac. Area and volume are partial 
observables of this kind. Their spectra are therefore 
interpreted as physical predictions of LQG (up to an 
overall numerical factor, called the Immirzi parameter, 
which is obtained in certain variants of the theory). 


Dynamics 


The dynamics of the theory is obtained in terms of a 
“Hamiltonian constraint” operator C that quantizes 
the constraint [3]. Different variants of the operator 
C, and of its Lorentzian version, have been 
constructed. The operator is defined via a suitable 
regularization procedure. The description of these 
constructions exceeds the scope of this article, and
we limit ourselves here to mentioning the main result
and a few general comments.

The main result of the LQG dynamics is that C
turns out to be well defined and ultraviolet-finite
when restricted to $K_{\rm diff}$. Finiteness holds also when
standard matter couplings, such as Yang-Mills fields 
and fermions, are added. 

The reason for this finiteness can be understood as a
consequence of the discrete nature of space implied by
the spectral properties of the geometric operators
described above. The limit in which the ultraviolet
cutoff, introduced to regulate C, is removed turns
out to be trivial on the diffeomorphism-invariant states
in $K_{\rm diff}$. This is because this limit probes the
short-distance regime, but there is no physical
(gauge-invariant) short distance in a theory in which
geometry turns out to be quantized at the Planck
scale. Since the physical states in $K_{\rm diff}$ define a physical
geometry only at scales larger than the Planck scale
$\sqrt{\hbar G c^{-3}}$, the “short-distance” modes in the coordinate
manifold X turn out to be pure gauge. This interplay
between quantum field-theoretical and
general-relativistic physics is the distinctive character of LQG.

Finally, we sketch the formal structure that
dynamics can take in the general covariant
Hamiltonian formalism of LQG. The operator C
defines a linear operator $P \sim \delta(C)$, usually (improperly)
denoted the “projector,” which sends states in
$K_{\rm diff}$ into the kernel of C, formed by the generalized
$K_{\rm diff}$ vectors that solve the Wheeler-De Witt equation
$C\Psi = 0$ (see Wheeler-De Witt Theory). Matrix
elements of P are interpreted as transition amplitudes
between quantum states of space.

Physical predictions for processes that take place
in a finite spacetime region R can be obtained, in
principle, as follows. One considers a state $|\Psi\rangle$
representing the result of the measurement of partial
observables of the 3D boundary of a spacetime
region R. $|\Psi\rangle$ codes the nonrelativistic notions of
initial, boundary and final conditions. Then $\langle 0|P|\Psi\rangle$
can be interpreted as a relative probability amplitude
associated to this result. A formal expansion of
this amplitude in powers of C generates a spinfoam
sum (see Spin Foams) that can be understood as the
“quantum gravity sum over histories” in R.

A systematic technique for computing physical 
transition amplitudes from the  background- 
independent and nonperturbative formalism of 
LQG has not yet been developed. 
Introduction


Einstein's (1916) use of differential geometry as an 
essential tool in his theory of general relativity has 
long been a motivation for the study of Lorentzian 
geometry. More recently, the influential mono- 
graphs of R Penrose (1972) and of S Hawking and 
G Ellis (1973), the latter still cited by some as the 
Bible of general relativity, so fascinated differential 
geometers that Lorentzian geometry took its place 
alongside of global Riemannian geometry as a 
worldwide research area. 

Let M be a smooth n-dimensional manifold, $n \geq 2$,
with a countable basis. A Lorentz metric $g = \langle\ ,\ \rangle$
on M is a symmetric nondegenerate (0, 2) tensor field
on M of index $(-, +, \ldots, +)$. The existence of such
a tensor field implies that M admits a (non-oriented)
line field; hence, some compact manifolds like $S^2$ do
not admit such metrics. A nonzero tangent vector v in
TM is then timelike (resp., nonspacelike, null, spacelike)
according to whether $g(v, v) < 0$ (resp.,
$\leq 0$, $= 0$, $> 0$). A Lorentzian manifold (M, g) is a
pair consisting of a smooth manifold together with a
choice of Lorentz metric. In this article, we use the
convention that a spacetime (M, g) is a Lorentzian
manifold together with a choice of time orientation,
that is, a continuous timelike vector field X on M.
Then a tangent vector v based at p may be
consistently defined to be future (resp., past) directed
if $g(X(p), v) < 0$ (resp., $> 0$). (Some authors also
require that (M, g) be space oriented.) If a Lorentzian
manifold happens not to be time orientable, then a
2-fold covering manifold with the induced pullback
metric will be time orientable. Also basic are the
notations $p \ll q$ (resp., $p \leq q$) if there is a
future-directed timelike (resp., nonspacelike) curve from p
to q and the corresponding chronological (resp.,
causal) future of p given by $I^+(p) = \{q \in M;\ p \ll q\}$
and $J^+(p) = \{q \in M;\ p \leq q\}$.
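
The definitions of causal character and time orientation can be made concrete at a single tangent space. The sketch below assumes the flat metric diag(-1, +1, ..., +1) and, by default, the time orientation X = d/dt; both choices, and the function names, are illustrative assumptions rather than part of the article. Exact comparison with zero is used, so the example is meant for integer or otherwise exactly representable components.

```python
def lorentz_product(v, w):
    """g(v, w) for the flat metric of index (-, +, ..., +); component 0 is time."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))


def causal_character(v):
    """Classify a nonzero tangent vector as timelike, null, or spacelike."""
    if all(c == 0 for c in v):
        raise ValueError("causal character is defined for nonzero vectors only")
    norm = lorentz_product(v, v)
    if norm < 0:
        return "timelike"
    if norm == 0:
        return "null"
    return "spacelike"


def is_future_directed(v, time_field=None):
    """True if the nonspacelike vector v satisfies g(X, v) < 0."""
    if causal_character(v) == "spacelike":
        raise ValueError("future/past direction applies to nonspacelike vectors")
    if time_field is None:
        # Assumed time orientation X = d/dt.
        time_field = (1,) + (0,) * (len(v) - 1)
    return lorentz_product(time_field, v) < 0


# Usage in a 3-dimensional example with components (t, x, y).
kinds = [causal_character(v) for v in [(2, 1, 0), (1, 1, 0), (1, 2, 0)]]
```

A nonspacelike vector is then either timelike or null, and the sign of g(X, v) splits these into future- and past-directed vectors.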

For a Riemannian manifold $(N, g_0)$, the Riemannian
distance function


$$d_0: N \times N \to [0, +\infty)$$    [1]


is given by $d_0(p, q) = \inf\{L(c);\ c: [0,1] \to N \text{ is a piecewise smooth curve with } c(0) = p \text{ and } c(1) = q\}$. A
fundamental result in global Riemannian geometry
is the celebrated Hopf-Rinow theorem.


Hopf-Rinow Theorem For any Riemannian
manifold $(N, g_0)$, the following conditions are
equivalent:


(i) metric completeness: $(N, d_0)$ is a complete
metric space;

(ii) geodesic completeness: for any v in TN, the
geodesic $c_v(t)$ in N with initial condition
$c_v'(0) = v$ is defined for all values of an affine
parameter t;

(iii) for some point p in N, the exponential map
$\exp_p$ is defined on all of $T_pN$;

(iv) finite compactness: every subset K of N that is
$d_0$-bounded has compact closure.

Moreover, if any one of (i)-(iv) holds, then
$(N, g_0)$ also satisfies

(v) minimal geodesic connectedness: given any p, q
in N, there exists a smooth geodesic segment
$c: [0,1] \to N$ with $c(0) = p$, $c(1) = q$ and
$L(c) = d_0(p, q)$.


A Riemannian metric for a smooth manifold is
then said to be complete if it satisfies any of the
above properties (i) through (iv). The Heine-Borel
property of basic topology implies (via (iv)) that all
Riemannian metrics for a compact manifold are
automatically complete and many of the examples
studied in basic Riemannian geometry are complete.
Also, if Riem(N) denotes the space of all Riemannian
metrics for a smooth manifold N, both geodesic
completeness (property (ii) above) and geodesic
incompleteness (the failure of property (ii) to hold
for all geodesics) are $C^0$ stable properties on
Riem(N), that is, given a complete (resp., incomplete)
metric g for N, there exists an open neighborhood
U(g) of g in Riem(N) in the Whitney $C^0$ fine
topology such that all Riemannian metrics h in U(g)
are complete (resp., incomplete).

For spacetimes (M, g), however, many basic
examples furnished by general relativity fail to be
geodesically complete and compactness of the
underlying smooth manifold M does not imply that
the given Lorentz metric g (let alone all Lorentz
metrics for M) is complete. Also, the stability of
geodesic completeness and incompleteness is more 
complicated than in the Riemannian case, necessi- 
tating concepts like pseudoconvex geodesic systems 
and disprisonment as studied by Beem and Parker. 
To summarize, for spacetimes and their associated 
Lorentzian distance functions, no naive analogs 
for the Hopf-Rinow theorem are valid. Under 
additional hypotheses, geodesic completeness may 
be guaranteed. Marsden noted that a compact 
spacetime with a homogeneous Lorentz metric is
geodesically complete. Then Carriere showed that a 
compact spacetime whose curvature tensor vanishes 
is geodesically complete. Later Kamishima (assum- 
ing constant curvature) and then Romero and 
Sanchez more generally showed that a compact 
Lorentzian manifold which admits a timelike Killing 
field is geodesically complete. 

At any point p in a given spacetime, emanating 
from p are three families of geodesics: timelike, 
spacelike, and null. It was hoped in the 1960s that 
possibly continuity arguments could be obtained for 
different types of geodesic completeness. However, a 
series of examples showed by the mid-1970s that 
timelike geodesic completeness, null geodesic com- 
pleteness, and spacelike geodesic completeness are 
logically inequivalent. (Here, a given geodesic is said 
to be complete if it may be extended to be defined 
for all values of an affine parameter.) Nomizu and
Ozeki showed for Riemannian manifolds that any
given Riemannian metric $g_0$ for the smooth manifold
N could be made geodesically complete by
making a conformal change of metric $\Omega g_0$, where
$\Omega: N \to (0, +\infty)$ is a smooth function. Especially in
general relativity, such conformal changes are
natural because the causal character of tangent
vectors and curves (and hence the basic causality
conditions) is preserved. For spacetimes, while
nonspacelike geodesic completeness could not
generally be produced by conformal changes, for some
subclasses of spacetimes, such as the strongly causal
ones, it was possible with a global conformal
change.

For a large class of spacetimes, the warped or 
multiwarped products (originally inspired by several 
cosmological models in general relativity and a basic 
construction from Riemannian geometry), explicit 
integral criteria involving the warping functions
have been given for timelike or null geodesic 
completeness. Several early examples of this type 
of result are discussed in Beem et al. (1996, 
pp. 111-112). 


Lorentz Distance and the Nonspacelike 
Cutlocus 


For an arbitrary, not necessarily complete, Riemannian
manifold $(N, g_0)$, the Riemannian distance function
given in eqn [1] is continuous, the metric topology
induced by $d_0$ coincides with the given manifold
topology, and $d_0(p, q)$ is finite for all p, q in N.
Now, for an arbitrary spacetime (M, g), and p, q
in M, if there is no future-directed nonspacelike
curve from p to q, set $d(p, q) = 0$; if there is such a
curve, let


$$d(p, q) = \sup\{L(c);\ c: [0,1] \to (M, g) \text{ is a piecewise smooth future-directed nonspacelike curve with } c(0) = p \text{ and } c(1) = q\}$$    [2]


(Unlike the Riemannian case, [2] does not bound
d(p, q) from above by L(c) for any selected curve c
and hence the Lorentz distance may assume the
value $+\infty$.)

This then defines what some authors term the
“Lorentzian distance function”


$$d = d(g): M \times M \to [0, +\infty]$$    [3]


and other authors term “proper time.” It is linked to
the causal structure of the given spacetime since


$$d(p, q) > 0 \text{ iff } q \text{ is in } I^+(p)$$    [4]


and in place of the triangle inequality for the
Riemannian distance function, a reverse triangle
inequality holds:


$$\text{if } p \leq r \leq q, \text{ then } d(p, q) \geq d(p, r) + d(r, q)$$    [5]


Also in the context of eqn [2], a future-directed
nonspacelike curve $c: [0,1] \to M$ from $c(0) = p$ to
$c(1) = q$ is defined to be maximal if $L(c) = d(p, q)$.
Corresponding to the Riemannian theory, a maximal
nonspacelike curve turns out to be a smooth
null or timelike geodesic segment.
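
Properties [4] and [5] can be checked by hand in the simplest example. The sketch below assumes flat Minkowski spacetime with points given as (t, x_1, ..., x_n), where the supremum in [2] is attained by the straight segment, so that d(p, q) is the square root of (Δt)² minus |Δx|² when q lies in the causal future of p, and 0 otherwise. The function name `minkowski_distance` and the sample points are illustrative assumptions.

```python
import math


def minkowski_distance(p, q):
    """Lorentzian distance d(p, q) in Minkowski spacetime (assumed example)."""
    dt = q[0] - p[0]
    dx2 = sum((b - a) ** 2 for a, b in zip(p[1:], q[1:]))
    if dt < 0 or dt * dt < dx2:
        return 0.0  # q is not in the causal future of p
    return math.sqrt(dt * dt - dx2)


# Usage: three causally ordered events p <= r <= q in two dimensions.
p, r, q = (0.0, 0.0), (2.0, 1.0), (5.0, 0.0)
direct = minkowski_distance(p, q)
via_r = minkowski_distance(p, r) + minkowski_distance(r, q)
```

Here the direct distance is at least the sum through the intermediate event r, as the reverse triangle inequality requires, and d(q, p) vanishes even though d(p, q) does not, so the Lorentzian distance is not symmetric.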


As mentioned earlier, geodesic completeness is
generally not a natural requirement to place on
a spacetime. But what emerges from [4] in place of
Riemannian completeness is an interplay between
the causal properties of the given spacetime and
the continuity (and other properties) of the
Lorentzian distance function (cf. Beem et al. (1996,
chapter 4)). At the extreme of totally vicious
spacetimes, the Lorentz distance is always $+\infty$.
Less drastically, if (M, g) contains a closed timelike
curve passing through p, then $d(p, q) = +\infty$ for all
q in $J^+(p)$. Also, certain cosmological models
contain pairs of points at infinite distance. In
general, Lorentzian distance is only lower semicontinuous.
Adding upper semicontinuity forces a
distinguishing spacetime to be causally continuous.
A spacetime is chronological iff $d(p, p) = 0$ for all p
in M. At the other extreme from totally vicious
spacetimes are globally hyperbolic spacetimes,
which share many properties somewhat analogous
to complete Riemannian manifolds. The Lorentzian
distance function of a globally hyperbolic spacetime
is both continuous and finite valued. (Indeed, a
strongly causal spacetime is globally hyperbolic iff
all Lorentz metrics $g'$ in the conformal class
C(M, g) also have finite-valued distance functions
$d(g')$.) Second, corresponding to property (v) of the
Hopf-Rinow Theorem, these spacetimes all satisfy
maximal nonspacelike geodesic connectability:
given any p, q in M with $p \leq q$, there exists a
future nonspacelike geodesic segment $c: [0,1] \to M$
with $c(0) = p$, $c(1) = q$ and $L(c) = d(p, q)$.

A basic concept from the calculus of variations is
that of a pair of conjugate points along a geodesic
segment $c: [0, a] \to (M, g)$. A smooth vector field
J(t) along c is said to be a “Jacobi field” if J satisfies
the Jacobi differential equation


$$J'' + R(J, c')c' = 0$$    [6]


where R denotes the curvature tensor. Then
c(t), c(s) are said to be conjugate points along c if
there exists a nonzero Jacobi field J along c with
$J(t) = J(s) = 0$. Many of the basic comparison
techniques in global Riemannian geometry involving
lengths of geodesics in manifolds satisfying curvature
inequalities, such as the “Rauch comparison
theorems,” the “Toponogov triangle comparison
theorem,” and volume comparison theorems, were
first obtained through Jacobi field techniques
(cf. Petersen (1998) for a contemporary account).
Later, Riccati equation techniques became more
popular (cf. Karcher (1989)). For spacetimes,
especially in the globally hyperbolic case, analogous
results have been obtained for nonspacelike geodesic
segments, with a key breakthrough in 1979 being
Harris’s version of the “Toponogov triangle comparison
theorem” for timelike geodesic triangles in
globally hyperbolic spacetimes. The Raychaudhuri
equation used earlier in general relativity corresponds
for spacetimes to this passage in the
Riemannian setting from the Jacobi equation to the
Riccati equation. The basic conjugate point theory
and the Morse index theory for an arbitrary timelike
or null geodesic segment in a general spacetime are
reasonably close to the earlier Riemannian theory, if
vector fields of the form $J(t) = f(t)\beta'(t)$ are accounted
for in the case of a null geodesic segment
$\beta: [0,1] \to (M, g)$. But spacelike geodesics and
conjugate points are more problematic, as was first
established using symplectic techniques by Helfer in
1994. More recently, progress has been made in
applying important ideas of Gromov (1999) for
Riemannian manifolds to the spacetime context
(cf. Noldus (2004) for an example).

Inspired by fundamental concepts in global
Riemannian geometry, Beem and Ehrlich in 1979
introduced the concept of nonspacelike cut
point, again most tractable for globally hyperbolic
spacetimes. Let $\gamma: [0, a) \to (M, g)$ be a
future-inextendible, future-directed nonspacelike geodesic
in an arbitrary spacetime. Define


$$t_0 = \sup\{t \in [0, a);\ d(\gamma(0), \gamma(t)) = L(\gamma|_{[0,t]})\}$$    [7]


(If there is a closed timelike curve through $\gamma(0)$,
then $d(\gamma(0), \gamma(0)) = +\infty$ and $t_0$ will not exist. If $\gamma$ is
a nonspacelike geodesic ray and hence
$d(\gamma(0), \gamma(t)) = L(\gamma|_{[0,t]})$ for all t, then $t_0 = a$.) However,
if $0 < t_0 < a$, then $\gamma(t_0)$ is said to be the future
nonspacelike cut point of $p = \gamma(0)$ along $\gamma$. For
general spacetimes, it may be shown that:


1. for $0 < s < t < t_0$, $\gamma|_{[s,t]}$ is the unique
maximal nonspacelike geodesic in all of (M, g)
between $\gamma(s)$ and $\gamma(t)$;

2. $\gamma|_{[0,t]}$ is maximal for all t with $0 \leq t \leq t_0$; and

3. for all t with $t_0 < t < a$, there is a longer
nonspacelike curve in (M, g) than $\gamma|_{[0,t]}$ between
$\gamma(0)$ and $\gamma(t)$.


A nonspacelike cut point is a subtler concept than
a nonspacelike conjugate point since the existence of
a cut point is not necessarily captured by the
behavior of families of future nonspacelike curves
(or geodesics) close to the given geodesic segment $\gamma$,
the basic viewpoint of the calculus of variations. But
since calculus of variations arguments show that
past a nonspacelike conjugate point, longer “neighboring
curves” join $\gamma(0)$ to $\gamma(t)$, the future cut point
of $p = \gamma(0)$ along $\gamma$ comes no later than the first
future conjugate point to p along $\gamma$ in either the
timelike or null geodesic case.

In a startling result which contradicted erroneous 
arguments in all the standard textbooks, Margerin 
in 1993 gave examples to show that even for 
compact Riemannian manifolds, the first conjugate 
locus of a point (i.e., the set of all first conjugate 
points along all geodesics issuing from a given point) 
need not be closed, even though elementary argu- 
ments correctly show that the cut locus of any point 
(i.e., the set of all cut points along all geodesics 
issuing from the given point) is always closed. The 
timelike first conjugate locus of a point in a 
spacetime will generally not be closed, but because 
a nonspacelike geodesic in a globally hyperbolic 
spacetime must escape from any compact subset in 
finite affine parameter, the future (or past) first 
nonspacelike conjugate locus of any point in such a 
spacetime is a closed subset. In a result analogous to 
the Riemannian characterization, nonspacelike cut 
points in globally hyperbolic spacetimes may be 
characterized as follows: let $q = \gamma(t_0)$ be the future
cut point of $p = \gamma(0)$ along the timelike (resp., null)
geodesic segment $\gamma$ from p to q. Then either one or
both of the following conditions hold: (1) q is the
first future conjugate point to p along $\gamma$, or (2) there
exist at least two maximal timelike (resp., null)
geodesic segments from p to q.

Now given p in an arbitrary spacetime (M,g), the 
future timelike (resp., null) cut locus of p is defined 
to be the set of all timelike (resp., null) cut points 
along all future timelike (resp., null) geodesics 
issuing from p and the future nonspacelike cut 
locus of p is defined as the union of the future 
timelike and null cut loci. Employing alternatives 
(1) and (2) in the preceding paragraph, it may be
shown for globally hyperbolic spacetimes that the 
null and nonspacelike cut loci are closed subsets 
of M. 

The null cut locus has a privileged status 
by virtue of a phenomenon not encountered for
Riemannian manifolds. Under a conformal change
of background spacetime metric, null geodesics
remain null pregeodesics (i.e., may be reparame- 
trized to be null geodesics in the deformed Lorentz 
metric) while such deformations fail to preserve 
timelike or spacelike geodesics, or to preserve 
geodesics in the Riemannian case. Even though 
null conjugate points along a null geodesic will not 
remain invariant under conformal change of space- 
time metric, it is remarkable that elementary 
arguments involving the spacetime distance func- 
tion show that global conformal diffeomorphisms 
do preserve null cut points and hence the null cut 
locus of any point. 


Geodesic Incompleteness and the 
Lorentzian Splitting Theorem 


In global Riemannian geometry, an important concept
is that of a geodesic ray. In a complete Riemannian
manifold $(N, g_0)$, a unit geodesic $c: [0, +\infty) \to (N, g_0)$
is said to be a (geodesic) ray if
$d_0(c(0), c(t)) = t$ for all $t \geq 0$. By the triangle inequality, c(t) is
minimal between every pair of its points. By making a
limit construction, it may be shown that for each p in
N, there exists a geodesic ray c(t) with $c(0) = p$. An
allied concept is that of a (geodesic) line
$c: \mathbb{R} \to (N, g_0)$; here $d_0(c(t), c(s)) = |t - s|$ for all t, s is required,
that is, c is minimal between every pair of its points.
The existence of a line is much stronger than the
existence of a ray. If $(N, g_0)$ has positive Ricci
curvature everywhere, then $(N, g_0)$ contains no lines
despite the fact that it contains a ray issuing from
each point. A helpful tool in this setting is the
compactness of sets of tangent vectors of the form


$$\{w \in T_pN;\ g_0(w, w) = 1\}$$    [8]


for any p in N; hence, any infinite sequence of unit
tangent vectors based at p automatically has a
convergent subsequence.

For spacetimes, geodesic completeness cannot
generally be assumed. Yet a future nonspacelike
geodesic ray $\gamma: [0, b) \to (M, g)$ may be defined to be
a future-directed, future-inextendible nonspacelike
geodesic with $d(\gamma(0), \gamma(t)) = L(\gamma|_{[0,t]})$ for all t in
[0, b). The reverse triangle inequality implies that $\gamma$
is maximal between any pair of its points. Similarly,
a nonspacelike geodesic line $\gamma: (a, b) \to (M, g)$ is a
past- and future-inextendible nonspacelike geodesic
with $d(\gamma(t), \gamma(s)) = L(\gamma|_{[t,s]})$ for all $t \leq s$. Hence, $\gamma$ is
maximal between any pair of its points. If nonspacelike
geodesic completeness is assumed, $a = -\infty$ and
$b = +\infty$ above. Constructions here are more delicate
than in the Riemannian case because the sets


$$\{v \in T_pM;\ g(v, v) = -1\}$$    [9]


of unit timelike tangent vectors, while closed in the
tangent space, are noncompact. Despite this technicality,
using the limit curve machinery of general
relativity in place of the compactness in [8], it has
been shown that a strongly causal spacetime admits
a past and future nonspacelike geodesic ray issuing
from every point (cf. Beem et al. (1996, chapter 8)).
(If the spacetime is not nonspacelike geodesically
complete, these rays will not necessarily be past or
future complete.) As in the Riemannian case, the
existence of a complete line is a stronger geometric
condition. For that reason, in 1977 Beem and
Ehrlich introduced the concept of a spacetime
causally disconnected by a compact set K and
showed that a strongly causal spacetime which is
causally disconnected by a compact set contains a 
nonspacelike geodesic line which intersects the 
compact set. (Again, unless the spacetime is non- 
spacelike geodesically complete, this line need not be 
future or past complete.) 
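
The reverse triangle inequality invoked above can be checked numerically in the simplest model. The following is a minimal sketch, assuming two-dimensional Minkowski space with metric $-dt^2 + dx^2$ (an illustrative choice, not a setting singled out in the text); the helper names `lorentz_distance`, `random_future_point`, and `check_reverse_triangle` are hypothetical.

```python
import math
import random

def lorentz_distance(p, q):
    """Lorentzian distance from p to q in 2D Minkowski space (-dt^2 + dx^2).

    Returns the proper time of the straight segment when q lies in the
    causal future of p, and 0 otherwise.
    """
    dt, dx = q[0] - p[0], q[1] - p[1]
    if dt >= abs(dx):
        return math.sqrt(dt * dt - dx * dx)
    return 0.0

def random_future_point(p, rng):
    """Pick a point in the timelike future of p (illustrative sampler)."""
    dt = rng.uniform(0.1, 2.0)
    dx = rng.uniform(-0.9, 0.9) * dt   # |dx| < dt keeps the step timelike
    return (p[0] + dt, p[1] + dx)

def check_reverse_triangle(trials=1000, seed=0):
    """Test d(p, r) >= d(p, q) + d(q, r) on random timelike chains p << q << r."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = (0.0, 0.0)
        q = random_future_point(p, rng)
        r = random_future_point(q, rng)
        lhs = lorentz_distance(p, r)
        rhs = lorentz_distance(p, q) + lorentz_distance(q, r)
        if lhs < rhs - 1e-12:
            return False
    return True
```

In this flat model the inequality is an equality only when the three points are collinear, which is why a straight timelike segment maximizes the distance between its endpoints.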

A pattern common to many results in global 
Riemannian geometry especially since the 1950s is 
the following: the existence of a complete Riemannian 
metric on a smooth manifold which also satisfies a 
global curvature inequality implies a topological or 
geometric conclusion. A celebrated early example 
from the 1950s and 1960s, obtained by separate 
results of Rauch, Berger, and Klingenberg, is the 
topological sphere theorem. 


Topological Sphere Theorem Suppose $(N, g_0)$ is a
complete, simply connected Riemannian n-manifold
whose sectional curvatures satisfy $1/4 < K \le 1$.
Then $N$ is homeomorphic to $S^n$.


By contrast, for spacetimes, the assumption of 
geodesic completeness is generally unwarranted. 
Here is an example of one of the celebrated 
singularity theorems of general relativity, published 
in 1970 as originally stated: 


Hawking-Penrose Singularity Theorem No spacetime
(M, g) of dimension $n \ge 3$ can satisfy all of the
following three requirements together:


(i) (M, g) contains no closed timelike curves;
(ii) Every inextendible nonspacelike geodesic in
(M, g) contains a pair of conjugate points; and
(iii) There exists a future- or past-trapped set S in
(M, g).


This theorem may be reinterpreted more akin to
the Riemannian pattern above as follows: suppose
(M, g) is a chronological spacetime of dimension
$n \ge 3$ which satisfies the timelike convergence
condition ($\mathrm{Ric}(v, v) \ge 0$ for all timelike tangent
vectors $v$) and the generic condition (every inextendible
nonspacelike geodesic contains a point which
has some appropriate nonzero sectional curvature).
If (M, g) contains a future- or past-trapped set, then
(M, g) is nonspacelike geodesically incomplete.
Hence, this result models the pattern: global
curvature inequalities (reflecting the physical
assumptions that gravity is attractive
and that every inextendible nonspacelike geodesic experiences
tidal acceleration) and a further physical or
geometric assumption (the first and third conditions)
imply the existence of an incomplete timelike or
null geodesic.

An influential concept in global Riemannian
geometry formulated during the 1960s and 1970s
is that of curvature rigidity, which first became
widely known through the introduction to the text
Cheeger and Ebin (1975). The above statement of
the “sphere theorem” contains one hypothesis that
the sectional curvature is strictly greater than 1/4.
In curvature rigidity, the hypothesis of strict
inequality is relaxed to include the possibility of
equality as well, and then one tries to show that
either the old conclusion is still valid, or if it fails, it
fails in an isometric (hence “rigid”) manner. Thus
in the example of the sphere theorem, if the
sectional curvature is now allowed to satisfy $1/4 \le K \le 1$,
then either the given Riemannian manifold
remains homeomorphic to the n-sphere, or if not, it
is isometric to a Riemannian symmetric space of
rank 1.

Already in an article in 1970, Geroch had
expressed the opinion that most spacetimes should 
be nonspacelike geodesically incomplete and also 
that a spacetime should fail to be nonspacelike 
geodesically incomplete only under special circum- 
stances. Apparently by the early 1980s, S T Yau had 
formulated the idea that timelike geodesic incom- 
pleteness of spacetimes ought to display a curvature 
rigidity. In the paragraph following the statement of 
the Hawking-Penrose singularity theorem, there are 
two curvature conditions mentioned — the timelike 
convergence condition and the generic condition. 
Now the timelike convergence condition already 
allows for the case of equality (i.e., zero timelike 
Ricci curvature) in its formulation; hence, curvature 
rigidity here would imply dropping the generic 
condition that each inextendible nonspacelike geo- 
desic contains a point of nonzero sectional curva- 
tures as a hypothesis. This notion seems first to have 
been published by Yau’s Ph.D. student R Bartnik in 
1988 as follows: 


Conjecture Let (M, g) be a spacetime of dimension
$\ge 3$ which


(i) contains a compact Cauchy surface and
(ii) satisfies the timelike convergence condition
$\mathrm{Ric}(v, v) \ge 0$ for all timelike $v$.


Then either (M, g) is timelike geodesically incomplete,
or (M, g) splits isometrically as a product
$(\mathbb{R} \times V, -dt^2 + h)$ where $(V, h)$ is a compact
Riemannian manifold.


This conjecture has been proven in many cases 
with the following proof scheme. From the physical 
or geometric assumptions made, produce an 
inextendible nonspacelike geodesic line. Further, 
prove that the line happens to be timelike rather 
than null. Then if the spacetime were timelike 
geodesically complete, it would contain a complete 
timelike line. But then the desired splitting may be
obtained using the Lorentzian splitting theorem. 


Lorentzian Splitting Theorem Let (M, g) be a
spacetime of dimension $\ge 3$ which satisfies each of
the following conditions:


(i) (M, g) is either globally hyperbolic or timelike
geodesically complete;
(ii) (M, g) satisfies the timelike convergence condition;
and
(iii) (M, g) contains a complete timelike line.


Then (M, g) splits isometrically as a product
$(\mathbb{R} \times V, -dt^2 + h)$ where $(V, h)$ is a complete Riemannian
manifold.


This result, which corresponds to obtaining the 
spacetime analog of a celebrated splitting theorem of 
Cheeger and Gromoll for lines in complete Riemannian 
manifolds of non-negative Ricci curvature, published 
in 1971, was posed as a problem by S T Yau in a 
problem list stemming from the conference 
Special Year in Differential Geometry held at the 
Institute for Advanced Study in Princeton during the 
1979-80 academic year. Early progress was made 
using maximal hypersurface methods by Gerhardt in 
1983, Bartnik in 1984, and Galloway in 1984. Then 
in 1985, Beem, Ehrlich, Markvorsen, and Galloway 
introduced the methodology of employing the 
Busemann function of the complete timelike line, 
motivated by techniques from Riemannian geome- 
try, and succeeded in obtaining a splitting under the 
hypothesis of global hyperbolicity and everywhere 
nonpositive timelike sectional curvatures. In separate 
publications, Eschenburg and Galloway extended 
the result to the desired curvature hypothesis of 
nonnegative timelike Ricci curvatures. Finally, 
Newman in 1990 achieved the originally desired 
goal of obtaining the splitting under the assumption 
of timelike geodesic completeness, rather than global 
hyperbolicity. This is a more delicate setting, since 
timelike geodesic completeness does not imply 
maximal nonspacelike geodesic connectability, a 
fairly basic geometric tool in many standard 
constructions. But the idea emerged with 
Newman's solution that the existence of a timelike 
geodesic line or segment in a nonglobally hyper- 
bolic spacetime implies an adequate level of control 
in a tubular neighborhood of the given line to 
enable the proof to work. Galloway and Horta in 
1996 published a much simplified working out of 
these concepts. A fuller exposition of these devel- 
opments may be found in Beem et al. (1996, 
chapter 14). In addition, in 2000, Galloway 
published a version of the splitting theorem for a 
null maximal geodesic line. 


Two-Dimensional Spacetimes 


Two-dimensional spacetimes, sometimes termed
Lorentz surfaces, are especially tractable because
given (M, g) with dim M = 2, then (M, -g) is also a
spacetime. Hence, it may be shown that any
Lorentzian 2-manifold (M, g) homeomorphic to $\mathbb{R}^2$
may be made geodesically complete (not just 
nonspacelike geodesically complete) by a conformal 
change of metric. Also, any simply connected two- 
dimensional Lorentzian manifold is strongly causal. 
In Weinstein (1996), an extensive study is made of 
Lorentz surfaces generally and particularly, of a 
conformal boundary for such surfaces first given by 
Kulkarni in 1985. 

One of the prettiest classical results linking the
geometry and topology of a Riemannian surface is
the Gauss-Bonnet theorem. Let $(N, g_0)$ be a
Riemannian manifold of dimension 2 and let $P$ be a
polygonal subregion with piecewise smooth bounding
curves $c_i$, $1 \le i \le k$. Let $K$ denote the Gauss
curvature of $(N, g_0)$ and $\kappa$ the geodesic curvature of
the smooth curves $c_i$ (which vanishes if $c_i$ happens to
be a geodesic). If $\theta_i$ denote the corresponding
interior angles between the successive boundary
curves $c_i$ and $c_{i+1}$, then the Gauss-Bonnet formula
over $P$ is


$\iint_P K \, dA + \int_{\partial P} \kappa \, ds + \sum_{i=1}^{k} (\pi - \theta_i) = 2\pi$ [10]


By considering a triangulation of $N$ itself and
summing up the corresponding terms in [10], it
follows for a compact oriented Riemannian manifold
$(N, g_0)$ of dimension 2 that


$\iint_N K \, dA = 2\pi \chi(N)$ [11]


where $\chi(N)$ denotes the Euler characteristic. Also
lurking in the background here is a formula for
computing the angle $\theta$ between unit tangent vectors $v$,
$w$ as


$\cos\theta = g_0(v, w)$ [12]
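
As a sanity check of the polygonal Gauss-Bonnet formula [10], one may evaluate its left-hand side on two regions where every term is known in closed form. The sketch below assumes the standard form of [10] for a disk-like region, namely curvature integral plus boundary geodesic curvature plus the sum of exterior angles $\pi - \theta_i$ equals $2\pi$; the function name `gauss_bonnet_lhs` is hypothetical.

```python
import math

def gauss_bonnet_lhs(curvature, area, geodesic_curvature_integral, interior_angles):
    """Left-hand side of the polygonal Gauss-Bonnet formula for constant K.

    curvature * area stands in for the integral of K over the region, and the
    corner term sums the exterior angles (pi - theta_i).
    """
    corner_term = sum(math.pi - theta for theta in interior_angles)
    return curvature * area + geodesic_curvature_integral + corner_term

# Octant of the unit sphere: K = 1, area = 4*pi/8, geodesic sides, three right angles.
octant = gauss_bonnet_lhs(1.0, 4 * math.pi / 8, 0.0, [math.pi / 2] * 3)

# Flat unit square: K = 0, straight sides, four right angles.
square = gauss_bonnet_lhs(0.0, 1.0, 0.0, [math.pi / 2] * 4)
```

Both values equal $2\pi$ up to rounding: on the sphere the curvature term makes up for the missing corner, while in the flat case the corners alone account for the full turn.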


In the spacetime setting, different versions of a
Gauss-Bonnet formula for subregions of a two-dimensional
spacetime (M, g) corresponding to [10]
have been given in 1974 by Helzer and in 1984 by
Birman and Nomizu. First, the angle computation is
a bit trickier for spacetimes than in the Riemannian
case; eqn [12] has to be replaced by techniques
which use the hyperbolic functions $\cosh u$ and
$\sinh u$ to define the angle $u$ (sometimes called the
"hyperbolic angle") between two unit vectors and
then to allow for null vectors. Birman and Nomizu
obtained an analog of [10] assuming that the
boundary curves for $P$ are successive smooth unit
timelike curves:


$\int_{\partial P} \kappa \, ds - \iint_P K \, dA + \sum_i \theta_i = 0$


Helzer in his formulation allows the different 
boundary curves to be either unit timelike, unit 
spacelike or null separately. Since the only compact, 
orientable smooth surface which admits a spacetime 
metric is the 2-torus, which has zero Euler char- 
acteristic, the Riemannian formula [11] above 
translates into the uniform constraint on the Gauss 
curvature of the spacetime: 


$\iint_M K \, dA = 0$


See also: General Relativity: Overview; Geometric 
Analysis and General Relativity, Pseudo-Riemannian 
Nilpotent Lie Groups; Spacetime Topology, Causal 
Structure and Singularities. 
Lyapunov Exponents


The Lyapunov exponents of a sequence $\{A^n, n \ge 1\}$
of square matrices of dimension $d \ge 1$ are the values
of


$\lambda(v) = \limsup_{n \to \infty} \frac{1}{n} \log \|A^n \cdot v\|$ [1]


over all nonzero vectors $v \in \mathbb{R}^d$. For completeness,
set $\lambda(0) = -\infty$. It is easy to see that $\lambda(cv) = \lambda(v)$ and
$\lambda(v + v') \le \max\{\lambda(v), \lambda(v')\}$ for any nonzero scalar $c$
and any vectors $v$, $v'$. It follows that, given any
constant $a$, the set of vectors satisfying $\lambda(v) \le a$ is a
vector subspace. Consequently, there are at most $d$
Lyapunov exponents, henceforth denoted by
$\lambda_1 < \cdots < \lambda_{k-1} < \lambda_k$, and there exists a filtration
$F_1 \subset \cdots \subset F_{k-1} \subset F_k = \mathbb{R}^d$ into vector subspaces,
such that


$\lambda(v) = \lambda_i$ for all $v \in F_i \setminus F_{i-1}$


and every $i = 1, \dots, k$ (write $F_0 = \{0\}$). In particular,
the largest exponent is


$\lambda_k = \limsup_{n \to \infty} \frac{1}{n} \log \|A^n\|$ [2]


One calls $\dim F_i - \dim F_{i-1}$ the multiplicity of each
Lyapunov exponent $\lambda_i$.
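
The dependence of the growth rate in [1] on the filtration can be seen in the simplest case. The following sketch assumes the sequence consists of the powers of the fixed matrix $\mathrm{diag}(2, 1/2)$ (an illustrative choice); the name `growth_rate` is hypothetical. Here the exponents are $\pm\log 2$ and the smaller one is attained only on the line spanned by $(0, 1)$.

```python
import math

def growth_rate(diag, v, n):
    """(1/n) * log ||A^n v|| for the sequence A^n = diag(a, b)^n."""
    a, b = diag
    x, y = (a ** n) * v[0], (b ** n) * v[1]
    return math.log(math.hypot(x, y)) / n

n = 200
diag = (2.0, 0.5)
rate_generic = growth_rate(diag, (1.0, 1.0), n)   # v outside the smaller subspace
rate_special = growth_rate(diag, (0.0, 1.0), n)   # v inside span{(0, 1)}
```

For a generic vector the expanding coordinate dominates, so `rate_generic` is close to $\log 2$, whereas `rate_special` is close to $-\log 2$; both exponents have multiplicity 1.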

There are corresponding notions for continuous
families of matrices $A^t$, $t \in (0, \infty)$, taking the limit as
$t$ goes to $\infty$ in the relations [1] and [2]. The theories
for the two types of families, discrete and contin- 
uous, are analogous and so at each point in what 
follows we refer to either one or the other. 
Lyapunov Stability


Consider the linear differential equation


$\dot{v}(t) = B(t) \cdot v(t)$ [3]


where $B(t)$ is a bounded function with values in the
space of $d \times d$ matrices, defined for all $t \in \mathbb{R}$. The
theory of differential equations ensures that there
exists a fundamental matrix $A^t$, $t \in \mathbb{R}$, such that


$v(t) = A^t \cdot v_0$


is the unique solution of [3] with initial condition
$v(0) = v_0$.

If the Lyapunov exponents of the family $A^t$, $t \ge 0$,
are all negative then the trivial solution v(t) = 0 is 
asymptotically stable, and even exponentially stable. 
The stability theorem of Lyapunov asserts that, 
under an additional regularity condition, stability is 
still valid for nonlinear perturbations 


$\dot{w}(t) = B(t) \cdot w + F(t, w)$ [4]


with $\|F(t, w)\| \le \text{const} \, \|w\|^{1+c}$, $c > 0$. That is, the
trivial solution $w(t) \equiv 0$ is still exponentially asymptotically
stable.
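
A one-dimensional instance of the stability statement can be tested numerically. The sketch below assumes the scalar case $B(t) \equiv -1$ with perturbation $F(t, w) = w^2$, so that the hypothesis holds with $c = 1$ and the Lyapunov exponent of the linear part is $-1$; this example and the helper names `rhs`, `rk4_solve`, and `decay_rate` are illustrative choices, not taken from the text.

```python
import math

def rhs(w):
    """Right-hand side of w' = B w + F(w) with B = -1 and F(w) = w**2."""
    return -w + w * w

def rk4_solve(w0, t_end, dt=0.01):
    """Integrate w' = rhs(w) from time 0 to t_end with the classical RK4 scheme."""
    steps = int(round(t_end / dt))
    w = w0
    for _ in range(steps):
        k1 = rhs(w)
        k2 = rhs(w + 0.5 * dt * k1)
        k3 = rhs(w + 0.5 * dt * k2)
        k4 = rhs(w + dt * k3)
        w += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return w

def decay_rate(w0, t1=10.0, t2=20.0):
    """Estimate the exponential rate of the solution from its values at two times."""
    w1, w2 = rk4_solve(w0, t1), rk4_solve(w0, t2)
    return (math.log(abs(w2)) - math.log(abs(w1))) / (t2 - t1)
```

For a small initial condition such as `decay_rate(0.1)` the estimated rate is very close to $-1$, the exponent of the unperturbed linear equation, as the theorem predicts for perturbations of higher order.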

The regularity condition means, essentially, that
the limit in [1] does exist, even if one replaces
vectors $v$ by elements $v_1 \wedge \cdots \wedge v_l$ of any $l$th exterior
power of $\mathbb{R}^d$, $1 \le l \le d$. By definition, the norm of an
$l$-vector $v_1 \wedge \cdots \wedge v_l$ is the volume of the parallelepiped
determined by the vectors $v_1, \dots, v_l$. This
condition is usually tricky to check in specific
situations. However, the multiplicative ergodic
theorem of V I Oseledets asserts that, for very
general matrix-valued stationary random processes, 
regularity is an almost sure property. This result sets 
the foundation for the modern theory of Lyapunov 
exponents. We are going to discuss the precise 
statement of the theorem in the slightly broader setting 
of linear cocycles, or vector bundle morphisms. 


Linear Cocycles 


Let $\mu$ be a probability measure on some space $M$ and
$f : M \to M$ be a measurable transformation that
preserves $\mu$. Let $\pi : \mathcal{E} \to M$ be a finite-dimensional
vector bundle, endowed with a Riemannian metric
$\|\cdot\|_x$ on each fiber $\mathcal{E}_x = \pi^{-1}(x)$. Let $\mathcal{A} : \mathcal{E} \to \mathcal{E}$ be a
linear cocycle over $f$. What we mean by this is that


$\pi \circ \mathcal{A} = f \circ \pi$


and the action $A(x) : \mathcal{E}_x \to \mathcal{E}_{f(x)}$ of $\mathcal{A}$ on each fiber is
a linear isomorphism. Notice that the action of the
$n$th iterate $\mathcal{A}^n$ is given by


$A^n(x) = A(f^{n-1}(x)) \cdots A(f(x)) \cdot A(x)$


for every $n \ge 1$.
Assume the function $\log^+ \|A(x)\|_x$ is $\mu$-integrable:


$\log^+ \|A(x)\|_x \in L^1(\mu)$ [5]


(we write $\log^+ \phi = \log \max\{\phi, 1\}$, for any $\phi > 0$).
It is clear that the sequence of functions
$a_n(x) = \log \|A^n(x)\|_x$ satisfies


$a_{m+n}(x) \le a_m(x) + a_n(f^m(x))$


for every $m$, $n$, and $x$. It follows from J Kingman’s
subadditive ergodic theorem that the limit


$\lim_{n \to \infty} \frac{1}{n} a_n(x)$


exists for $\mu$-almost all $x$. In view of [2], this means
that the largest Lyapunov exponent $\lambda(x)$ of the
sequence $A^n(x)$, $n \ge 1$, is a limit, and not just a lim
sup, at almost every point.
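
The subadditivity property that feeds into the subadditive ergodic theorem can be verified directly on a concrete cocycle. The sketch below assumes a trivial bundle $M \times \mathbb{R}^2$ over the circle rotation by the golden mean, with a matrix function chosen purely for illustration; all names (`ALPHA`, `f`, `A`, `op_norm`, `cocycle_product`, `a`, `iterate`, `subadditive_holds`) are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # rotation number, chosen for illustration

def f(x):
    """Base dynamics: rotation of the circle R/Z by ALPHA."""
    return (x + ALPHA) % 1.0

def A(x):
    """An invertible 2x2 matrix depending on the base point (determinant 1)."""
    return ((2 * math.cos(2 * math.pi * x), -1.0), (1.0, 0.0))

def matmul(P, Q):
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(M):
    """Largest singular value of a 2x2 matrix."""
    t = sum(M[i][j] ** 2 for i in range(2) for j in range(2))
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = max(t * t - 4 * det * det, 0.0)
    return math.sqrt((t + math.sqrt(disc)) / 2)

def cocycle_product(x, n):
    """A^n(x) = A(f^{n-1}(x)) ... A(f(x)) A(x), for n >= 1."""
    M = A(x)
    for _ in range(n - 1):
        x = f(x)
        M = matmul(A(x), M)
    return M

def a(x, n):
    """a_n(x) = log ||A^n(x)||."""
    return math.log(op_norm(cocycle_product(x, n)))

def iterate(x, m):
    for _ in range(m):
        x = f(x)
    return x

def subadditive_holds(x, m, n, tol=1e-9):
    """Check a_{m+n}(x) <= a_m(x) + a_n(f^m(x)) up to rounding."""
    return a(x, m + n) <= a(x, m) + a(iterate(x, m), n) + tol
```

The inequality is just submultiplicativity of the operator norm applied to the factorization $A^{m+n}(x) = A^n(f^m(x)) \cdot A^m(x)$.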


Multiplicative Ergodic Theorem 


The Oseledets theorem states that the same holds
for all Lyapunov exponents. Namely, for $\mu$-almost
every $x \in M$ there exists $k = k(x) \in \{1, \dots, d\}$, a
filtration


$F_x^1 \subset \cdots \subset F_x^{k-1} \subset F_x^k = \mathcal{E}_x$


and numbers $\lambda_1(x) < \cdots < \lambda_k(x)$ such that


$\lim_{n \to \infty} \frac{1}{n} \log \|A^n(x) \cdot v\|_{f^n(x)} = \lambda_i(x)$ [6]


for all $v \in F_x^i \setminus F_x^{i-1}$ and $i \in \{1, \dots, k\}$.

The Lyapunov exponents $\lambda_i(x)$, and their number
$k(x)$, are measurable functions of $x$ and they are
constant on orbits of the transformation $f$. In
particular, if the measure $\mu$ is ergodic then $k$ and
the $\lambda_i$ are constant on a full $\mu$-measure set of
points. The subspaces $F_x^i$ also depend measurably
on the point $x$ and are invariant under the linear
cocycle:


$A(x) \cdot F_x^i = F_{f(x)}^i$


It is in the nature of things that, usually, these 
objects are not defined everywhere and they depend 
discontinuously on the base point x. 

When the transformation $f$ is invertible, one
obtains a stronger conclusion, by applying the
previous kind of result also to the inverse of the
cocycle. Namely, assuming that $\log^+ \|A^{-1}\|$ is also
in $L^1(\mu)$ one gets that there exists a
decomposition


$\mathcal{E}_x = E_x^1 \oplus \cdots \oplus E_x^k$


defined at almost every point and such that
$A(x) \cdot E_x^i = E_{f(x)}^i$ and


$\lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x) \cdot v\|_{f^n(x)} = \lambda_i(x)$ [7]


for all $v \in E_x^i$ different from zero and all $i \in \{1, \dots, k\}$.
These Oseledets subspaces $E_x^i$ are related
to the subspaces $F_x^i$ through


$F_x^j = \bigoplus_{i=1}^{j} E_x^i$


Hence, $\dim E_x^i = \dim F_x^i - \dim F_x^{i-1}$ is the multiplicity
of the Lyapunov exponent $\lambda_i(x)$.

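The angles between any two Oseledets subspaces
decay at most subexponentially along orbits of $f$:


$\lim_{n \to \pm\infty} \frac{1}{n} \log \mathrm{angle}\left(E^i_{f^n(x)}, E^j_{f^n(x)}\right) = 0$


for every $i \ne j$ and almost every point. These facts
imply the regularity condition mentioned previously
and, in particular,


$\lim_{n \to \pm\infty} \frac{1}{n} \log |\det A^n(x)| = \sum_{i=1}^{k} \lambda_i(x) \dim E_x^i$ [8]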


Consequently, for cocycles with values in SL(d, R), 
the sum of all Lyapunov exponents, counted with 
multiplicity, is identically zero. 
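
The vanishing of the sum of the exponents for $SL(d, \mathbb{R})$ cocycles can be observed numerically. The following sketch assumes a random product of matrices of the form rotation times $\mathrm{diag}(2, 1/2)$ (an illustrative choice) and estimates both exponents by repeated Gram-Schmidt orthonormalization, a standard numerical technique that is not described in the text; the names `sample_matrix`, `apply`, and `lyapunov_spectrum` are hypothetical.

```python
import math
import random

def sample_matrix(rng):
    """Random element of SL(2, R): a rotation composed with diag(2, 1/2)."""
    theta = rng.uniform(0.0, 2 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    return ((2 * c, -0.5 * s), (2 * s, 0.5 * c))

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def lyapunov_spectrum(n=20000, seed=1):
    """Estimate both exponents by re-orthonormalizing a moving frame at each step."""
    rng = random.Random(seed)
    q1, q2 = (1.0, 0.0), (0.0, 1.0)
    log_r1 = log_r2 = 0.0
    for _ in range(n):
        M = sample_matrix(rng)
        w1, w2 = apply(M, q1), apply(M, q2)
        r1 = math.hypot(*w1)
        q1 = (w1[0] / r1, w1[1] / r1)
        proj = w2[0] * q1[0] + w2[1] * q1[1]
        w2 = (w2[0] - proj * q1[0], w2[1] - proj * q1[1])
        r2 = math.hypot(*w2)                 # stretch orthogonal to q1
        q2 = (w2[0] / r2, w2[1] / r2)
        log_r1 += math.log(r1)
        log_r2 += math.log(r2)
    return log_r1 / n, log_r2 / n
```

At each step the product of the two stretch factors equals $|\det A| = 1$, so the two estimates add up to zero up to rounding, in agreement with relation [8].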

As we are dealing with almost certain properties,
we may generally restrict the vector bundle to some
full measure subset over which it is trivial. Then
each fiber $\mathcal{E}_x$ is identified with the space $\mathbb{R}^d$, and we
may think of $A(x)$ as a $d \times d$ matrix. Then
$A_n(x) = A(f^n(x))$ is a stationary random process
relative to $(f, \mu)$. Thus, in this context it is no
serious restriction to view a linear cocycle as a
stationary random process with values in the linear
group $GL(d, \mathbb{R})$ of invertible $d \times d$ matrices.

Furthermore, given any such random process
$A_n$, $n \ge 0$, one may consider its normalization
$B_n = A_n / |\det A_n|^{1/d}$. The Lyapunov exponents of the
two random processes $A_n$, $n \ge 0$, and $B_n$, $n \ge 0$, differ
by the time average


$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} \frac{1}{d} \log |\det A_j(x)|$


of the determinant. The Birkhoff ergodic theorem
ensures that the time average is well defined almost
everywhere, as long as the function $\log |\det A|$ is in
$L^1(\mu)$; this is the case, for instance, if both
$\log^+ \|A^{\pm 1}\|$ are integrable. This relates the general
case to random processes with values in the special
linear group $SL(d, \mathbb{R})$ of $d \times d$ matrices with
determinant $\pm 1$.


Lyapunov Exponents and Strange Attractors 351 


The Oseledets theorem was extended by D Ruelle
to certain linear cocycles in infinite dimensions. He
assumes that the $A(x)$ are compact operators on a
Hilbert space $H$ and $\log^+ \|A\|$ is in $L^1(\mu)$. The
conclusion is the same as in finite dimensions,
except that the filtration


$F_x^1 \subset \cdots \subset F_x^i \subset \cdots \subset H$


may involve infinitely many subspaces, and the
Lyapunov exponents may be $-\infty$. There is also a
version for cocycles over invertible transformations,
where one assumes each $A(x)$ to be invertible
and the sum of a unitary operator with a compact
operator, such that both $\log^+ \|A^{\pm 1}\|$ are integrable.
The conclusion is that there exists an Oseledets
decomposition $H = E_x^1 \oplus \cdots \oplus E_x^i \oplus \cdots$ at almost
every point, with finitely or countably many
factors.


Random Matrices 


Relation [8] implies that, for SL(d, R) cocycles, if 
there is only one Lyapunov exponent (with full 
multiplicity) then it must be zero. When this 
happens, the theory contains no information on the 
behavior of the iterates A"(x) - v, apart from the fact 
that there is no exponential growth nor decay of 
their norms. Thus, the question naturally arises 
under which conditions is there more than one 
Lyapunov exponent or, equivalently, under which 
conditions is the largest Lyapunov exponent strictly 
positive. 

This problem was first addressed by H Furstenberg
for products of independent random variables,
corresponding to the following class of linear
cocycles. Let $\nu$ be a probability measure on the
group $G = GL(d, \mathbb{R})$. Take $M = G^{\mathbb{N}}$ and $\mu = \nu^{\mathbb{N}}$
(or $M = G^{\mathbb{Z}}$ and $\mu = \nu^{\mathbb{Z}}$), and let $f : M \to M$ be the
shift map


$f\big((\alpha_j)_j\big) = (\alpha_{j+1})_j$


It is clear that $\mu$ is invariant and also ergodic for the
transformation $f$. Consider the cocycle $\mathcal{A} : \mathcal{E} \to \mathcal{E}$
defined by $\mathcal{E} = M \times \mathbb{R}^d$ and


$A\big((\alpha_j)_j\big) \cdot v = \alpha_0 \cdot v$


Clearly,


$A^n\big((\alpha_j)_j\big) \cdot v = \alpha_{n-1} \cdots \alpha_1 \alpha_0 \cdot v$


Corresponding to the hypothesis of the multiplicative
ergodic theorem, assume that $\log^+ \|\alpha\|$ (and
$\log^+ \|\alpha^{-1}\|$) are $\nu$-integrable functions of the matrix $\alpha$.

Furstenberg’s theorem states that if the closed
group $G(\nu)$ generated by the support of $\nu$ is
noncompact and strongly irreducible in $\mathbb{R}^d$ then
the largest Lyapunov exponent of the cocycle $\mathcal{A}$
is strictly positive. Strong irreducibility means
that there exists no finite union of proper nonzero subspaces of
$\mathbb{R}^d$ that is invariant under all elements of the
group. Improvements, extensions, and alternative
proofs have been obtained by several authors since
then.
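
Positivity of the largest exponent is easy to observe in a simulation. The sketch below assumes that $\nu$ gives weight 1/2 to each of the two unipotent matrices listed in `GENERATORS`, which generate $SL(2, \mathbb{Z})$, a noncompact and strongly irreducible group; this choice and the name `top_exponent` are illustrative and not taken from the text.

```python
import math
import random

# Two generators of SL(2, Z), each chosen with probability 1/2 (illustrative).
GENERATORS = (
    ((1.0, 1.0), (0.0, 1.0)),
    ((1.0, 0.0), (1.0, 1.0)),
)

def top_exponent(n=20000, seed=0):
    """Estimate (1/n) log ||a_{n-1} ... a_0 v|| with renormalization at each step."""
    rng = random.Random(seed)
    v = (1.0, 0.0)
    total = 0.0
    for _ in range(n):
        M = rng.choice(GENERATORS)
        v = (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])
        norm = math.hypot(*v)
        total += math.log(norm)      # accumulate the log of the stretch factor
        v = (v[0] / norm, v[1] / norm)
    return total / n
```

Renormalizing the vector at every step avoids overflow while leaving the accumulated logarithm unchanged. The estimate returned is strictly positive, as the theorem predicts, even though each individual generator has all eigenvalues equal to 1.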

Especially, Y Guivarc'h and A Raugi provided
conditions under which there are exactly $d$ distinct
Lyapunov exponents or, in other words, the
multiplicity of every Lyapunov exponent is equal
to 1. A matrix semigroup has the contraction
property if there exists a sequence of elements $h_n$
and a probability measure $m$ on the projective space
of $\mathbb{R}^d$ that gives zero weight to any proper projective
subspace, such that the images $(h_n)_* m$ of $m$ under
the $h_n$ converge to a Dirac mass in the projective
space. They proved that if the closed semigroup
$H(\nu)$ generated by the support of the probability $\nu$
is strongly irreducible and has the contraction
property then the largest Lyapunov exponent has
multiplicity 1. Applying this to the exterior
powers of the cocycle, one obtains sufficient
conditions for simplicity of the other Lyapunov
exponents as well.

This statement has been improved by I Ya
Gol'dsheid and G A Margulis, who formulated the
hypotheses in terms of the algebraic closure $G(\nu)$ of
the semigroup $H(\nu)$. They assumed that $G(\nu)$ has the
contraction property and the connected component
of the identity inside $G(\nu)$ is irreducible in $\mathbb{R}^d$,
meaning that its elements do not have any common
nontrivial invariant subspace. Then the largest Lyapunov
exponent is simple.


Schrödinger Cocycles


The one-dimensional discrete Schrödinger equation
is the second-order difference equation


$-(u_{n+1} + u_{n-1}) + V_n u_n = E u_n$ [9]


derived from the stationary Schrödinger equation in
dimension 1 by space discretization. Here the energy
$E$ is a constant and $V_n = V(f^n(\theta))$, where the
potential $V(\cdot)$ is a bounded scalar function and
$f : M \to M$ is a transformation preserving some
probability measure $\mu$ on $M$. In what follows, we
take $\mu$ to be ergodic. Equation [9] may be rewritten
as a first-order relation,


$\begin{pmatrix} u_{n+1} \\ u_n \end{pmatrix} = \begin{pmatrix} V_n - E & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} u_n \\ u_{n-1} \end{pmatrix}$


Hence, it may also be interpreted as a linear cocycle
$\mathcal{A}$ over $f$, where the vector bundle is $\mathcal{E} = M \times \mathbb{R}^2$ and


$A(\theta) = \begin{pmatrix} V(\theta) - E & -1 \\ 1 & 0 \end{pmatrix}$ [10]


takes values in $SL(2, \mathbb{R})$. By ergodicity, the Lyapunov
exponents are essentially independent of the
base point $\theta$. Let $\lambda(E)$ denote the largest exponent:
by the relation [8], the other one is $-\lambda(E)$.
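
The passage from the second-order equation to the cocycle can be checked mechanically. The sketch below assumes the rearrangement $u_{n+1} = (V_n - E) u_n - u_{n-1}$ of the difference equation and, for the potential, a quasiperiodic example $V(\theta) = 2\cos(2\pi\theta)$ over the rotation by the golden mean; this potential and all function names are illustrative choices.

```python
import math

def transfer_matrix(v_n, energy):
    """Matrix [[V_n - E, -1], [1, 0]] mapping (u_n, u_{n-1}) to (u_{n+1}, u_n)."""
    return ((v_n - energy, -1.0), (1.0, 0.0))

def solve_by_recurrence(potential, energy, u0, u_minus1, steps):
    """Iterate u_{n+1} = (V_n - E) u_n - u_{n-1} directly."""
    prev, cur = u_minus1, u0
    for n in range(steps):
        prev, cur = cur, (potential(n) - energy) * cur - prev
    return cur, prev

def solve_by_cocycle(potential, energy, u0, u_minus1, steps):
    """Apply the transfer matrices one after another to the state (u_n, u_{n-1})."""
    state = (u0, u_minus1)
    for n in range(steps):
        M = transfer_matrix(potential(n), energy)
        state = (M[0][0] * state[0] + M[0][1] * state[1],
                 M[1][0] * state[0] + M[1][1] * state[1])
    return state

def example_potential(n, theta=0.2, alpha=(math.sqrt(5) - 1) / 2):
    """V_n = V(f^n(theta)) for f(theta) = theta + alpha mod 1, V = 2 cos(2 pi theta)."""
    return 2 * math.cos(2 * math.pi * (theta + n * alpha))
```

Both routines return the same pair $(u_n, u_{n-1})$, and each transfer matrix has determinant 1, which is why the cocycle takes values in $SL(2, \mathbb{R})$.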

The Lyapunov exponent $\lambda(E)$ is related to the
spectral theory of the linear operators $L_\theta$,


$(L_\theta u)_n = -(u_{n+1} + u_{n-1}) + V_n u_n$


on the space $\ell^2(\mathbb{Z})$ of complex square-integrable
sequences $u_n$, $n \in \mathbb{Z}$. These are bounded Hermitian
operators and so the spectra are compact subsets of $\mathbb{R}$.
Using the assumption that $\mu$ is ergodic, one can prove
that the spectrum $\mathrm{spec}(L_\theta)$ is constant almost everywhere.
If the transformation $f$ is minimal, the spectrum
is even independent of the point $\theta$. Moreover, for all
energies,


$\lambda(E) \ge \text{const} \cdot \mathrm{dist}(E, \mathrm{spec}(L_\theta))$


In particular, $\lambda(E)$ is always positive on the complement
of the spectrum.

A fundamental problem (Anderson localization) is 
to decide when the spectrum is pure-point. This is 
reasonably well understood for a few classes of base 
dynamics only, for example, the very chaotic systems 
such as Bernoulli and Markov processes (random 
potentials) or uniformly hyperbolic maps and flows, 
or the irrational rotations on the d-dimensional torus 
(quasiperiodic potentials). In the latter case, the
results are more complete when there is only one
frequency ($d = 1$). It was shown by K Ishii and by L
Pastur that if $\lambda(E)$ is positive for almost all values of
$E$ in some Borel set then the absolutely continuous
part of the spectrum is essentially disjoint from that
set. The converse is also true (due to S Kotani). Thus,
checking that $\lambda(E)$ is positive is an important step
towards proving localization.

A very general criterion for positivity of the
Lyapunov exponent was obtained by Kotani. Namely,
he proved that if the potential is not deterministic then
$\lambda(E)$ is positive for almost all $E$. In particular, for
nondeterministic potentials the absolutely continuous
spectrum is empty, almost surely. In simple terms, the
hypothesis means that from the values of the potential
for negative $n$ one cannot determine the values for
positive $n$. More formally, one calls the potential
deterministic if every $V_n$, $n \ge 0$, is almost everywhere a
measurable function of $\{V_n : n < 0\}$. For instance,
quasiperiodic potentials are deterministic, whereas
Bernoulli potentials are not.


Subharmonicity Method 


Let $\mathbb{D}^m$ be the set of complex vectors $(z_1, \dots, z_m) \in \mathbb{C}^m$
such that $|z_j| \le 1$ for all $j$ and let $\mathbb{T}^m$ be the subset
defined by $|z_j| = 1$ for all $j$. Let $f : \mathbb{T}^m \to \mathbb{T}^m$ and
$A : \mathbb{T}^m \to SL(d, \mathbb{R})$ be continuous maps that admit
holomorphic extensions to the interior of $\mathbb{D}^m$ with
$f(0) = 0$. Assume that $f$ preserves the natural (Haar)
measure $\mu$ on $\mathbb{T}^m$. Let


$\lambda(A, \mu) = \int \lambda(z) \, d\mu(z)$


where $\lambda(z)$ denotes the largest Lyapunov exponent
for the cocycle defined by $A$ over $f$. It also follows
from the subadditive ergodic theorem that


$\lambda(A, \mu) = \lim_{n \to \infty} \frac{1}{n} \int_{\mathbb{T}^m} \log \|A^n(z)\| \, d\mu$


M Herman observed that, since the function
$\log \|A^n(z)\|$ is plurisubharmonic on $\mathbb{D}^m$, one may
use the maximum principle to conclude that


$\frac{1}{n} \int_{\mathbb{T}^m} \log \|A^n(z)\| \, d\mu \ge \frac{1}{n} \log \|A^n(0)\|$


Then, since $f(0) = 0$ gives $A^n(0) = A(0)^n$, taking the limit when $n \to \infty$ one obtains that


$\lambda(A, \mu) \ge \log \rho(A)$ [11]


where $\rho(A)$ denotes the spectral radius of the matrix
$A(0)$. Starting from this observation, he developed a
very effective method for bounding Lyapunov
exponents from below, which received several applications
and extensions, in particular, to the theory of
Schrödinger cocycles with quasiperiodic potentials.
The best-known application is the following bound
for integrated Lyapunov exponents of two-dimensional
cocycles. Let $f : M \to M$ be a continuous
transformation on a compact metric space, preserving
some probability measure $\mu$, and $A : M \to SL(2, \mathbb{R})$ be
a continuous map. For each fixed $\theta$, let $AR_\theta$ be the
cocycle obtained by multiplying $A(x)$, at every point
$x$, by the rotation of angle $\theta$. Herman proved that


$\frac{1}{2\pi} \int_0^{2\pi} \lambda(AR_\theta, \mu) \, d\theta \ge \int_M N(x) \, d\mu(x)$


(A Avila and J Bochi later showed that the equality
holds) where


$N(x) = \log \frac{\|A(x)\| + \|A(x)\|^{-1}}{2}$


Apart from the exceptional case when $A$ acts by
rotation at every point in the support of $\mu$, the right-hand
side of the inequality is positive, and so the
Lyapunov exponent of the cocycle $AR_\theta$ is positive
for many values of $\theta$.
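
For a constant cocycle the averaged exponent can be computed directly, which gives a quick numerical check of the bound and of the equality case. The sketch below assumes $A \equiv \mathrm{diag}(a, 1/a)$ with $a \ge 1$, so that $\|A\| = a$ and the largest exponent of $AR_\theta$ is the logarithm of its spectral radius; the function names are hypothetical.

```python
import math

def top_exponent_constant(M):
    """Largest exponent of a constant SL(2, R) cocycle: log of the spectral radius."""
    half_trace = abs(M[0][0] + M[1][1]) / 2
    if half_trace <= 1.0:
        return 0.0                 # elliptic or parabolic case: spectral radius 1
    return math.acosh(half_trace)  # hyperbolic case: eigenvalues exp(+-acosh(tr/2))

def rotated(a, theta):
    """diag(a, 1/a) multiplied by the rotation of angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((a * c, -a * s), (s / a, c / a))

def averaged_exponent(a, samples=200000):
    """Midpoint-rule approximation of the average of the exponent over theta."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples
        total += top_exponent_constant(rotated(a, theta))
    return total / samples

def herman_bound(a):
    """N = log((||A|| + 1/||A||) / 2) for A = diag(a, 1/a) with a >= 1."""
    return math.log((a + 1 / a) / 2)
```

For $a = 2$ the two quantities agree to within the quadrature error, consistent with the equality established by Avila and Bochi, although the exponent vanishes on the whole range of angles where $AR_\theta$ is elliptic.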




Nonuniform Hyperbolicity 


The prototypical example of a linear cocycle is the
derivative of a smooth transformation on a manifold.
More precisely, let $M$ be a finite-dimensional
manifold and $f : M \to M$ be a diffeomorphism, that
is, a bijective smooth map whose derivative $Df(x)$
depends continuously on $x$ and is an isomorphism at
every point. Let $\mathcal{E} = TM$ be the tangent bundle to the
manifold and $\mathcal{A} = Df$ be the derivative. If $M$ is
compact or, more generally, if the norms of both $Df$
and its inverse are bounded, then the hypothesis in
the Oseledets theorem is automatically satisfied for any
$f$-invariant probability $\mu$. Lyapunov exponents yield
deep geometric information on the dynamics of the
diffeomorphism, especially when they do not vanish.
For most results that we mention in the sequel, one
needs the derivative $Df$ to be Hölder continuous:


$\|Df(x) - Df(y)\| \le \text{const} \cdot d(x, y)^c$ for some $c > 0$


Let $E_x^s$ be the sum of the Oseledets subspaces
corresponding to negative Lyapunov exponents.
Pesin's stable manifold theorem states that there
exists a family of embedded disks $W^s_{loc}(x)$ tangent to
$E_x^s$ at almost every point and such that the orbit of
every $y \in W^s_{loc}(x)$ is exponentially asymptotic to the
orbit of $x$. This lamination $\{W^s_{loc}(x)\}$ is invariant, in
the sense that


$f(W^s_{loc}(x)) \subset W^s_{loc}(f(x))$


and has an “absolute continuity” property. There
are analogous results for the sum $E_x^u$ of the Oseledets
subspaces corresponding to positive Lyapunov
exponents.

The entropy of a partition $\mathcal{P}$ of $M$ is defined by


$h_\mu(f, \mathcal{P}) = \lim_{n \to \infty} \frac{1}{n} H_\mu(\mathcal{P}^n)$


where $\mathcal{P}^n$ is the partition into sets of the form
$P = P_0 \cap f^{-1}(P_1) \cap \cdots \cap f^{-n}(P_n)$ with $P_j \in \mathcal{P}$ and


$H_\mu(\mathcal{P}^n) = \sum_{P \in \mathcal{P}^n} -\mu(P) \log \mu(P)$


The Kolmogorov-Sinai entropy $h_\mu(f)$ of the system
is the supremum of $h_\mu(f, \mathcal{P})$ over all partitions $\mathcal{P}$
with finite entropy. The Ruelle-Margulis inequality
says that $h_\mu(f)$ is bounded above by the average sum
of the positive Lyapunov exponents. A major result
of the theory, Pesin's entropy formula, asserts that if
the invariant measure $\mu$ is smooth (e.g., a volume
element) then the two invariants coincide:


$h_\mu(f) = \int \Big( \sum_{j=1}^{k} \lambda_j^+ \Big) \, d\mu$


where $\lambda_j^+ = \max\{0, \lambda_j\}$ and each exponent is counted with multiplicity.
A complete characterization of the invariant measures
for which the entropy formula is true was
given by F Ledrappier and L S Young.

The invariant measure $\mu$ is called hyperbolic if all
Lyapunov exponents are nonzero at almost every
point. Hyperbolic measures are exact dimensional:
the pointwise dimension


$d = \lim_{r \to 0} \frac{\log \mu(B_r(x))}{\log r}$


exists at almost every point, where $B_r(x)$ is the
neighborhood of radius $r$ around $x$. This fact was
proved by L Barreira, Ya Pesin, and J Schmeling. Note
that it means that the measure $\mu(B_r(x))$ of neighborhoods
scales as $r^d$ when the radius $r$ is small.

Another remarkable feature of hyperbolic measures,
proved by A Katok, is that periodic motions
are dense in their supports. More than that,
assuming the measure is nonatomic, there exist
Smale horseshoes $H_n$ with topological entropy
arbitrarily close to the entropy $h_\mu(f)$ of the system.
In this context, the topological entropy $h(f, H_n)$ may
be defined as the exponential rate of growth,


$\lim_{k \to \infty} \frac{1}{k} \log \#\{x \in H_n : f^k(x) = x\}$


of the number of periodic points on $H_n$.
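
The link between periodic point growth, entropy, and positive Lyapunov exponents can be illustrated on a uniformly hyperbolic example rather than on a horseshoe. The sketch below assumes the linear automorphism of the 2-torus induced by the integer matrix `CAT` (Arnold's cat map) and uses the standard count $|\det(A^k - I)|$ for the solutions of $f^k(x) = x$; the example and the function names are illustrative choices, not taken from the text.

```python
import math

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_pow(M, k):
    """k-th power of a 2x2 integer matrix (exact integer arithmetic)."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = mat_mul(result, M)
    return result

CAT = [[2, 1], [1, 1]]   # hyperbolic automorphism of the 2-torus (assumed example)

def count_periodic_points(k):
    """Number of solutions of f^k(x) = x on the torus: |det(A^k - I)|."""
    P = mat_pow(CAT, k)
    return abs((P[0][0] - 1) * (P[1][1] - 1) - P[0][1] * P[1][0])

def growth_rate(k):
    """(1/k) * log of the number of points fixed by f^k."""
    return math.log(count_periodic_points(k)) / k

# Positive Lyapunov exponent of the map: log of the expanding eigenvalue of CAT.
positive_exponent = math.log((3 + math.sqrt(5)) / 2)
```

Already for moderate $k$ the growth rate of periodic points agrees with the positive Lyapunov exponent to many digits, which for this map also equals the entropy with respect to the area measure, in line with Pesin's formula.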


Generic Systems 


Given any area-preserving diffeomorphism on any
surface $M$, one may find another whose first
derivative is arbitrarily close to the initial one and
which has Lyapunov exponents identically zero at
almost every point, or else is globally uniformly
hyperbolic (Anosov). This surprising fact was
discovered by R Mañé, and a complete proof was
given by J Bochi. Uniform hyperbolicity means that
the tangent bundle admits a $Df$-invariant splitting


$TM = E^s \oplus E^u$


such that the line bundle $E^s$ is uniformly contracted
and $E^u$ is uniformly expanded by the derivative. It is
well known that Anosov diffeomorphisms can only
occur if the surface is the torus $\mathbb{T}^2$.

In fact, the theorem of Mañé-Bochi is stronger:
for a residual subset (a countable intersection of 
open dense sets) of all once-differentiable area- 
preserving diffeomorphisms on any surface, either 
the Lyapunov exponents vanish almost everywhere 
or the diffeomorphism is Anosov. This shows that 
zero Lyapunov exponents are actually quite com- 
mon for surface diffeomorphisms that are only once- 
differentiable. Moreover, this theorem has been
extended to diffeomorphisms on manifolds with
arbitrary dimension, in a suitable formulation, by
J Bochi and M Viana.

However, this phenomenon should be specific to
systems with low differentiability. Indeed, already
for Hölder-continuous linear cocycles over chaotic
transformations it is known that vanishing Lyapunov
exponents can only occur with infinite codimension.
That is, unless the cocycle satisfies an infinite
number of independent constraints, there exists
some positive exponent. By “chaotic” we mean
here that the invariant probability $\mu$ of the base
transformation is assumed to be hyperbolic and to
have local product structure: it is locally equivalent
to a product of two measures, respectively, along
stable and unstable sets.

Under additional assumptions, one can even prove 
that all Lyapunov exponents have multiplicity 1 
outside an infinite-codimension subset. This follows 
from extensions of the Guivarc'h-Raugi criterion for 
certain linear cocycles over chaotic transformations, 
obtained by A Avila, C Bonatti, and M Viana. 


Strange Attractors 


This expression was coined by D Ruelle and 
F Takens in their celebrated study on the nature of 
fluid turbulence. E Hopf and also L D Landau and 
E M Lifshitz had suggested that turbulent motion 
arises from the existence in the phase space of 
invariant tori carrying quasiperiodic flows with a
large number of frequencies. Ruelle and Takens
observed that dissipative systems such as viscous 
fluids do not generally have such quasiperiodic tori, 
and concluded that turbulence must be credited to a 
different mechanism: the presence of some "strange" 
attractor. 

While they did not propose a precise definition, 
two main features were mentioned: 


1. Complex geometry: a strange attractor is not 
reduced to an equilibrium point or a periodic 
solution of the system and, generally, should 
have a fractal structure. 

2. Chaotic dynamics: solutions accumulating on the 
attractor should be sensitive to their initial states. 


As more examples were found, it became appar- 
ent that the above two features do not always come 
together. This led to two types of definitions in the 
literature, depending on whether one emphasizes the 
geometry or the dynamics. We adopt the second 
point of view, and propose to define the strange 
attractor as one carrying an invariant ergodic 
physical measure which has some positive Lyapunov 
exponent. The notion of physical measure will be
defined near the end. The condition on the Lyapunov
exponent ensures that the dynamics near the
attractor is (exponentially) sensitive to the initial 
states. 


Lorenz-Like Attractors 


The uniformly hyperbolic attractors introduced by 
S Smale provided an interesting class of examples of 
strange attractors, both chaotic and fractal. Perhaps 
more striking, given that they originated from a 
concrete problem in fluid dynamics, were the 
strange attractors introduced by E N Lorenz. The 
Lorenz system of differential equations, 


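$$\begin{aligned}
\dot{x} &= -\sigma x + \sigma y, & \sigma &= 10 \\
\dot{y} &= r x - y - x z, & r &= 28 \qquad [12] \\
\dot{z} &= x y - b z, & b &= 8/3
\end{aligned}$$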


was derived from Lord Rayleigh’s model for 
thermal convection, by Fourier expansion of the 
stream function and temperature, and truncation of 
all but three modes. Lorenz observed that its 
solutions depend sensitively on their initial states. 
Consequently, predictions based on the numerical 
integration of the equations may turn out to be 
very inaccurate, given that the initial data obtained 
from experimental measurements are never completely
precise. This remarkable observation
cast the issue of predictability in deterministic
systems in a whole new light and motivated intense
investigation of this and many other chaotic
systems.
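
The sensitivity that Lorenz observed can be reproduced with a few lines of numerical integration. The following is a minimal sketch, assuming the standard Lorenz equations with the parameter values quoted above and a classical fourth-order Runge-Kutta scheme; the step size, the initial states, and the function names are arbitrary choices made for illustration.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # parameter values quoted in eqns [12]

def lorenz(state):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return (-SIGMA * x + SIGMA * y, R * x - y - x * z, x * y - B * z)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def separation_history(p, q, dt=0.01, steps=4000):
    """Distance between two trajectories after each integration step."""
    history = [math.dist(p, q)]
    for _ in range(steps):
        p, q = rk4_step(p, dt), rk4_step(q, dt)
        history.append(math.dist(p, q))
    return history

# Two initial states that differ by 1e-8 in the z coordinate only.
HISTORY = separation_history((1.0, 1.0, 1.0), (1.0, 1.0, 1.0 + 1e-8))

if __name__ == "__main__":
    print(HISTORY[0], max(HISTORY))
```

The initial discrepancy grows until it is comparable to the size of the attractor itself, which is why a tiny measurement error eventually ruins a forecast.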

The dynamical behavior of the eqns [12] was first 
interpreted through certain geometric models where 
the presence of strange attractors, both chaotic and 
fractal, could be proved rigorously. It was much 
harder to prove that the original eqns [12] them- 
selves have such an attractor. This was achieved just 
a few years ago, by W Tucker, by means of a 
computer-assisted rigorous argument. At about the 
same time, a mathematical theory of Lorenz-like 
attractors in three-dimensional space was developed 
by C Morales, M J Pacifico, and E Pujals. In 
particular, this theory shows that uniformly hyper- 
bolic attractors and Lorenz-like attractors are the 
only ones which are robust under all small mod- 
ifications of the vector field. 


Hénon-Like Attractors 


Starting from the work of Lorenz, many models of 
strange attractors have been found and described to 
some extent, often related to concrete problems. 
From a mathematical point of view, it is usually
hard to give even a rough description of the 
dynamics in the chaotic regime. However, this was 
especially successful for the family of strange 
attractors introduced by M Hénon. He considered 
a very simple nonlinear system, particularly suited 
for numerical experimentation: the transformation 


$$f(x, y) = (1 - a x^2 + b y, \; x) \qquad [13]$$


where a and b are constant parameters. In a 
breakthrough, M Benedicks and L Carleson were 
able to prove that, for a set of parameter values with 
positive probability, this transformation has some 
nonhyperbolic attractor such that the orbits accu- 
mulating on it are sensitive to the starting point. The 
system [13] is also a model for many other 
situations, including the phenomenon of creation of 
homoclinic motions as parameters unfold, and the 
conclusions of Benedicks and Carleson have been 
extended to such situations, starting from the work 
of L Mora and M Viana. 

Moreover, a detailed theory of Hénon-like attractors
has been developed by M Benedicks, M Viana,
D Wang, L S Young, and other authors. It follows
from this theory that these attractors carry an
invariant ergodic probability measure $\mu$ which
describes the statistical behavior of almost all
trajectories $f^j(x)$, $j \ge 1$, that accumulate on the
attractor:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \varphi(f^j(x)) = \int \varphi \, d\mu$$

for any continuous function $\varphi$. This property
implies that, despite the fact that it is supported
on a zero-volume set, the measure $\mu$ is, in some
sense, physically observable. For this reason, one
calls it a physical measure. In other words, time
averages along typical orbits in the domain of
attraction coincide with the space averages determined
by the probability $\mu$. Another property with
physical relevance is that $\mu$ is the zero-noise limit of
the stationary measures associated to the Markov
chains obtained by adding random noise to $f$. One
says that the system $(f, \mu)$ is stochastically stable.
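
The two properties discussed here, sensitivity to the starting point and agreement of time averages along different orbits, can be explored numerically. The sketch below is only an experiment under assumptions: it uses the map in the form (x, y) -> (1 - a x^2 + b y, x) with the classical values a = 1.4 and b = 0.3, which are a common numerical choice and are not claimed to belong to the parameter set covered by the theorems; the function names and the observable are hypothetical.

```python
import math

A_PARAM, B_PARAM = 1.4, 0.3  # classical values, an assumption of this sketch

def henon(x, y, a=A_PARAM, b=B_PARAM):
    """The map (x, y) -> (1 - a x^2 + b y, x)."""
    return 1.0 - a * x * x + b * y, x

def time_average(phi, x, y, n, burn_in=1000):
    """Birkhoff average (1/n) sum phi(f^j(x, y)) after discarding a transient."""
    for _ in range(burn_in):
        x, y = henon(x, y)
    total = 0.0
    for _ in range(n):
        x, y = henon(x, y)
        total += phi(x, y)
    return total / n

def largest_lyapunov_exponent(x, y, n, burn_in=1000):
    """Average logarithmic growth rate of a tangent vector along the orbit."""
    for _ in range(burn_in):
        x, y = henon(x, y)
    vx, vy = 1.0, 0.0
    total = 0.0
    for _ in range(n):
        # The Jacobian of the map at (x, y) is [[-2 a x, b], [1, 0]].
        vx, vy = -2.0 * A_PARAM * x * vx + B_PARAM * vy, vx
        norm = math.hypot(vx, vy)
        total += math.log(norm)
        vx, vy = vx / norm, vy / norm
        x, y = henon(x, y)
    return total / n

if __name__ == "__main__":
    first = lambda x, y: x  # observable: the first coordinate
    print(time_average(first, 0.0, 0.0, 100000))
    print(time_average(first, 0.1, 0.1, 100000))
    print(largest_lyapunov_exponent(0.0, 0.0, 100000))
```

Averages computed from different starting points in the basin should agree up to statistical error, and a positive growth rate of the tangent vector is the numerical signature of sensitivity.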


See also: Chaos and Attractors; Dissipative Dynamical 
Systems of Infinite Dimension; Ergodic Theory; Fractal 
Dimensions in Dynamics; Generic Properties of 
Dynamical Systems; Gravitational N-Body Problem 
(Classical); Homoclinic Phenomena; Hyperbolic 
Dynamical Systems; Lagrangian Dispersion (Passive 
Scalar); Nonequilibrium Statistical Mechanics: Interaction 
between Theory and Numerical Simulations; Random 
Dynamical Systems; Synchronization of Chaos. 
Introduction


There is no theory so far of irreversible processes that 
is of the same generality as equilibrium statistical 
mechanics and presumably it may not exist. While in 
equilibrium the Gibbs distribution provides all the 
information and no equation of motion has to be 
solved, the dynamics plays the major role in
nonequilibrium. The theory illustrated below refers to
stationary states that are not restricted to being close 
to equilibrium, and for a wide class of models it can be 
shown to be exact. In this case one begins to see the 
appearance of some general principles. 

In equilibrium statistical mechanics, there is a
well-defined relationship, established by Boltzmann,
between the probability of a state and its entropy. 
This fact was exploited by Einstein to study thermo- 
dynamic fluctuations. When we are out of equilibrium, 
for example, in a stationary state of a system in contact 
with two reservoirs, it is not completely clear how to 
define thermodynamic quantities such as the entropy 
or the free energy. One possibility is to use fluctuation 
theory to define their nonequilibrium analogs. In fact 
in this way, extensive quantities can be obtained, 
although not necessarily simply additive due to the 
presence of long-range correlations which seem to be a 
rather generic feature of nonequilibrium. This possibil- 
ity has been pursued in recent years leading to a 
considerable number of interesting results. One can 
recognize two main lines. 


1. Exact calculations in simplified models. This is
well exemplified by the work of Derrida et al. 
(2002). 

2. A general treatment of a class of continuous time 
Markov chains for which the simplified models 
provide examples. This is the point of view 
developed by Bertini et al. (2002, 2004). 


Both approaches have been very effective and of course 
give the same results when a comparison is possible. 


The second approach seems to encompass a wide class 
of systems and has the advantage of leading to 
equations which apply to very different situations. 
This is the point of view we shall adopt in the 
following. The question whether there are alternative 
more natural ways of defining nonequilibrium entro- 
pies or free energies is, for the moment, open. 


Boltzmann-Einstein Formula 


The Boltzmann-Einstein theory of equilibrium
thermodynamic fluctuations, as described for example in
the book Physique Statistique by Landau-Lifshitz,
states that the probability of a fluctuation from
equilibrium in a macroscopic region of fixed volume
$V$ is proportional to $\exp\{V \Delta S / k\}$, where $\Delta S$ is the
variation of entropy density in the region calculated
along a reversible transformation creating the
fluctuation and $k$ is the Boltzmann constant.
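
A minimal numerical illustration of this exponential law, under assumptions that are ours rather than the article's: take V independent lattice sites, each occupied with probability p, so that the exact probability of observing a given density is binomial. The cost per site, which plays the role of minus the entropy variation divided by k, is then the relative entropy of the observed density with respect to p. All function names are hypothetical.

```python
import math

def log_binomial_probability(v, k, p):
    """Exact log-probability of k occupied sites out of v independent sites."""
    log_choose = math.lgamma(v + 1) - math.lgamma(k + 1) - math.lgamma(v - k + 1)
    return log_choose + k * math.log(p) + (v - k) * math.log(1 - p)

def entropy_cost(rho, p):
    """Relative entropy per site of the density rho with respect to p."""
    return rho * math.log(rho / p) + (1 - rho) * math.log((1 - rho) / (1 - p))

def empirical_rate(v, rho, p):
    """-(1/V) log P(density = rho), to be compared with entropy_cost."""
    return -log_binomial_probability(v, round(rho * v), p) / v

if __name__ == "__main__":
    for volume in (200, 2000, 20000):
        print(volume, empirical_rate(volume, 0.7, 0.5), entropy_cost(0.7, 0.5))
```

The two columns approach each other as the volume grows; the residual difference is the subexponential prefactor that the proportionality sign in the Boltzmann-Einstein formula ignores.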

This formula was derived by Einstein simply by 
inverting the Boltzmann relationship between entropy 
and probability. He considered this relationship as a 
phenomenological definition of the probability of a 
state. 

Einstein's theory refers to fluctuations from an
equilibrium state, that is from a stationary state of a
system isolated or in contact with reservoirs characterized
by the same chemical potentials so that there is no
flow of heat, electricity, chemical substances, etc.,
across the system. When in contact with reservoirs, $\Delta S$
is the variation of the total entropy (system +
reservoirs) which, for fluctuations of constant volume
and temperature, is equal to $-\Delta F / T$, where $\Delta F$ is the
variation of the free energy of the system and $T$ the
temperature. In the following, we refer to $\Delta F / T$, our
main object of study, as the entropy and use the letter $S$
for it, but no confusion should arise.

The important question we address is then: what 
happens if the system is stationary but not in 
equilibrium, that is, flows of physical quantities are 
present due to external fields and/or different chemical 
potentials at the boundaries? To start with it is not 
always clear whether a closed macroscopic dynamical 
description is possible. If the system admits such a 
description of the kind provided by hydrodynamic 
equations, a fact which can be rigorously established in
simplified models, a reasonable goal is to find an 
explicit connection between time-independent thermo- 
dynamic quantities (e.g., the entropy) and dynamical 
macroscopic properties (e.g., transport coefficients). 
As we shall see, the study of large fluctuations provides 
such a connection. It leads in fact to a dynamical 
theory of the entropy which is shown to satisfy a 
Hamilton-Jacobi equation (HJE) in infinitely many 
variables requiring the transport coefficients as input. 
Its solution is straightforward in the case of homo- 
geneous equilibrium states and highly nontrivial in 
stationary nonequilibrium states (SNSs). In the first 
case we recover a well-known relationship widely used 
in the physical and physico-chemical literature. There 
are several one-dimensional models, where the HJE 
reduces to a nonlinear ordinary differential equation 
which, even if it cannot be solved explicitly, leads to 
the important conclusion that the nonequilibrium 
entropy is a nonlocal functional of the thermodynamic 
variables. This implies that correlations over macro- 
scopic scales are present. The existence of long-range 
correlations is probably a generic feature of SNSs and 
more generally of situations where the dynamics is not 
time-reversal invariant. As a consequence if we divide 
a system into two subsystems, the entropy is not 
necessarily simply additive. 

The first step toward the definition of a non- 
equilibrium entropy is the study of fluctuations in 
macroscopic evolutions described by hydrodynamic 
equations. In a dynamical setting, a typical question 
one may ask is the following: what is the most 
probable trajectory followed by the system in the 
spontaneous emergence of a fluctuation or in its 
relaxation to an equilibrium or a stationary state? To 
answer this question, one first derives a generalized 
Boltzmann-Einstein formula from which the most 
probable trajectory can be calculated by solving a 
variational principle. The entropy is related to the 
logarithm of the probability of such a trajectory and 
satisfies the HJE associated to the variational principle. 

For states near equilibrium, an answer to this type of
question was given by Onsager and Machlup in 1953.
The Onsager-Machlup theory gives the following 
result under the assumption of time reversibility of 
the microscopic dynamics. In the situation of a linear 
hydrodynamic equation and small fluctuations, that is, 
close to equilibrium, the most probable creation and 
relaxation trajectories of a fluctuation are time 
reversals of one another. This conclusion holds also 
in nonlinear hydrodynamic regimes and without the 
assumption of small fluctuations. This follows from 
the study of concrete models. In SNSs, on the other 
hand, time-reversal invariance is broken and the 
creation and relaxation trajectories of a fluctuation 
are not time reversals of one another. 


In the following we refer to boundary-driven 
stationary nonequilibrium states, for example, a 
thermodynamic system in contact with reservoirs 
characterized by different temperatures and chemi- 
cal potentials, but there is no difficulty in including 
an external field acting in the bulk. 


Microscopic and Macroscopic Dynamics 


We consider many-body systems in the limit of
infinitely many degrees of freedom. The basic general
assumption of the theory is Markovian evolution.
Microscopically, we assume that the evolution is
described by a Markov process $X_\tau$ which represents
the state of the system at time $\tau$. This hypothesis
is probably not so restrictive, because the dynamics of
Hamiltonian systems interacting with thermostats
is ultimately also reduced to the analysis of a Markov
process. Several examples are discussed in the literature.
To be more precise, $X_\tau$ represents the set of
variables necessary to specify the state of the microscopic
constituents interacting among themselves and
with the reservoirs. The SNS is described by a
stationary, that is, invariant with respect to time shifts,
probability distribution $P_{st}$ over the trajectories of $X_\tau$.

Macroscopically, the usual interpretation of
Markovian evolution is that the time derivatives
of thermodynamic variables $\dot\rho_i$ at a given instant of
time depend only on the $\rho_i$'s and the affinities
(thermodynamic forces) $\partial S / \partial \rho_i$ at the same instant
of time. Our next assumption can then be
formulated as follows: the system admits a
macroscopic description in terms of density fields
which are the local thermodynamic variables. For
simplicity of notation, we assume that there is
only one thermodynamic variable (e.g., $\rho$, the
density). The evolution of the field $\rho = \rho(t, u)$,
where $t$ and $u$ are the macroscopic time and
space coordinates (see below), is given by
diffusion-type hydrodynamic equations of the form

$$\partial_t \rho = \tfrac{1}{2} \nabla \cdot \big( D(\rho) \nabla \rho \big) = \mathcal{D}(\rho) \qquad [1]$$


The interaction with the reservoirs appears as
boundary conditions to be imposed on solutions of
[1]. We assume that there exists a unique stationary
solution $\bar\rho$ of [1], that is, a profile $\bar\rho(u)$ which
satisfies the appropriate boundary conditions and is
such that $\mathcal{D}(\bar\rho) = 0$. This holds if the diffusion matrix
$D_{i,j}(\rho)$ in [1] is strictly elliptic, namely there exists a
constant $c > 0$ such that $D(\rho) \ge c$ (in matrix sense).
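
To make the notion of a stationary profile concrete, the sketch below relaxes a one-dimensional density toward it by explicit finite differences. It rests on several assumptions of ours: the hydrodynamic equation is taken to be a diffusion equation with a factor 1/2 and a constant diffusion coefficient, the domain is the unit interval, and the reservoirs fix the densities 0.2 and 0.8 at the two ends. With a constant coefficient the stationary profile is simply the straight line joining the boundary values.

```python
def relax_to_stationary(rho_left, rho_right, n_interior=20, diffusion=1.0,
                        steps=10000):
    """Explicit scheme for d(rho)/dt = (1/2) d/du (D d(rho)/du) on [0, 1]
    with constant D and boundary densities fixed by the reservoirs."""
    du = 1.0 / (n_interior + 1)
    dt = 0.4 * du * du / diffusion  # below the stability limit du^2 / D
    middle = 0.5 * (rho_left + rho_right)
    rho = [rho_left] + [middle] * n_interior + [rho_right]
    for _ in range(steps):
        new = rho[:]
        for i in range(1, n_interior + 1):
            laplacian = (rho[i + 1] - 2.0 * rho[i] + rho[i - 1]) / (du * du)
            new[i] = rho[i] + dt * 0.5 * diffusion * laplacian
        rho = new
    return rho

PROFILE = relax_to_stationary(0.2, 0.8)

if __name__ == "__main__":
    print([round(value, 4) for value in PROFILE])
```

A density-dependent diffusion coefficient would bend the profile, but uniqueness and attractiveness of the stationary solution are what the ellipticity condition is meant to guarantee.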

These equations derive from the underlying
microscopic dynamics through an appropriate
scaling limit in which the microscopic time and
space coordinates $\tau, x$ are rescaled as follows:
$t = \tau / N^2$, $u = x / N$, where $N$ represents the linear
size of the system. For lattice systems, $N$ is an
integer. The hydrodynamic equation [1] represents
a law of large numbers with respect to the
probability measure $P_{st}$ conditioned on an initial
state $X_0$. The initial conditions for [1] are
determined by $X_0$. Of course, many microscopic
configurations give rise to the same value of
$\rho(0, u)$. In general, $\rho = \rho(t, u)$ is an appropriate
limit of a local observable $\rho_N(X_\tau)$ as the number
$N$ of degrees of freedom diverges.

The hypothesis of Markovian evolution is also the
basis of Onsager's 1931 theory of irreversible
processes near equilibrium. Onsager, however, did not
rely on any microscopic model and assumed, near
equilibrium, linear hydrodynamic equations, or regression
equations as he called them. His equations,
ignoring space dependence, were of the form

$$\dot\rho_i = -\sum_j D_{ij} \rho_j \qquad [2]$$

The diffusion matrix $D$ is related to the Onsager
transport matrix $\chi$ and the entropy by the
relationship

$$D = \chi s \qquad [3]$$

where the elements of $s$ are $\partial^2 S / \partial \rho_i \partial \rho_j$. The matrix
$\chi$ is defined by the relationship between flows and
affinities

$$\dot\rho_i = -\sum_j \chi_{ij} \frac{\partial S}{\partial \rho_j} \qquad [4]$$

The indices $i, j$ here label different thermodynamic
variables. The matrix $\chi$ is symmetric, a property
known as Onsager reciprocity. Equations [2] and [3]
follow by developing the entropy near an equilibrium
state, that is, by taking a quadratic expression
as an approximation. The minus sign in eqn [4] is
due to our convention in which the entropy has the
same sign as the free energy.
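
The step from the flow-affinity relation [4] to the regression equations [2] and the relation [3] can be written out explicitly. The lines below are a sketch under two assumptions: the variables are measured as deviations from their equilibrium values, and the entropy is truncated to its quadratic part with a symmetric coefficient matrix $s$.

```latex
\begin{align*}
S(\rho) &\simeq \tfrac12 \sum_{j,k} s_{jk}\,\rho_j \rho_k
\quad\Longrightarrow\quad
\frac{\partial S}{\partial \rho_j} = \sum_k s_{jk}\,\rho_k , \\
\dot\rho_i &= -\sum_j \chi_{ij}\,\frac{\partial S}{\partial \rho_j}
            = -\sum_k \Big( \sum_j \chi_{ij}\, s_{jk} \Big) \rho_k
            = -\sum_k D_{ik}\,\rho_k ,
\qquad D = \chi s .
\end{align*}
```

Read in the other direction, knowing $D$ and $\chi$ determines $s = \chi^{-1} D$ and hence the quadratic part of the entropy.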

Equation [3] makes it possible to reconstruct the entropy
from the knowledge of the coefficients $D$ and $\chi$ and
has been widely used, especially in physical chemistry.
In SNSs, eqn [3] is replaced by a Hamilton-Jacobi-type
equation for the entropy.


Dynamical Boltzmann-Einstein Formula 


The basic assumption is that the stationary ensemble
$P_{st}$ admits a principle of large deviations describing
the fluctuations of the thermodynamic variables
appearing in the hydrodynamic equation. This
means the following. The probability that, for large
$N$, the evolution of the random variable $\rho_N$ deviates
from the solution of the hydrodynamic equation and
is close to some trajectory $\hat\rho(t)$ is exponentially small
and of the form

$$\begin{aligned}
P_{st}\big( \rho_N(X_{N^2 t}) \sim \hat\rho(t),\; t \in [t_1, t_2] \big)
&\approx e^{-N^d \left[ S(\hat\rho(t_1)) + J_{[t_1, t_2]}(\hat\rho) \right]} \\
&= e^{-N^d I_{[t_1, t_2]}(\hat\rho)} \qquad [5]
\end{aligned}$$

where $d$ is the dimensionality of the system, $J(\hat\rho)$ is a
functional which vanishes if $\hat\rho(t)$ is a solution of [1]
and $S(\hat\rho(t_1))$ is the entropy cost to produce the initial
density profile $\hat\rho(t_1)$. We normalize $S$ so that
$S(\bar\rho) = 0$. Therefore, $J(\hat\rho)$ represents the extra cost
necessary to follow the trajectory $\hat\rho(t)$. Finally,
$\rho_N(X_{N^2 t}) \sim \hat\rho(t)$ means closeness in some metric
and $\approx$ denotes logarithmic equivalence as $N \to \infty$.
Equation [5] is the dynamical generalization of the
Boltzmann-Einstein formula. Experience with many
models justifies this assumption.

To understand how [5] leads to a dynamical
theory of the entropy, we discuss its properties
under time reversal. Let us denote by $\theta$ the time
inversion operator defined by $\theta X_\tau = X_{-\tau}$. The
probability measure $P_{st}^*$ describing the evolution of the
time-reversed process $X_\tau^*$ is given by the composition
of $P_{st}$ and $\theta^{-1}$, that is,

$$P_{st}^*\big( X_\tau = \phi_\tau,\; \tau \in [\tau_1, \tau_2] \big)
= P_{st}\big( X_\tau = \phi_{-\tau},\; \tau \in [-\tau_2, -\tau_1] \big) \qquad [6]$$

Let $L$ be the generator of the microscopic
dynamics. We recall that $L$ induces the evolution
of observables (functions on the state space) according
to the equation $\partial_\tau E_{X_0}[f(X_\tau)] = E_{X_0}[(Lf)(X_\tau)]$,
where $E_{X_0}$ stands for the expectation with respect to
$P_{st}$ conditioned on the initial state $X_0$.

The time-reversed dynamics, that is, the dynamics
which inverts the direction of the fluxes through the
system (for example, heat flows under this dynamics
from lower to higher temperatures), is generated by
the adjoint $L^*$ of $L$ with respect to the invariant
measure $\mu$:

$$E^\mu[f L g] = E^\mu[(L^* f) g] \qquad [7]$$

The measure $\mu$, which is the same for both processes, is
a distribution over the configurations of the system
and formally satisfies $\mu L = 0$. The expectation with
respect to $\mu$ is denoted by $E^\mu$ and $f$, $g$ are observables.
We note that the probability $P_{st}$, and therefore $P_{st}^*$,
depends on the invariant measure $\mu$. The
finite-dimensional distributions of $P_{st}$ are in fact given by

$$P_{st}\big( X_{\tau_1} = \phi_{\tau_1}, \ldots, X_{\tau_n} = \phi_{\tau_n} \big)
= \mu(\phi_{\tau_1}) \, p_{\tau_2 - \tau_1}(\phi_{\tau_1} \to \phi_{\tau_2}) \cdots p_{\tau_n - \tau_{n-1}}(\phi_{\tau_{n-1}} \to \phi_{\tau_n}) \qquad [8]$$
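where $p_\tau(\phi_1 \to \phi_2)$ is the transition probability.
According to [6] the finite-dimensional distributions
of $P_{st}^*$ are

$$\begin{aligned}
P_{st}^*\big( X_{\tau_1} = \phi_{\tau_1}, \ldots, X_{\tau_n} = \phi_{\tau_n} \big)
&= \mu(\phi_{\tau_1}) \, p^*_{\tau_2 - \tau_1}(\phi_{\tau_1} \to \phi_{\tau_2}) \cdots p^*_{\tau_n - \tau_{n-1}}(\phi_{\tau_{n-1}} \to \phi_{\tau_n}) \\
&= \mu(\phi_{\tau_n}) \, p_{\tau_n - \tau_{n-1}}(\phi_{\tau_n} \to \phi_{\tau_{n-1}}) \cdots p_{\tau_2 - \tau_1}(\phi_{\tau_2} \to \phi_{\tau_1}) \qquad [9]
\end{aligned}$$

In particular, the transition probabilities $p_\tau(\phi_1 \to \phi_2)$
and $p^*_\tau(\phi_1 \to \phi_2)$ are related by

$$\mu(\phi_1) \, p^*_\tau(\phi_1 \to \phi_2) = \mu(\phi_2) \, p_\tau(\phi_2 \to \phi_1) \qquad [10]$$

This relationship reduces to the well-known detailed
balance condition if $p_\tau(\phi_1 \to \phi_2) = p^*_\tau(\phi_1 \to \phi_2)$.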
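
For a finite state space the relation between a transition kernel and its time reversal can be checked directly. The sketch below assumes a hypothetical three-state chain with a preferred direction of rotation, computes its invariant measure, and builds the reversed kernel from the rule that the invariant weight of the starting state times the reversed probability of a jump equals the invariant weight of the arrival state times the forward probability of the opposite jump. The matrix and all names are illustrative.

```python
def stationary_distribution(p, iterations=2000):
    """Invariant measure mu (mu P = mu) obtained by power iteration."""
    n = len(p)
    mu = [1.0 / n] * n
    for _ in range(iterations):
        mu = [sum(mu[i] * p[i][j] for i in range(n)) for j in range(n)]
    return mu

def adjoint_kernel(p, mu):
    """Time-reversed kernel: p*(x -> y) = mu(y) p(y -> x) / mu(x)."""
    n = len(p)
    return [[mu[y] * p[y][x] / mu[x] for y in range(n)] for x in range(n)]

def satisfies_detailed_balance(p, mu, tol=1e-12):
    """True when mu(x) p(x -> y) = mu(y) p(y -> x) for all pairs."""
    n = len(p)
    return all(abs(mu[x] * p[x][y] - mu[y] * p[y][x]) < tol
               for x in range(n) for y in range(n))

# Hypothetical chain that rotates preferentially 0 -> 1 -> 2 -> 0.
P = [[0.1, 0.6, 0.3],
     [0.3, 0.1, 0.6],
     [0.6, 0.3, 0.1]]
MU = stationary_distribution(P)
P_STAR = adjoint_kernel(P, MU)

if __name__ == "__main__":
    print(MU, P_STAR, satisfies_detailed_balance(P, MU))
```

In this example the reversed chain rotates in the opposite direction while keeping the same invariant measure, so the chain is stationary but not reversible.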

We also require that the evolution generated by
$L^*$ admits a hydrodynamic description, which we call
the adjoint hydrodynamics; it is, however, not
necessarily of the same form as [1]. In fact, we
consider models in which the adjoint hydrodynamics
is nonlocal in space.

In order to avoid confusion, we emphasize that what
is usually called an equilibrium state for a reversible
dynamics, as distinguished from an SNS, corresponds
to the special case $L^* = L$, that is, the detailed balance
principle holds. In such a case, $P_{st}$ is invariant under
time reversal and the two hydrodynamics coincide.

We now derive a first consequence of our
assumptions, that is, the relationship between the
functionals $I$ and $I^*$ associated to the dynamics $L$
and $L^*$ by [5]. From eqn [6], it follows that

$$I_{[t_1, t_2]}(\hat\rho) = I^*_{[-t_2, -t_1]}(\theta \hat\rho) \qquad [11]$$

with obvious notations. More explicitly, this equation
reads

$$S(\hat\rho(t_1)) + J_{[t_1, t_2]}(\hat\rho) = S(\hat\rho(t_2)) + J^*_{[-t_2, -t_1]}(\theta \hat\rho) \qquad [12]$$

where $\hat\rho(t_1)$, $\hat\rho(t_2)$ are the initial and final points of
the trajectory and $S(\hat\rho(t_i))$ the entropies associated
with the creation of the fluctuations $\hat\rho(t_i)$ starting
from the SNS. The functional $J^*$ vanishes on the
solutions of the adjoint hydrodynamics. To compute
$J^*$, it is necessary to know the entropy $S$.

We consider now the following physical situation.
The system is macroscopically in the stationary state
$\bar\rho$ at $t = -\infty$, but at $t = 0$ we find it in the state $\rho$. We
want to determine the most probable trajectory
followed in the spontaneous creation of this fluctuation.
According to [5], this trajectory is the one that
minimizes $J$ among all trajectories $\hat\rho(t)$ connecting $\bar\rho$
to $\rho$ in the time interval $[-\infty, 0]$. From [12],
recalling that $S(\bar\rho) = 0$, we have that

$$J_{[-\infty, 0]}(\hat\rho) = S(\rho) + J^*_{[0, \infty]}(\theta \hat\rho) \qquad [13]$$

The right-hand side is minimal if $J^*_{[0, \infty]}(\theta \hat\rho) = 0$, that
is, if $\theta \hat\rho$ is a solution of the adjoint hydrodynamics.
The existence of such a relaxation solution is due to
the fact that the stationary solution $\bar\rho$ is attractive
also for the adjoint hydrodynamics. We have therefore
the following consequence:


In an SNS the spontaneous emergence of a macroscopic
fluctuation takes place most likely following a trajectory
which is the time reversal of the relaxation path
according to the adjoint hydrodynamics.


This implies that the entropy is related to $J$ by

$$S(\rho) = \inf_{\hat\rho} J_{[-\infty, 0]}(\hat\rho) \qquad [14]$$

where the minimum is taken over all trajectories $\hat\rho(t)$
connecting $\bar\rho$ to $\rho$.
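
The variational characterization of the entropy can be tried out on a finite-dimensional toy model. The sketch below is our own illustration, not an example from the article: a single variable relaxing linearly to zero at rate `GAMMA`, with quadratic cost (x' + GAMMA x)^2 / (2 CHI) per unit time. This model is reversible, so the adjoint relaxation coincides with the original one, and the time-reversed relaxation path should achieve the minimal cost GAMMA x^2 / CHI, while any other path connecting the same end points costs more.

```python
import math

GAMMA, CHI = 1.0, 1.0  # hypothetical relaxation rate and noise strength

def action(path, velocity, t_start, t_end, n=20000):
    """Midpoint-rule approximation of the integral of
    (x'(t) + GAMMA x(t))^2 / (2 CHI) over [t_start, t_end]."""
    h = (t_end - t_start) / n
    total = 0.0
    for i in range(n):
        t = t_start + (i + 0.5) * h
        total += (velocity(t) + GAMMA * path(t)) ** 2 / (2.0 * CHI) * h
    return total

def time_reversed_relaxation(x_final):
    """Creation path x(t) = x_final exp(GAMMA t) for t <= 0, with its velocity."""
    return (lambda t: x_final * math.exp(GAMMA * t),
            lambda t: GAMMA * x_final * math.exp(GAMMA * t))

def linear_ramp(x_final, duration):
    """A competing path climbing linearly from 0 to x_final."""
    return (lambda t: x_final * (1.0 + t / duration),
            lambda t: x_final / duration)

def quasi_potential(x_final):
    """Closed form of the minimal cost for this linear model."""
    return GAMMA * x_final ** 2 / CHI

if __name__ == "__main__":
    path, velocity = time_reversed_relaxation(1.0)
    print(action(path, velocity, -20.0, 0.0), quasi_potential(1.0))
    ramp, ramp_velocity = linear_ramp(1.0, 2.0)
    print(action(ramp, ramp_velocity, -2.0, 0.0))
```

The infinite time interval is truncated at t = -20, where the optimal path is already indistinguishable from the stationary state.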

We note that the reversibility of the microscopic
process $X_\tau$, which we call microscopic reversibility,
is not needed in order to deduce the Onsager-Machlup
result (i.e., that the trajectory which
creates the fluctuation is the time reversal of the
relaxation trajectory). In fact, the Onsager-Machlup
result holds if and only if the hydrodynamics
coincides with the adjoint hydrodynamics, which
we call macroscopic reversibility. Indeed, it is
possible to construct microscopic nonreversible
models, $L \ne L^*$, in which the hydrodynamics and
the adjoint hydrodynamics coincide.

Spontaneous fluctuations, including Onsager-Machlup
time-reversal symmetry, have been
observed in stochastically perturbed reversible electronic
devices. In nonreversible systems, an asymmetry
between the emergence and the relaxation of
fluctuations has been observed. The above discussion
provides the explanation.


The Hamilton-Jacobi Equation and Its 
Consequences 


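We assume that the functional $J$ has a density (which
plays the role of a Lagrangian), that is,

$$J_{[t_1, t_2]}(\hat\rho) = \int_{t_1}^{t_2} dt \, \mathcal{L}(\hat\rho(t), \partial_t \hat\rho(t)) \qquad [15]$$

Let us introduce the Hamiltonian $\mathcal{H}(\rho, H)$ as the
Legendre transform of $\mathcal{L}(\rho, \partial_t \rho)$, that is,

$$\mathcal{H}(\rho, H) = \sup_{\xi} \big\{ \langle \xi, H \rangle - \mathcal{L}(\rho, \xi) \big\} \qquad [16]$$

where $\langle \cdot, \cdot \rangle$ denotes integration with respect to the
macroscopic space coordinates $u$.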
Noting that $\mathcal{H}(\rho, 0) = 0$, the Hamilton-Jacobi
equation associated to [14] is

$$\mathcal{H}\Big( \rho, \frac{\delta S}{\delta \rho} \Big) = 0 \qquad [17]$$

This is an equation for the functional derivative
$C(\rho) = \delta S / \delta \rho$, but not all the solutions of the
equation $\mathcal{H}(\rho, C(\rho)) = 0$ are the derivatives of some
functional. Of course, only those which are the
derivative of a functional are relevant for us.

We now specify the Hamilton-Jacobi equation
[17] for boundary-driven lattice gases. For models
with purely diffusive hydrodynamics [1], we expect
a quadratic large deviation functional of the form

$$J_{[t_1, t_2]}(\hat\rho) = \frac{1}{2} \int_{t_1}^{t_2} dt \, \Big\langle \nabla^{-1} \big( \partial_t \hat\rho - \mathcal{D}(\hat\rho) \big), \; \chi(\hat\rho)^{-1} \nabla^{-1} \big( \partial_t \hat\rho - \mathcal{D}(\hat\rho) \big) \Big\rangle \qquad [18]$$

where $\mathcal{D}(\hat\rho)$ is the right-hand side of the hydrodynamic
equation [1], and by $\nabla^{-1} f$ we mean a vector
field whose divergence equals $f$. The form [18], which
can be derived for several models, is expected to be
very general: the functional $J(\hat\rho)$ measures how much
$\hat\rho$ differs from a solution of the hydrodynamics [1].
The matrix $\chi(\rho)$ has the same role in
our more general context as the Onsager matrix in
[4]. This form of $J$ is also typical for diffusion
processes described by finite-dimensional Langevin
equations (Freidlin-Wentzell theory).

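In this case, the Lagrangian $\mathcal{L}$ is quadratic in
$\partial_t \hat\rho(t)$ and the associated Hamiltonian is given by

$$\mathcal{H}(\rho, H) = \frac{1}{2} \big\langle \nabla H, \chi(\rho) \nabla H \big\rangle + \big\langle H, \mathcal{D}(\rho) \big\rangle \qquad [19]$$

so that the Hamilton-Jacobi equation [17] takes the
form

$$\frac{1}{2} \Big\langle \nabla \frac{\delta S}{\delta \rho}, \chi(\rho) \nabla \frac{\delta S}{\delta \rho} \Big\rangle + \Big\langle \frac{\delta S}{\delta \rho}, \mathcal{D}(\rho) \Big\rangle = 0 \qquad [20]$$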


As is well known in mechanics, the Hamilton-Jacobi 
equation has many solutions and we must give a 
criterion to select the correct one. The criterion 
which the correct solution has to satisfy is that it 
must be a Lyapunov function with respect to the 
unique stationary state. 

It is a simple calculation to show that eqn [3] follows 
from HJE, if we look for a solution which is a local 
function of p. This is the right choice in equilibrium 
where correlations over macroscopic distances are not 
expected if the microscopic forces are short range. 
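
The simple calculation mentioned here can be sketched as follows. The assumptions, which are ours, are that the hydrodynamic operator has the diffusive form $\mathcal{D}(\rho) = \tfrac12 \nabla \cdot (D(\rho) \nabla \rho)$, that the Hamilton-Jacobi equation is quadratic with a symmetric matrix $\chi$, that the entropy is the integral of a local density $s(\rho)$, and that $s'$ vanishes at the boundary values of the density so that the boundary term in the integration by parts drops out.

```latex
\begin{align*}
S(\rho) &= \int du \, s(\rho(u)), \qquad
\frac{\delta S}{\delta \rho} = s'(\rho), \qquad
\nabla \frac{\delta S}{\delta \rho} = s''(\rho)\, \nabla \rho , \\
0 &= \frac12 \big\langle s''(\rho) \nabla \rho ,\; \chi(\rho)\, s''(\rho) \nabla \rho \big\rangle
   + \big\langle s'(\rho) ,\; \tfrac12 \nabla \cdot ( D(\rho) \nabla \rho ) \big\rangle \\
  &= \frac12 \big\langle s''(\rho) \nabla \rho ,\; \big( \chi(\rho)\, s''(\rho) - D(\rho) \big) \nabla \rho \big\rangle
\quad\Longrightarrow\quad D(\rho) = \chi(\rho)\, s''(\rho) .
\end{align*}
```

Since the identity must hold for every profile, the bracket itself has to vanish, which is the relation $D = \chi s$ with $s$ the second derivative of the entropy density.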

Out of equilibrium, it has been shown by direct
calculation that for a special model, the symmetric
simple exclusion, the entropy is a nonlocal function
of the thermodynamic variables, that is, space
correlations extend to macroscopic distances. This
result can be derived in a simple way from HJE as
we will discuss later.

Lattice gases which do not conserve the number
of particles do not give rise in general to a purely
diffusive hydrodynamics but rather to a
reaction-diffusion equation. In this case, the large deviation
functional will not have the quadratic form [18] and
also the HJE will not be quadratic. An example in
which particles can be created and destroyed is the
so-called Kawasaki-Glauber dynamics. In this case,
HJE has exponential nonlinearities.


Nonequilibrium Fluctuation Dissipation Relation 


We now derive a twofold generalization of the 
celebrated fluctuation dissipation relationship: it is 
valid in nonequilibrium states and in nonlinear 
regimes. 

Such a relationship will hold provided the rate
function $J^*$ of the time-reversed process is of the form
[18] with $\mathcal{D}$ replaced by $\mathcal{D}^*$, the adjoint hydrodynamics,

$$\partial_t \rho = \mathcal{D}^*(\rho) \qquad [21]$$

with the same boundary conditions as [1].
If $J^*$ has the form

$$J^*_{[t_1, t_2]}(\hat\rho) = \frac{1}{2} \int_{t_1}^{t_2} dt \, \Big\langle \nabla^{-1} \big( \partial_t \hat\rho - \mathcal{D}^*(\hat\rho) \big), \; \chi(\hat\rho)^{-1} \nabla^{-1} \big( \partial_t \hat\rho - \mathcal{D}^*(\hat\rho) \big) \Big\rangle \qquad [22]$$

by taking the variation of eqn [12], we get

$$\mathcal{D}(\rho) + \mathcal{D}^*(\rho) = \nabla \cdot \Big( \chi(\rho) \nabla \frac{\delta S}{\delta \rho} \Big) \qquad [23]$$

This relation can be verified explicitly for the
nonequilibrium zero-range process which we discuss
later and holds for several other models. It is also
easy to check that the linearization of [23] around
the stationary profile $\bar\rho$ yields a fluctuation dissipation
relationship which reduces to the usual one in
equilibrium.

The fluctuation dissipation relation [23] can be used
to obtain the adjoint hydrodynamics from $\mathcal{D}(\rho)$ and
$\delta S / \delta \rho$; the first is usually known and the second can be
calculated from the Hamilton-Jacobi equation.


H Theorem 


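We show that the functional $S$ is decreasing along
the solutions of both the hydrodynamic equation [1]
and the adjoint hydrodynamics

$$\partial_t \rho = \mathcal{D}^*(\rho) = \nabla \cdot \Big( \chi(\rho) \nabla \frac{\delta S}{\delta \rho} \Big) - \mathcal{D}(\rho) \qquad [24]$$

Let $\rho(t)$ be a solution of [1] or [24]; by using the
Hamilton-Jacobi equation [20], we get

$$\frac{d}{dt} S(\rho(t)) = -\frac{1}{2} \Big\langle \nabla \frac{\delta S}{\delta \rho}(\rho(t)), \; \chi(\rho(t)) \nabla \frac{\delta S}{\delta \rho}(\rho(t)) \Big\rangle \le 0 \qquad [25]$$

In particular, we have that $(d/dt) S(\rho(t)) = 0$ if and
only if $(\delta S / \delta \rho)(\rho(t)) = 0$.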
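
The monotonicity can be spelled out in two lines. The sketch below assumes the quadratic Hamilton-Jacobi equation $\tfrac12 \langle \nabla \tfrac{\delta S}{\delta \rho}, \chi \nabla \tfrac{\delta S}{\delta \rho} \rangle + \langle \tfrac{\delta S}{\delta \rho}, \mathcal{D}(\rho) \rangle = 0$, the adjoint hydrodynamics in the form $\mathcal{D}^*(\rho) = \nabla \cdot (\chi \nabla \tfrac{\delta S}{\delta \rho}) - \mathcal{D}(\rho)$, a positive matrix $\chi$, and that $\delta S / \delta \rho$ vanishes at the boundary so that integrations by parts produce no boundary terms.

```latex
\begin{align*}
\text{along } \partial_t \rho = \mathcal{D}(\rho): \quad
\frac{d}{dt} S(\rho(t))
 &= \Big\langle \frac{\delta S}{\delta \rho}, \mathcal{D}(\rho) \Big\rangle
  = -\frac12 \Big\langle \nabla \frac{\delta S}{\delta \rho},
      \chi(\rho) \nabla \frac{\delta S}{\delta \rho} \Big\rangle \le 0 , \\
\text{along } \partial_t \rho = \mathcal{D}^*(\rho): \quad
\frac{d}{dt} S(\rho(t))
 &= \Big\langle \frac{\delta S}{\delta \rho},
      \nabla \cdot \Big( \chi \nabla \frac{\delta S}{\delta \rho} \Big) \Big\rangle
    - \Big\langle \frac{\delta S}{\delta \rho}, \mathcal{D}(\rho) \Big\rangle \\
 &= -\Big\langle \nabla \frac{\delta S}{\delta \rho}, \chi \nabla \frac{\delta S}{\delta \rho} \Big\rangle
    + \frac12 \Big\langle \nabla \frac{\delta S}{\delta \rho}, \chi \nabla \frac{\delta S}{\delta \rho} \Big\rangle
  = -\frac12 \Big\langle \nabla \frac{\delta S}{\delta \rho},
      \chi \nabla \frac{\delta S}{\delta \rho} \Big\rangle \le 0 .
\end{align*}
```

Both evolutions therefore dissipate $S$ at the same rate, which is why $S$ can serve as a Lyapunov function for either of them.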

We remark that the right-hand side of [25] 
vanishes in the stationary state, that is, there is no 
internal entropy production due to the evolution. 
On the other hand, there is a steady entropy 
production due to the differences in the chemical 
potentials of the reservoirs. This is not discussed in 
this article. 


Decomposition of Hydrodynamics 


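There is a structural property of hydrodynamics
which follows from the HJE. The hydrodynamic
equation can be decomposed as the sum of a
gradient vector field and a vector field $\mathcal{A}$ orthogonal
to it in the metric induced by the operator $K^{-1}$,
where $K f = -\nabla \cdot (\chi(\rho) \nabla f)$, namely

$$\mathcal{D}(\rho) = \frac{1}{2} \nabla \cdot \Big( \chi(\rho) \nabla \frac{\delta S}{\delta \rho} \Big) + \mathcal{A}(\rho) \qquad [26]$$

with

$$\Big\langle K \frac{\delta S}{\delta \rho}, K^{-1} \mathcal{A}(\rho) \Big\rangle = \Big\langle \frac{\delta S}{\delta \rho}, \mathcal{A}(\rho) \Big\rangle = 0$$

Similarly, using the fluctuation dissipation relationship
[23] for the adjoint hydrodynamics, we
have

$$\mathcal{D}^*(\rho) = \frac{1}{2} \nabla \cdot \Big( \chi(\rho) \nabla \frac{\delta S}{\delta \rho} \Big) - \mathcal{A}(\rho) \qquad [27]$$

Since $\mathcal{A}$ is orthogonal to $\delta S / \delta \rho$, it does not contribute
to the entropy production. The vector field $\mathcal{A}$ is odd
under time reversal like a magnetic force.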

Both terms of the decomposition vanish in the
stationary state, that is, when $\rho = \bar\rho$. Whereas in
equilibrium the hydrodynamics is the gradient flow of
the entropy $S$, the term $\mathcal{A}(\rho)$ is characteristic of
nonequilibrium states. Note that, for small fluctuations
$\rho \approx \bar\rho$ and small differences in the chemical potentials at
the boundaries, $\mathcal{A}(\rho)$ becomes a second-order quantity
and Onsager theory is a consistent approximation.

Equation [26] is interesting because it separates
the dissipative part of the hydrodynamic evolution
associated to the thermodynamic force $\delta S / \delta \rho$ and
therefore provides important physical information.
Notice that the thermodynamic force $\delta S / \delta \rho$
appears linearly in the hydrodynamic equation
even when this is nonlinear in the macroscopic
variables.

In general, the two terms of the decomposition
[26] are nonlocal in space even if $\mathcal{D}$ is a local
function of $\rho$. This is the case for the simple
exclusion process discussed later. Furthermore,
while the form of the hydrodynamic equation does
not depend explicitly on the chemical potentials,
$\delta S / \delta \rho$ and $\mathcal{A}$ do.

To understand how the decomposition [26] arises 
microscopically, let us consider a stochastic lattice 
gas. Let 


$$L = \frac{1}{2}(L + L^*) + \frac{1}{2}(L - L^*) \qquad [28]$$


be its Markov generator, where L* is the adjoint of
L with respect to the invariant measure, namely the
generator of the time-reversed microscopic
dynamics. The term L − L* behaves like a Liouville
operator, that is, it is anti-Hermitian and, in the
scaling limit, produces the term 𝒜 in the hydrodynamic
equation. This can be verified explicitly in
the boundary-driven zero-range model introduced in
the next section.

Since the adjoint generator can be written as
L* = (L + L*)/2 − (L − L*)/2, the adjoint hydrodynamics
must be of the form [27]. In particular, if
the microscopic generator is self-adjoint, we get 𝒜 = 0
and thus 𝒟(ρ) = 𝒟*(ρ). On the other hand, it may
happen that microscopic nonreversible processes,
namely those for which L ≠ L*, can produce macroscopic
reversible hydrodynamics if L − L* does not contribute
to the hydrodynamic limit.

The decompositions [26] and [27] are reminiscent of
electrical conduction in the presence of a magnetic
field. Consider the motion of electrons in a
conductor: a simple model is given by the effective
equation

$$\dot{p} = -e\Big(E + \frac{1}{mc}\,p\wedge H\Big) - \frac{1}{\tau}\,p \qquad [29]$$

where p is the momentum, e the electron charge, E
the electric field, H the magnetic field, m the mass,
c the velocity of light, and τ the relaxation time.
The dissipative term p/τ is orthogonal to the
Lorentz force p ∧ H. We define time reversal as the
transformation p → −p, H → −H. The adjoint evolution
is given by

$$\dot{p} = e\Big(E + \frac{1}{mc}\,p\wedge H\Big) - \frac{1}{\tau}\,p \qquad [30]$$

where the signs of the dissipation and the electromagnetic
force transform in analogy to [26] and
[27].
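
The orthogonality between the dissipative term and the Lorentz force can be checked numerically. The sketch below assumes the effective equation has the form dp/dt = −e(E + p∧H/(mc)) − p/τ; the numerical values and the helper names `cross`, `dot`, and `drude_rhs` are illustrative assumptions. It verifies that the magnetic term drops out of p·(dp/dt), so only the electric field and the relaxation term change |p|².

```python
def cross(a, b):
    """Vector product a ^ b in three dimensions."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def drude_rhs(p, E, H, e=1.0, m=1.0, c=1.0, tau=2.0):
    """Right-hand side -e (E + p ^ H / (m c)) - p / tau (assumed form of the model)."""
    lorentz = cross(p, H)
    return [-e * (E[i] + lorentz[i] / (m * c)) - p[i] / tau for i in range(3)]

p = [1.0, 2.0, 3.0]
E = [0.5, 0.0, 0.0]
H = [0.0, 0.0, 4.0]

lorentz_dot_p = dot(p, cross(p, H))           # vanishes identically
power = dot(p, drude_rhs(p, E, H))            # p . dp/dt
power_without_H = -1.0 * dot(p, E) - dot(p, p) / 2.0
```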

Let us consider in particular the Hall effect where 
we have conduction along a rectangular plate 
immersed in a perpendicular magnetic field H with 
a potential difference across the longer side. The 
magnetic field determines a potential difference 
across the other side of the plate. In our setting on 
the contrary, it is the difference in chemical 
potentials at the boundaries that introduces in the 
equations a “magnetic-like” term. There is therefore 
a kind of equivalence between certain externally 
applied fields and driving the system at the 
boundaries. 


Minimum Dissipation Principle 


In 1931 Onsager formulated, within his near-equilibrium
theory, a variational principle which
shows that the hydrodynamic evolution minimizes
at each instant of time a quadratic functional of ρ̇.
He called this the “minimum dissipation principle.”
We now show that the decomposition of the
previous subsection leads to a natural exact generalization
of this principle. We want to construct a
functional of the variables ρ and ρ̇ such that the
Euler equation associated to the vanishing of the
first variation under arbitrary changes of ρ̇ is the
hydrodynamic equation [1]. We define the “dissipation
function”


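$$F(\rho,\dot\rho) = \big\langle(\dot\rho - \mathcal{A}(\rho)),\, K^{-1}(\dot\rho - \mathcal{A}(\rho))\big\rangle \qquad [31]$$


and the functional


$$\Phi(\rho,\dot\rho) = \dot S(\rho) + F(\rho,\dot\rho) = \Big\langle\frac{\delta S}{\delta\rho},\dot\rho\Big\rangle + \big\langle(\dot\rho - \mathcal{A}(\rho)),\, K^{-1}(\dot\rho - \mathcal{A}(\rho))\big\rangle \qquad [32]$$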


which generalize Onsager’s corresponding definitions
(Onsager 1931a, b). The operator K has been
defined in the previous subsection.

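It is easy to verify that


$$\delta_{\dot\rho}\Phi = 0 \qquad [33]$$


is equivalent to the hydrodynamic equation [1].
Furthermore, a simple calculation gives


$$F\big|_{\dot\rho=\mathcal{D}(\rho)} = \frac{1}{4}\Big\langle\nabla\frac{\delta S}{\delta\rho},\,\chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big\rangle \qquad [34]$$


that is, 2F on the hydrodynamic trajectories equals
the entropy production rate as in Onsager’s near
equilibrium approximation.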
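
The verification can be sketched as follows, assuming that K is symmetric, that boundary terms vanish, that Φ(ρ, ρ̇) = ⟨δS/δρ, ρ̇⟩ + F(ρ, ρ̇) with F the quadratic form in ρ̇ − 𝒜(ρ) built from K⁻¹, and that the hydrodynamic field splits as 𝒟(ρ) = ½∇·(χ(ρ)∇δS/δρ) + 𝒜(ρ) with Kf = −∇·(χ(ρ)∇f). This is a reader's check under those assumptions rather than a derivation quoted from the original.

```latex
\begin{aligned}
\delta_{\dot\rho}\Phi
  &= \frac{\delta S}{\delta\rho} + 2K^{-1}\big(\dot\rho - \mathcal{A}(\rho)\big) = 0
  \;\Longrightarrow\;
  \dot\rho = \mathcal{A}(\rho) - \tfrac12 K\frac{\delta S}{\delta\rho}
           = \tfrac12\nabla\cdot\Big(\chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big) + \mathcal{A}(\rho)
           = \mathcal{D}(\rho), \\[4pt]
F\big|_{\dot\rho=\mathcal{D}(\rho)}
  &= \Big\langle \tfrac12 K\frac{\delta S}{\delta\rho},\; K^{-1}\,\tfrac12 K\frac{\delta S}{\delta\rho}\Big\rangle
   = \tfrac14\Big\langle \frac{\delta S}{\delta\rho},\, K\frac{\delta S}{\delta\rho}\Big\rangle
   = \tfrac14\Big\langle \nabla\frac{\delta S}{\delta\rho},\, \chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big\rangle .
\end{aligned}
```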


The dissipation function for the adjoint hydro- 
dynamics is obtained by changing the sign of A 
in [31]. 


Entropy and Optimal Control 


There is an interesting interpretation of the entropy 
as a minimal cost to produce a fluctuation by 
externally acting on the system. The idea is to show 
that there exists a cost function which on the optimal 
control trajectory coincides with the entropy differ- 
ence with respect to the stationary state. 

We add an external perturbation v to the 
hydrodynamic equation 


$$\partial_t\rho = \frac{1}{2}\nabla\cdot\big(D(\rho)\nabla\rho\big) + v = \mathcal{D}(\rho) + v \qquad [35]$$


We want to choose v so as to drive, with minimal
cost, the system from its stationary state ρ̄ to an
arbitrary state ρ. A simple cost function is

$$\frac{1}{2}\int_{t_1}^{t_2} ds\,\big\langle v(s),\, K^{-1}(\rho(s))\,v(s)\big\rangle \qquad [36]$$

where ρ(s) is the solution of [35] and we recall that
K(ρ)f = −∇·(χ(ρ)∇f). More precisely, given
ρ(t₁) = ρ̄, we want to drive the system to ρ(t₂) = ρ
by an external field v which minimizes [36]. This is
a standard problem in control theory. Let


$$V(\rho) = \inf\,\frac{1}{2}\int_{t_1}^{t_2} ds\,\big\langle v(s),\, K^{-1}(\rho(s))\,v(s)\big\rangle \qquad [37]$$

where the infimum is taken with respect to all fields
v which drive the system to ρ in an arbitrary time
interval [t₁, t₂]. The optimal field v can be obtained
by solving the Bellman equation which reads


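$$\min_{v}\Big\{\frac{1}{2}\big\langle v,\, K^{-1}(\rho)\,v\big\rangle - \Big\langle\mathcal{D}(\rho) + v,\,\frac{\delta V}{\delta\rho}\Big\rangle\Big\} = 0 \qquad [38]$$

It is easy to express the optimal v in terms of V; we
get

$$v = K\frac{\delta V}{\delta\rho} \qquad [39]$$

Hence, [38] now becomes

$$\frac{1}{2}\Big\langle\frac{\delta V}{\delta\rho},\, K\frac{\delta V}{\delta\rho}\Big\rangle + \Big\langle\mathcal{D}(\rho),\,\frac{\delta V}{\delta\rho}\Big\rangle = 0 \qquad [40]$$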


By identifying the cost functional V(p) with S(p), eqn 
[40] coincides with the Hamilton-Jacobi equation [20]. 

By inserting the optimal v [39] in [35] and 
identifying V with S, we get that the optimal 
trajectory p(t) solves the time-reversed adjoint 
hydrodynamics, namely 


$$\partial_t\rho = -\mathcal{D}^*(\rho) \qquad [41]$$




The trajectory of the spontaneous emergence of a 
fluctuation coincides therefore with the trajectory of 
minimal cost for the optimal control. The optimal 
field v does not depend on the nondissipative part A 
of the hydrodynamics. 
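
The last two statements can be made explicit with a short computation. It assumes that the optimal field is the minimizer v = K δS/δρ of the Bellman equation (with V identified with S), that Kf = −∇·(χ(ρ)∇f), and that 𝒟 and 𝒟* differ only in the sign of 𝒜; these are taken from the surrounding discussion rather than proved here.

```latex
\begin{aligned}
v &= K\frac{\delta S}{\delta\rho}
   = -\nabla\cdot\Big(\chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big)
   \qquad\text{(no dependence on } \mathcal{A}\text{)}, \\[4pt]
\partial_t\rho &= \mathcal{D}(\rho) + v
   = \tfrac12\nabla\cdot\Big(\chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big) + \mathcal{A}(\rho)
     - \nabla\cdot\Big(\chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big) \\
  &= -\Big[\tfrac12\nabla\cdot\Big(\chi(\rho)\nabla\frac{\delta S}{\delta\rho}\Big) - \mathcal{A}(\rho)\Big]
   = -\mathcal{D}^*(\rho).
\end{aligned}
```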


Models 


The general theory will now be illustrated by briefly 
describing models where it has been successfully 
applied. We consider examples of different nature in 
order to emphasize the generality and flexibility of 
the point of view developed in the previous section. 

We have chosen three examples in which the 
theory is used in different ways. The first one, the 
zero-range process, can be solved in a simple way so 
that the theory can be verified in detail. In the second 
one, the symmetric simple exclusion, we derive from 
the HJE a nonlinear ordinary differential equation 
first obtained by Derrida, Lebowitz, and Speer 
through a direct rather complex calculation. This 
equation implies the nonlocality of the entropy in the 
SNS of this model. The third model, the Kawasaki- 
Glauber dynamics, provides the illustration of two 
aspects. Nonlocality of the entropy, that is, long- 
range correlations, can appear in isolated equilibrium 
states if the microscopic dynamics is not time-reversal 
invariant. This means that long-range correlations as 
a signature of time-reversal violation are not 
restricted to SNSs. The second aspect to be under- 
lined is the effectiveness of the HJE in a more 
complex case: in fact in this model, the number of 
particles is not conserved which leads to a very 
complicated structure of the HJE. 

As a general comment, we emphasize that 
dynamics microscopically different but leading to 
the same macroscopic description, in particular the 
same hydrodynamics and large deviation functional, 
are indistinguishable for the theory which is purely 
macroscopic. 


Zero Range 


We consider the so-called zero-range process,
which models a nonlinear diffusion of a lattice
gas. The model is described by a non-negative integer
variable η_τ(x) representing the number of particles
at site x and time τ of a finite lattice, which for
simplicity we assume to be one-dimensional. The particles
jump with rates g(η(x)) to one of the nearest-neighbor
sites x + 1, x − 1 with probability 1/2.
The function g(k) is nondecreasing and g(0) = 0.
We assume that our system interacts with two
reservoirs of particles in positions N and −N with
rates p₊ and p₋, respectively. This model can be
solved exactly and the previous theory can be
checked in full detail.

Let us introduce the macroscopic coordinates,
time t = τ/N² and space u = x/N. To describe the
macroscopic dynamics, we introduce the empirical
density

$$\rho_N(t,u) = \frac{1}{N}\sum_{x=-N}^{N}\eta_{N^2t}(x)\,\delta(u - x/N) \qquad [42]$$

where δ(u − x/N) is the Dirac δ. One can prove that in
the limit N → ∞, the empirical density [42] tends in
probability to a continuous function ρ(t), which
satisfies the following hydrodynamic equation:


$$\partial_t\rho = \frac{1}{2}\Delta\phi(\rho) = \mathcal{D}(\rho) \qquad [43]$$


where φ(ρ) can be explicitly defined in terms of the
rates g(η). The boundary conditions for [43] are
φ(ρ(t, ±1)) = p±. The adjoint hydrodynamics is

$$\partial_t\rho = \frac{1}{2}\Delta\phi(\rho) - \alpha\nabla\Big(\frac{\phi(\rho)}{\lambda(u)}\Big) = \mathcal{D}^*(\rho) \qquad [44]$$

with α = (p₊ − p₋)/2 and λ(u) = [p₊(1 + u) + p₋(1 − u)]/2.

The boundary conditions for [44] are the same as
for [43]. The second term on the right-hand side of
[44] is proportional to the difference of the chemical
potentials and produces an inversion of the particle
flux. The action functionals I(ρ̂) and I*(ρ̂) for this
model have been computed and have the form [18]
and [22], respectively, with χ(ρ) = φ(ρ). The entropy
S(ρ) can be easily computed directly from the
expression of the invariant measure which is of
product type and is known explicitly:


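$$S(\rho) = \int_{-1}^{1} du\,\Big[\rho(u)\log\frac{\phi(\rho(u))}{\lambda(u)} - \log\frac{Z(\phi(\rho(u)))}{Z(\lambda(u))}\Big] \qquad [45]$$


where


$$Z(\phi) = 1 + \sum_{k=1}^{\infty}\frac{\phi^{k}}{g(1)\cdots g(k)}$$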


It is easy to verify that it solves the HJE. Due to the 
special zero-range character of the interaction in this 
model, there are no long-range correlations in 
nonequilibrium states. 
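
A minimal numerical sketch for the simplest rate, g(k) = k (independent particles), is given below. It rests on several assumptions that are not spelled out in the text above: that φ(ρ) = ρ and Z(φ) = exp(φ) for this choice of g, that the stationary fugacity profile λ(u) is the linear interpolation between the boundary values p₋ and p₊, and that the entropy then reduces to the integral of ρ log(ρ/λ) − (ρ − λ) over [−1, 1]. The function names are hypothetical.

```python
import math

def stationary_fugacity(u, p_minus, p_plus):
    """Linear profile lambda(u) on [-1, 1] interpolating the boundary values (assumed form)."""
    return 0.5 * (p_plus * (1.0 + u) + p_minus * (1.0 - u))

def entropy_independent_particles(rho, p_minus, p_plus, n=2000):
    """Midpoint-rule value of S(rho) for g(k) = k, where phi(rho) = rho and Z = exp."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * h
        r = rho(u)
        lam = stationary_fugacity(u, p_minus, p_plus)
        total += (r * math.log(r / lam) - (r - lam)) * h
    return total

p_minus, p_plus = 0.5, 2.0
s_stationary = entropy_independent_particles(
    lambda u: stationary_fugacity(u, p_minus, p_plus), p_minus, p_plus)
s_flat = entropy_independent_particles(lambda u: 1.0, p_minus, p_plus)
```

Under these assumptions the functional vanishes on the stationary profile and is strictly positive on any other profile, and it is local: the integrand at u depends only on ρ(u).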


Simple Exclusion


The simple exclusion process is a model of a lattice
gas with an exclusion principle: a particle can move
to a neighboring site, with rate 1/2 for each side,
only if this is empty. We consider again a one-dimensional
case and we denote by η_τ(x) ∈ {0, 1} the
number of particles at the site x at (microscopic)
time τ. The system is in contact with particle
reservoirs at the boundaries ±N, where a particle is
created with rate p± if the boundary site is empty
and is destroyed with rate 1 − p± if it is occupied. In contrast
to the zero-range model, the invariant measure
carries long-range correlations making the entropy
nonlocal.

The hydrodynamic equation for the simple exclusion
process can be derived as for the zero-range
process; in fact, it is easier in this case because a
simple computation leads directly to a closed
equation for the empirical density, which is defined
as in [42] except that the variable η now takes only
the values 0 or 1. We find that the limiting density
evolves according to the linear heat equation


$$\partial_t\rho(t,u) = \frac{1}{2}\Delta\rho(t,u) = \mathcal{D}(\rho) \qquad [46]$$


with boundary conditions


$$\rho(t,\pm 1) = p_\pm$$


In this case, the density of particles ρ takes values
in [0, 1]. We use the HJE to calculate the entropy.
For this model, we have χ(ρ) = ρ(1 − ρ). We show
that the solution of the HJE for S(ρ) (which is a
functional derivative equation) can be reduced to the
solution of an ordinary differential equation.

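The Hamilton-Jacobi equation for the simple
exclusion process is


$$\frac{1}{2}\Big\langle\nabla\frac{\delta S}{\delta\rho},\,\rho(1-\rho)\nabla\frac{\delta S}{\delta\rho}\Big\rangle + \Big\langle\frac{\delta S}{\delta\rho},\,\frac{1}{2}\Delta\rho\Big\rangle = 0 \qquad [47]$$

We look for a solution of the form


$$\frac{\delta S}{\delta\rho(u)} = \log\frac{\rho(u)}{1-\rho(u)} - \phi(u;\rho) \qquad [48]$$


for some functional φ(u; ρ) to be determined satisfying
the boundary conditions


$$\phi(\pm 1) = \log\frac{p_\pm}{1-p_\pm}$$

in the space variable. The first term on the right-hand
side is the derivative of the equilibrium
entropy, that is, for boundary conditions p₋ = p₊.
Inserting [48] into [47], we get (note that
ρ − e^φ/(1 + e^φ) vanishes at the boundary)


$$\begin{aligned}
0 &= \frac12\Big\langle\nabla\Big(\log\frac{\rho}{1-\rho}-\phi\Big),\,\rho(1-\rho)\nabla\Big(\log\frac{\rho}{1-\rho}-\phi\Big)\Big\rangle + \frac12\Big\langle\log\frac{\rho}{1-\rho}-\phi,\,\Delta\rho\Big\rangle \\
  &= -\frac12\langle\nabla\rho,\nabla\phi\rangle + \frac12\big\langle\rho(1-\rho),(\nabla\phi)^2\big\rangle \\
  &= -\frac12\Big\langle\nabla\Big(\rho-\frac{e^{\phi}}{1+e^{\phi}}\Big),\nabla\phi\Big\rangle - \frac12\Big\langle\Big(\rho-\frac{e^{\phi}}{1+e^{\phi}}\Big)\Big(\rho-\frac{1}{1+e^{\phi}}\Big),(\nabla\phi)^2\Big\rangle \\
  &= \frac12\Big\langle\rho-\frac{e^{\phi}}{1+e^{\phi}},\;\Delta\phi-(\nabla\phi)^2\Big(\rho-\frac{1}{1+e^{\phi}}\Big)\Big\rangle
\end{aligned}$$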


We obtain a nontrivial solution of the Hamilton-Jacobi
equation if we solve the following ordinary differential
equation, corresponding to the vanishing of the right
side of the scalar product, which relates the
functional φ(u) = φ(u; ρ) to ρ:


$$\frac{\Delta\phi(u)}{[\nabla\phi(u)]^{2}} + \frac{1}{1+e^{\phi(u)}} = \rho(u),\quad u\in(-1,1);\qquad \phi(\pm 1) = \log\frac{p_\pm}{1-p_\pm} \qquad [49]$$


It is clear that φ is a nonlocal functional of ρ. A
computation shows that the derivative of the
functional


$$S(\rho) = \int du\,\Big\{\rho\log\rho + (1-\rho)\log(1-\rho) + (1-\rho)\phi - \log\big(1+e^{\phi}\big) + \log\frac{\nabla\phi}{\nabla\bar\rho}\Big\}$$


is given by [48] when φ(u; ρ) solves [49].
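
As a consistency check, the stationary case can be tested numerically. The sketch below assumes that the ordinary differential equation reads Δφ/(∇φ)² + 1/(1 + e^φ) = ρ and that the stationary density is the linear profile interpolating the boundary values (written `p_minus`, `p_plus` here; all names are hypothetical). For that profile the candidate solution is φ = log(ρ̄/(1 − ρ̄)), which makes δS/δρ vanish at ρ = ρ̄. Derivatives are taken by central finite differences.

```python
import math

def logit(x):
    """log(x / (1 - x)) for 0 < x < 1."""
    return math.log(x / (1.0 - x))

def ode_residual(u, p_minus, p_plus, h=1e-3):
    """Residual of phi''/(phi')^2 + 1/(1+e^phi) - rho at the point u for rho = linear profile."""
    rho_bar = lambda s: 0.5 * (p_plus * (1.0 + s) + p_minus * (1.0 - s))
    phi = lambda s: logit(rho_bar(s))
    d1 = (phi(u + h) - phi(u - h)) / (2.0 * h)               # first derivative
    d2 = (phi(u + h) - 2.0 * phi(u) + phi(u - h)) / (h * h)  # second derivative
    return d2 / d1 ** 2 + 1.0 / (1.0 + math.exp(phi(u))) - rho_bar(u)

residuals = [ode_residual(u, 0.2, 0.8) for u in (-0.5, 0.0, 0.5)]
```

The residuals are zero up to discretization error, as expected from the identity φ″/(φ′)² = 2ρ̄ − 1 for a linear ρ̄.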


Kawasaki-Glauber Dynamics 


The model consists of particles on a lattice evolving 
according to two basic dynamical processes: 


1. a particle can move to a neighboring site if this is
empty as in the simple exclusion and 

2. a particle can disappear in an occupied site or be 
created if this is empty, the rate depending on the 
nearby configuration. 


The first process is conservative while the second is 
not. 

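As before the object of our study is the empirical
density [42]. It is possible to show that as N goes to
infinity, ρ(t, u) is a solution of


$$\partial_t\rho = \frac{1}{2}\Delta\rho + B(\rho) - D(\rho) \qquad [50]$$

with

$$B(\rho) = E_{\nu_\rho}\big(c(\eta)(1-\eta(0))\big) \qquad [51]$$

$$D(\rho) = E_{\nu_\rho}\big(c(\eta)\,\eta(0)\big) \qquad [52]$$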
where ν_ρ is the Bernoulli product distribution with
parameter ρ. Typically, B(ρ) and D(ρ) are polynomials
in ρ. For this model we consider equilibrium
states so that we can take periodic boundary
conditions. An equilibrium state corresponds to a
density ρ which is a solution of the equation
B(ρ) = D(ρ) and gives a minimum of the potential
V(ρ) = ∫^ρ [D(ρ′) − B(ρ′)] dρ′. We admit potentials
with several minima. The Hamiltonian associated
to the large deviation functional for this model is not
quadratic:


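$$\mathcal{H}(\rho,H) = \int du\,\Big\{\frac{1}{2}H\Delta\rho + \frac{1}{2}(\nabla H)^{2}\rho(1-\rho) - B(\rho)\big(1-e^{H}\big) - D(\rho)\big(1-e^{-H}\big)\Big\} \qquad [53]$$


where H has the role of the conjugate momentum.
The Hamilton-Jacobi equation


$$\mathcal{H}\Big(\rho,\frac{\delta S}{\delta\rho}\Big) = 0 \qquad [54]$$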


is therefore very complicated but can be solved by
successive approximations using as an expansion
parameter ρ − ρ̄, where ρ̄ is a solution of B(ρ̄) = D(ρ̄),
that is, a stationary solution of hydrodynamics. For
ρ = ρ̄, we have δS/δρ = 0. We are looking for an
approximate solution of [54] of the form


$$S(\rho) = \frac{1}{2}\int du\int dv\,(\rho(u)-\bar\rho)\,k(u,v)\,(\rho(v)-\bar\rho) + o\big((\rho-\bar\rho)^{2}\big) \qquad [55]$$


The kernel k(u, v) is the inverse of the density
correlation function c(u, v):


$$\int c(u,y)\,k(y,v)\,dy = \delta(u-v) \qquad [56]$$


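By inserting [55] in [54], one can show that k(u, v)
satisfies the following equation:


$$\frac{1}{2}\bar\rho(1-\bar\rho)\Delta_u k(u,v) - \frac{1}{2}\Delta_u\delta(u-v) - b_0\,k(u,v) + (d_1-b_1)\,\delta(u-v) = 0 \qquad [57]$$


where


$$b_1 = B'(\rho)\big|_{\rho=\bar\rho},\qquad d_1 = D'(\rho)\big|_{\rho=\bar\rho}$$


and


$$b_0 = B(\bar\rho) = D(\bar\rho) = d_0 \qquad [58]$$


If the entropy is a local functional of the density,
k(u, v) must be of the form k(u, v) = f(ρ̄)δ(u − v),
which inserted in [57] gives

$$f(\bar\rho) = [\bar\rho(1-\bar\rho)]^{-1} \qquad [59]$$

and

$$b_0\,[\bar\rho(1-\bar\rho)]^{-1} - (d_1-b_1) = 0 \qquad [60]$$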


Therefore if b₀, b₁, d₁ do not satisfy the last
equation, the entropy cannot be a local functional
of the density. It can be shown that in this case time- 
reversal invariance is violated and the adjoint 
hydrodynamics is different from [50]. This calcula- 
tion supports the conjecture that macroscopic 
correlations are a generic feature of equilibrium 
states of nonreversible lattice gases. 
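
The locality condition is easy to test for concrete rates. The sketch below assumes the condition has the form b₀/[ρ̄(1 − ρ̄)] − (d₁ − b₁) = 0, with b₀ = B(ρ̄), b₁ = B′(ρ̄) and d₁ = D′(ρ̄). The two rate choices are hypothetical illustrations: site-independent creation and annihilation, and a creation rate enhanced by an occupied neighbor, for which the product-measure averages give B(ρ) = a(1 − ρ)(1 + ρ) and D(ρ) = dρ.

```python
def locality_residual(b0, b1, d1, rho_bar):
    """Left-hand side of the assumed locality condition b0/(rho(1-rho)) - (d1 - b1)."""
    return b0 / (rho_bar * (1.0 - rho_bar)) - (d1 - b1)

# Case 1 (hypothetical): creation at rate a on an empty site, annihilation at rate d.
# B(rho) = a (1 - rho), D(rho) = d rho, stationary density a / (a + d).
a, d = 0.7, 1.9
rho_1 = a / (a + d)
res_constant_rates = locality_residual(a * (1.0 - rho_1), -a, d, rho_1)

# Case 2 (hypothetical): creation rate a (1 + eta(1)) on an empty site, annihilation rate d.
# B(rho) = a (1 - rho)(1 + rho), D(rho) = d rho; with a = 1, d = 1.5 one gets rho_bar = 0.5.
a2, d2 = 1.0, 1.5
rho_2 = 0.5
b0 = a2 * (1.0 - rho_2) * (1.0 + rho_2)   # equals d2 * rho_2 = 0.75
b1 = -2.0 * a2 * rho_2                    # derivative of a (1 - rho^2)
res_neighbor_rates = locality_residual(b0, b1, d2, rho_2)
```

The first residual vanishes, consistent with a product invariant measure; the second does not, so under the stated assumptions the entropy of that model cannot be local.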


See also: Interacting Particle Systems and 
Hydrodynamic Equations; Interacting Stochastic Particle 
Systems; Nonequilibrium Statistical Mechanics 
(Stationary): Overview; Quantum Central-Limit Theorems. 


Introduction


Nuclear magnetic resonance (NMR) is a subtle 
quantum-mechanical phenomenon that, through 
magnetic resonance imaging (MRI), has played a 
major role in the revolution in medical imaging over 
the last 30 years. Before being conceived for use in 
imaging, NMR was employed by chemists to do 
spectroscopy, and it remains a very important tech- 
nique for determining the structure of complex 
chemical compounds like proteins. In this article we 
explain how NMR is used to create an image of a 
three-dimensional object. Scant attention is paid to 
both NMR spectroscopy, and the quantum descrip- 
tion of NMR. Those seeking a more complete 
introduction to these subjects should consult the 
article Nuclear Magnetic Resonance in this Encyclo- 
pedia, as well as the monographs of Abragam (1983) 
or Ernst et al. (1987), for spectroscopy, and that of 
Callaghan (1993) for imaging. All three books 
consider the quantum-mechanical description of 
these phenomena. Comprehensive discussions of 
MRI can be found in Bernstein et al. (2004) and 
Haacke et al. (1999), and a historical appreciation of 
the development of MRI is given in Wehrli (1995). 


The Bloch Equation 


We begin with the Bloch phenomenological equa- 
tion, which provides a model for the interactions 
between applied magnetic fields and the nuclear 
spins in the objects under consideration. This is a 
macroscopic averaged model that describes the 
interaction of aggregates of spins, called isochro- 
mats, with applied magnetic fields. An isochromat is 
a collection of “like” spins, which is spatially large 
on the atomic scale, but very small on the scale of 
the variations present in the applied magnetic fields. 
Spins are alike if they belong to the same species and 
are in the same chemical environment. There may be 
several different classes of spins, but, in this article, 
it is assumed that they are noninteracting and so it 
suffices to consider each separately. Henceforth, we
suppose that there is a single class of like spins. The
distribution of isochromats for these spins is
described macroscopically by the spin density
function, which we denote by ρ(x, y, z). In most
medical applications, one is imaging the distribution 
of spins arising from hydrogen protons in water 
molecules. 

The state of the isochromat at spatial location 
(x, y, z) is given by a 3-vector: 


M(x, y, z) = (m1 (x, y, z), m2(x, y, z), m3(x,y,z)) 


which is interpreted as the magnetic moment per 
unit volume. It is an ensemble mean of the quantum 
dipoles caused by the spins within the isochromat. In 
most applications of NMR to imaging, the applied 
magnetic field is described as the sum of a large,
time-independent field, B₀(x, y, z), and smaller
time-dependent fields, B′(x, y, z; t). In the presence of a
static field, thermal fluctuations cause the nuclear
spins to slightly prefer an orientation aligned with
the field. Using the Boltzmann distribution, one
obtains that the nuclear paramagnetic susceptibility
of water protons is given by


$$\chi = \frac{\gamma^{2}\hbar^{2}}{4k_{B}T} \qquad [1]$$

Here ℏ is Planck's constant, k_B the Boltzmann
constant, and T the absolute temperature (see Levitt
(2001)). The constant γ is called the gyromagnetic
(or magnetogyric) ratio. For a proton,


$$\gamma \approx 2\pi\times 42.5764\times 10^{6}\ \mathrm{rad\,s^{-1}\,T^{-1}} \qquad [2]$$


For water molecules at 
x = 3.6 x 107”. 

If the sample is held stationary in the field Bọ for a 
sufficiently long time, then the spins become 
polarized and a bulk magnetic moment appears; 
this is called the equilibrium magnetization: 


room temperature, 


$$M_0(x,y,z) = \chi\,\rho(x,y,z)\,B_0(x,y,z) \qquad [3]$$


The Bloch equation describes the evolution of M
under the influence of the applied field B = B₀ + B′:


$$\frac{dM(x,y,z;t)}{dt} = \gamma M(x,y,z;t)\times B(x,y,z;t) - \frac{1}{T_2}M^{\perp}(x,y,z;t) + \frac{1}{T_1}\big(M_0(x,y,z) - M^{\parallel}(x,y,z;t)\big) \qquad [4]$$


Here × is the vector cross-product, M^⊥(x, y, z; t)
the component of M(x, y, z; t) perpendicular to
B₀(x, y, z) (called the transverse component), and
M^∥ the component of M parallel to B₀ (called the
longitudinal component). For hydrogen protons in
other molecules, the gyromagnetic ratio is expressed
in the form (1 − σ)γ. The coefficient σ is called the
nuclear shielding; it is typically between −10⁻⁴ and
+10⁻⁴. The difference in the nuclear shielding causes
a shift in the resonance frequency by yo. 

The second and third terms in eqn [4] are
relaxation terms. They provide a phenomenological
model for the averaged interactions of the spins
with one another and their environment. The
coefficient 1/T₁(x, y, z) is the spin-lattice relaxation
rate; it describes the rate at which the magnetization
returns to equilibrium. The coefficient
1/T₂(x, y, z) is the spin-spin relaxation rate; it
describes the rate at which the transverse components
of M decay. The physical processes causing
these relaxation phenomena are different and so
are the rates themselves, with T₂ less than T₁. The
relaxation rates largely depend on the localized
thermal fluctuations of the molecules and provide
a useful contrast mechanism in MR imaging.
Spin-spin relaxation occurs very rapidly in solids
(≪ 1 ms) and, therefore, we usually assume that we
are imaging liquid-like materials such as water
protons in soft mammalian tissues. In this case, T₂
takes values in the 40 ms to 4 s range. Notice that
this model does not include any explicit interaction
between isochromats at different spatial
locations. A variety of such interactions exist,
but, at least in liquid-like materials, they lead only
to small corrections in the Bloch equation model.
A derivation of the Bloch equation from the
Schrödinger equation can be found in Abragam
(1983) and Slichter (1990). For coupled systems,
the Bloch equation formalism breaks down and a
full quantum-mechanical treatment is necessary
(see Nuclear Magnetic Resonance and Ernst et al.
(1983)).

Much of the analysis in NMR imaging amounts to
understanding the behavior of solutions to eqn [4]
with different choices of B. We now consider some
important special cases. The simplest case occurs if
B has no time-dependent component; then this
equation predicts that the sample becomes polarized
with the transverse part of M decaying as e^{−t/T₂},
and the longitudinal component approaching the
equilibrium magnetization, M₀, as 1 − e^{−t/T₁}. To
simplify the subsequent discussion, we assume that
the field B₀ is homogeneous with B₀ = (0, 0, b₀). If
B = B₀ and we omit the relaxation terms (set
T₁ = T₂ = ∞ in [4]), then an initial magnetization
M(x, y, z; 0) simply precesses about B₀ at angular
frequency ω₀ = γb₀: M(x, y, z; t) = U(t)M(x, y, z; 0),
with


$$U(t) = \begin{pmatrix}\cos\omega_0 t & -\sin\omega_0 t & 0\\ \sin\omega_0 t & \cos\omega_0 t & 0\\ 0 & 0 & 1\end{pmatrix} \qquad [5]$$


The frequency ω₀ is called the Larmor frequency;
this precession of M about the axis of B₀ is the
resonance phenomenon referred to as NMR. In
typical medical imaging systems, b₀ is between 1
and 3 T and the corresponding resonance frequency
is between 40 and 120 MHz.
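
The quoted range follows directly from ω₀ = γb₀. A small sketch, assuming the proton value γ ≈ 2π × 42.5764 × 10⁶ rad s⁻¹ T⁻¹ given in eqn [2] (the function name is hypothetical):

```python
import math

# Proton gyromagnetic ratio in rad s^-1 T^-1, as quoted in eqn [2].
GAMMA_PROTON = 2.0 * math.pi * 42.5764e6

def larmor_frequency_hz(b0_tesla):
    """Resonance frequency omega_0 / (2 pi) = gamma * b0 / (2 pi), in hertz."""
    return GAMMA_PROTON * b0_tesla / (2.0 * math.pi)

frequencies_mhz = {b0: larmor_frequency_hz(b0) / 1e6 for b0 in (1.0, 1.5, 3.0)}
```

With this value of γ, fields of 1, 1.5 and 3 T correspond to roughly 42.6, 63.9 and 127.7 MHz.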

Typically, the field B takes the form


$$B = B_0 + G + B_1 \qquad [6]$$


where G is a gradient field and B₁ is a radio-frequency
(RF) field. Usually, the gradient fields are
“piecewise time-independent” fields, small relative
to B₀. By piecewise time-independent field, we
mean a collection of static fields that, in the course
of the experiment, are turned on and off. The B₁
component is a time-dependent RF field, nominally
at right angles to B₀. It is usually taken to be
spatially homogeneous, with time dependence of
the form


$$B_1(t) = U(t)\begin{pmatrix}\alpha(t)\\ \beta(t)\\ 0\end{pmatrix} \qquad [7]$$


The functions α and β define an envelope that
modulates the time-harmonic field, [cos ω₀t,
sin ω₀t, 0]. They are supported in a finite interval
[t₀, t₁], that is, the B₁ field is “turned on” for a finite
period of time. The change in the state of the
magnetization between t₀ and t₁ is called the RF
excitation. It may be spatially dependent.

In light of [5] it is convenient to introduce the
rotating reference frame. We replace M with m,
where m(x, y, z; t) = U(t)⁻¹M(x, y, z; t). It is a classical
result of Larmor that if M satisfies [4], then m
satisfies


$$\frac{dm(x,y,z;t)}{dt} = \gamma m(x,y,z;t)\times B_{\mathrm{eff}}(x,y,z;t) - \frac{1}{T_2}m^{\perp}(x,y,z;t) + \frac{1}{T_1}\big(M_0(x,y,z) - m^{\parallel}(x,y,z;t)\big) \qquad [8]$$

where


$$B_{\mathrm{eff}} = U(t)^{-1}B - \Big(0,\,0,\,\frac{\omega_0}{\gamma}\Big)$$


As G is much smaller than B₀ and quasistatic, it turns
out that one can ignore the components of G
orthogonal to B₀. Indeed, in imaging applications,
one usually assumes that the components of G
depend linearly on (x, y, z) with the z-component
given by ⟨(x, y, z), (g₁, g₂, g₃)⟩. The constant vector
G = (g₁, g₂, g₃) is called the gradient vector. With
B₀ = (0, 0, b₀) and B₁ given by [7], we see that B_eff
can be taken to equal (0, 0, ⟨(x, y, z), G⟩) + (α, β, 0).


In the remainder of this article, we assume that B_eff
takes this form.

If G = 0 and β = 0, then the solution operator for
Bloch's equation, without relaxation terms, is


$$V(t) = \begin{pmatrix}1 & 0 & 0\\ 0 & \cos\theta(t) & \sin\theta(t)\\ 0 & -\sin\theta(t) & \cos\theta(t)\end{pmatrix} \qquad [9]$$

where

$$\theta(t) = \gamma\int_0^{t}\alpha(s)\,ds \qquad [10]$$


This is simply a rotation about the x-axis through
the angle θ(t). If B₁ ≠ 0 for t ∈ [0, τ], then the
magnetization is rotated through the angle θ(τ).
Thus, RF excitation can be used to move the
magnetization out of its equilibrium state. As we
shall soon see, this is crucial for obtaining a
measurable signal. Note that the equilibrium magnetization
is a tiny perturbation of the very large
field B₀ and is, therefore, in practice not directly
measurable. Only the precessional motion of the
transverse components of M produces a measurable
signal. More general B₁ fields, that is, with both α
and β nonzero, have more complicated effects on the
magnetization. In general, the angle between M and
M₀ at the conclusion of the RF excitation is called
the flip angle.
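
For a constant envelope the flip angle is simply γατ. The sketch below assumes θ(t) = γ∫α(s)ds with the proton γ of eqn [2], a rectangular pulse of duration 1 ms (an illustrative choice), and the rotation about the x-axis written in the rotating frame; function names are hypothetical.

```python
import math

GAMMA = 2.0 * math.pi * 42.5764e6  # rad s^-1 T^-1, proton value from eqn [2]

def flip_angle(alpha_tesla, duration_s):
    """theta = gamma * alpha * tau for an envelope alpha held constant during the pulse."""
    return GAMMA * alpha_tesla * duration_s

def rotate_about_x(m, theta):
    """Apply the rotation about the x-axis through theta to m = [m1, m2, m3]."""
    c, s = math.cos(theta), math.sin(theta)
    return [m[0], c * m[1] + s * m[2], -s * m[1] + c * m[2]]

tau = 1e-3                                   # pulse duration in seconds (assumed)
alpha_90 = (math.pi / 2.0) / (GAMMA * tau)   # amplitude giving a 90 degree flip
m_after = rotate_about_x([0.0, 0.0, 1.0], flip_angle(alpha_90, tau))
```

Under these assumptions a 90° flip in 1 ms needs an RF amplitude of about 5.9 µT, many orders of magnitude smaller than b₀, and it turns the longitudinal magnetization entirely into the transverse plane.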

If, on the other hand, B₁ = 0 and G = (0, 0,
l(x, y, z)), where l(·) is a function, then V depends on
(x, y, z), and is given by


$$V(x,y,z;t) = \begin{pmatrix}\cos\gamma l(x,y,z)t & -\sin\gamma l(x,y,z)t & 0\\ \sin\gamma l(x,y,z)t & \cos\gamma l(x,y,z)t & 0\\ 0 & 0 & 1\end{pmatrix} \qquad [11]$$


This is precession about B₀ at an angular
frequency that depends on the local field strength
b₀ + l(x, y, z). If both B₁ and G are simultaneously
nonzero, then, starting from equilibrium, the
solution of the Bloch equation, at the conclusion
of the RF pulse, has a nontrivial spatial dependence.
In other words, the flip angle becomes a
function of the spatial variables. We return to this
in a later section.


A Basic Imaging Experiment 


With these preliminaries, we can describe the basic
measurements in magnetic resonance imaging. When
exposed to B₀, the sample becomes polarized at a
rate determined by T₁. Once the sample is polarized,
a B₁-field, of the form given in [7] (with β = 0), is
turned on for a finite time τ. This is called an RF
excitation. For the purposes of this discussion, we
suppose that the time is chosen so that θ(τ) = 90°, see
eqn [10]. As B₀ and B₁ are spatially homogeneous,
the magnetization vectors within the object remain
parallel throughout the RF excitation. At the conclusion
of the RF excitation, M is orthogonal to B₀.

After the RF is turned off, the vector field
M(x, y, z; t) precesses about B₀, in phase with the
angular velocity ω₀. The transverse component of M
decays exponentially. If we normalize the time so
that t = 0 corresponds to the conclusion of the RF
pulse, then, in the laboratory frame,


$$M(x,y,z;t) = \frac{\chi\omega_0\rho(x,y,z)}{\gamma}\Big[e^{-t/T_2}\cos\omega_0 t,\ e^{-t/T_2}\sin\omega_0 t,\ \big(1-e^{-t/T_1}\big)\Big] \qquad [12]$$
Recall Faraday's law: a changing magnetic field
induces an electromotive force (EMF) in a loop of
wire according to the relation


$$\mathrm{EMF}_{\mathrm{loop}} \propto \frac{d\Phi_{\mathrm{loop}}}{dt} \qquad [13]$$

Here Φ_loop denotes the flux of the field through the
loop of wire (see Introductory Articles: Electromagnetism).
The transverse components of M are a
rapidly varying magnetic field, which, according to
Faraday's law, induce a current in a loop of wire. In
fact, by placing several such loops close to the sample
we can measure a signal of the form


$$S_0(t) = \frac{\chi\omega_0^{2}e^{i\omega_0 t}}{\gamma}\int_{\mathrm{sample}}\rho(x,y,z)\,e^{-t/T_2(x,y,z)}\,b_{1\mathrm{rec}}(x,y,z)\,dx\,dy\,dz \qquad [14]$$


Here b_{1rec}(x, y, z) quantifies the sensitivity of the
detector to the precessing magnetization located at
(x, y, z). From S₀(t) we easily obtain a measurement
of the integral of the function ρb_{1rec}. By using a
carefully designed detector, b_{1rec} can be taken to be
a constant, and therefore we can determine the total
spin density within the object of interest. For the rest
of this article, we assume that b_{1rec} is a constant.
Note that the size of the measured signal is
proportional to ω₀², which is, in turn, proportional
to ‖B₀‖². This explains, in part, why it is so useful to
have a very strong B₀-field. Even with a
1.5 T magnet, however, the measured signal is only in the
microwatt range (see Hoult and Lauterbur (1979)
and Edelstein et al. (2004)).

Suppose that, at the end of the RF excitation, we
turn on the gradient G. As the magnetic field
B = B₀ + G now has a nontrivial spatial dependence,
the precessional frequency of the spins, which equals
γ‖B‖, also has a spatial dependence. In fact,
assuming that T₂ is spatially independent, it follows
from [11] that the measured signal would now be
given by

$$S_G(t) = \frac{\chi\omega_0^{2}e^{i\omega_0 t}e^{-t/T_2}}{\gamma}\int_{\mathrm{sample}}\rho(x,y,z)\,e^{it\gamma\langle(x,y,z),G\rangle}\,dx\,dy\,dz \qquad [15]$$

Up to a constant, e^{−iω₀t} e^{t/T₂} S_G(t) is simply the
Fourier transform of ρ at k = −tγG/2π. By sampling
in time and using a variety of different gradient
vectors, we can sample the three-dimensional Fourier
transform of ρ in a neighborhood of 0. This suffices
to reconstruct an approximation to ρ. In medical
applications, T₂ is spatially dependent, which, as
described later in the section “Contrast and resolution,”
provides a useful contrast mechanism.
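
The reconstruction step can be illustrated in one dimension. The sketch below is a discrete stand-in, not the measurement chain itself: it assumes a density given on eight unit-spaced points, replaces the Fourier integral by a sum, samples it at frequencies k_j = jΔk with Δk equal to the reciprocal of the field of view, and inverts by direct summation. The demodulated signal at time t under a gradient G would supply the sample at k = −tγG/2π; here the samples are computed directly. All names and values are illustrative.

```python
import cmath
import math

def kspace_sample(density, positions, k):
    """Discrete stand-in for the Fourier integral: sum over x of rho(x) exp(-2 pi i k x)."""
    return sum(r * cmath.exp(-2j * math.pi * k * x) for r, x in zip(density, positions))

def reconstruct(samples, ks, positions, dk):
    """Inverse transform by direct summation over the sampled frequencies."""
    image = []
    for x in positions:
        total = sum(s * cmath.exp(2j * math.pi * k * x) for s, k in zip(samples, ks))
        image.append((dk * total).real)
    return image

n = 8                                              # number of samples and of pixels
positions = [float(j) for j in range(-n // 2, n // 2)]
density = [0.0, 1.0, 2.0, 1.0, 0.0, 0.5, 0.0, 0.0]
dk = 1.0 / n                                       # sample spacing = 1 / field of view
ks = [j * dk for j in range(-n // 2, n // 2)]
samples = [kspace_sample(density, positions, k) for k in ks]
recovered = reconstruct(samples, ks, positions, dk)
```

Because the object fits inside the field of view 1/Δk = 8, the eight samples determine the eight density values exactly.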

Imagine that we collect samples of ρ̂(k) on a
rectangular grid


$$\Big\{(j_x\Delta k_x,\ j_y\Delta k_y,\ j_z\Delta k_z):\ -\frac{N_x}{2}\le j_x\le\frac{N_x}{2},\ -\frac{N_y}{2}\le j_y\le\frac{N_y}{2},\ -\frac{N_z}{2}\le j_z\le\frac{N_z}{2}\Big\}$$

Since we are sampling in the Fourier domain, the 
Nyquist sampling theorem implies that the sample 
spacing determines the spatial field of view from which 
we can reconstruct an artifact-free image: in order to 
avoid aliasing artifacts, the support of ρ must lie in a
rectangular region with side lengths [Δk_x⁻¹, Δk_y⁻¹,
Δk_z⁻¹]; see Haacke et al. (1999), Epstein (2003), and
Barrett and Myers (2004). In typical medical applications,
the support of ρ is much larger in one dimension
than the others, and so it turns out to be impractical to 
use the simple data collection technique described 
above. Instead, the RF excitation takes place in the 
presence of nontrivial gradient fields, which allows for 
a spatially selective excitation: the magnetization in 
one region of space obtains a transverse component, 
while that in the complementary region is left in the 
equilibrium state. In this way, we can collect data from 
an essentially two-dimensional slice. This is described 
in the next section. 


Selective Excitation 


As remarked above, practical imaging techniques do
not excite all the spins in an object and directly
measure samples of the three-dimensional Fourier
transform. Rather, the spins lying in a slice are
excited and samples of the two-dimensional Fourier
transform are then measured. This process is called
selective excitation and may be accomplished by
applying the RF excitation with a gradient field
turned on. With this arrangement, the strength of
the static field, B₀ + G, varies with spatial position,
hence the response to the RF excitation does as
well. Suppose that G = (0, 0, ⟨(x, y, z), G⟩) and set
f = [2π]⁻¹γ⟨(x, y, z), G⟩. This is called the offset
frequency, as it is the amount by which the local
resonance frequency differs from the resonance
frequency ω₀ of the B₀-field. The result of a selective
RF excitation is described by a magnetization profile
m^∞(f), which is a unit 3-vector-valued function of
the offset frequency. A typical case would be


$$m^{\infty}(f) = \begin{cases}[0,0,1] & \text{for } f\notin[f_0,f_1]\\ [\sin\theta,0,\cos\theta] & \text{for } f\in[f_0,f_1]\end{cases} \qquad [16]$$


The magnetization is flipped through an angle θ in
regions of space where the offset frequency lies in
the interval [f₀, f₁] and is left in the equilibrium state
otherwise.

Typically, the excitation step takes a few milliseconds
and is much shorter than either T₁ or T₂;
therefore, one generally uses the Bloch equation,
without relaxation, in the discussion of selective
excitation. In the rotating reference frame, the Bloch
equation, without relaxation, takes the form


$$\frac{dm}{dt}(f;t) = \begin{pmatrix}0 & 2\pi f & -\gamma\beta\\ -2\pi f & 0 & \gamma\alpha\\ \gamma\beta & -\gamma\alpha & 0\end{pmatrix} m(f;t) \qquad [17]$$


The problem of designing a selective pulse is
nonlinear. Indeed, the selective excitation problem
can be rephrased as a classical inverse-scattering
problem: one seeks a function $\alpha(t) + i\beta(t)$ with
support in an interval $[t_0, t_1]$ so that, if $m(f;t)$ is
the solution to [17] with $m(f;t_0) = [0,0,1]$, then
$m(f;t_1) = m^{\infty}(f)$. If one restricts attention to flip
angles close to 0, then there is a simple linear model
that can be used to find approximate solutions.

If the flip angle is close to zero, then $m_3 \approx 1$
throughout the excitation. Using this approximation,
we derive the low-flip-angle approximation to
the Bloch equation, without relaxation:


$$\frac{d(m_1 + i m_2)}{dt} = -2\pi i f\,(m_1 + i m_2) + i\gamma(\alpha + i\beta) \qquad [18]$$


From this approximation, we see that


$$\alpha(t) + i\beta(t) \approx \frac{\mathcal{F}(m_1^{\infty} + i m_2^{\infty})(t)}{\gamma i}, \quad \text{where } \mathcal{F}(h)(t) = \int_{-\infty}^{\infty} h(f)\,e^{-2\pi i f t}\,df \qquad [19]$$


Figure 1 A selective 90° pulse and profile designed using the
linear approximation. (a) Profile of a 90° sinc-pulse. (b) The
magnetization profile produced by the pulse in (a).


Figure 2 A selective 90° pulse and profile designed using the
inverse scattering approach. (a) Profile of a 90° inverse-scattering
pulse. (b) The magnetization profile produced by the pulse in (a).


For an example such as in [16], $\theta$ close to zero, and
$f_0 = -f_1$, we obtain


$$\alpha(t) + i\beta(t) \approx \frac{i \sin\theta\,\sin f_1 t}{\pi\gamma t} \qquad [20]$$


A pulse of this sort is called a sinc-pulse. A 
sinc-pulse is shown in Figure 1a, the result of 
applying it in Figure 1b. A more accurate pulse can 
be designed using the Shinnar-Le Roux algorithm 
(see Pauly et al. (1991) and Shinnar and Leigh
(1989)), or the inverse scattering approach (see 
Epstein (2004)). An inverse-scattering 90°-pulse is 
shown in Figure 2a and the response in Figure 2b. 


Spin-Warp Imaging 


In an earlier section we showed how NMR
measurements could be used to measure the three-dimensional
Fourier transform of $\rho$. In this section,
we consider a more practical technique, that of
measuring the two-dimensional Fourier transform of
a "slice" of $\rho$. Applying a selective RF pulse, as
described in the previous section, we can flip the
magnetization in a region of space $z_0 - \Delta z < z < z_0 + \Delta z$,
while leaving it in the equilibrium state
outside a slightly larger region. Observing that a
signal near the resonance frequency is only produced
by isochromats whose magnetization has a nonzero
transverse component, we can now measure samples
of the two-dimensional Fourier transform of the
function


$$\bar{\rho}_{z_0}(x,y) = \frac{1}{2\Delta z}\int_{z_0 - \Delta z}^{z_0 + \Delta z} \rho(x,y,z)\,dz \qquad [21]$$


If $\Delta z$ is sufficiently small then $\bar{\rho}_{z_0}(x,y) \approx \rho(x,y,z_0)$.

In order to be able to use the fast Fourier
transform (FFT) algorithm to do the reconstruction,
it is very useful to sample $\hat{\bar{\rho}}_{z_0}$ on a uniform grid. To
that end, we use the gradient fields as follows: after
the RF excitation we apply a gradient field of the
form $G_{\mathrm{ph}} = (0, 0, -g_2 y + g_1 x)$ for a certain period of
time $T_{\mathrm{ph}}$. This is called a phase encoding gradient.
At the conclusion of the phase encoding gradient,
the transverse components of the magnetization
from the excited spins have the form


$$m_1 + i m_2\,(x,y) \propto e^{-2\pi i \langle (k_x, k_y), (x,y)\rangle}\,\bar{\rho}_{z_0}(x,y) \qquad [22]$$


where $(k_x, k_y) = [2\pi]^{-1}\gamma T_{\mathrm{ph}}(-g_1, g_2)$. At time $T_{\mathrm{ph}}$,
we turn off the y-component of $G_{\mathrm{ph}}$ and reverse the
polarity of the x-component. At this point, we begin
to measure the signal. We get samples of $\hat{\bar{\rho}}_{z_0}(k, k_y)$
where $k$ varies from $-k_{x\,\max}$ to $k_{x\,\max}$. By repeating
this process with the strength of the y-phase
encoding gradient being stepped through a sequence
of uniformly spaced values, $g_2 \in \{n\Delta g_y\}$, and collecting
samples at a uniformly spaced set of times,
we collect the set of samples


$$\left\{ \hat{\bar{\rho}}_{z_0}(m\Delta k_x, n\Delta k_y) : -\frac{N_x}{2} \le m \le \frac{N_x}{2},\ -\frac{N_y}{2} \le n \le \frac{N_y}{2} \right\} \qquad [23]$$


The gradient $G_{\mathrm{fr}} = (0, 0, -g_1 x)$, left “on” during
signal acquisition, is called a frequency encoding 
gradient. While there is no difference, mathemati- 
cally, between the phase encoding and frequency 
encoding steps, there are significant practical differ- 
ences. This approach to sampling is known as spin- 
warp imaging; it was introduced in Edelstein et al. 
(1980). The steps of this experiment are summarized 
in a pulse sequence timing diagram, shown in 
Figure 3. This graphical representation for the 
steps followed in a magnetic resonance imaging 
experiment is ubiquitous in the literature. 

To avoid aliasing artifacts, the sample spacings
$\Delta k_x$ and $\Delta k_y$ must be chosen so that the excited
portion of the sample is contained in a region of size
$\Delta k_x^{-1} \times \Delta k_y^{-1}$. This is called the field of view or
FOV. Since we can only collect the signal for a finite
period of time, the Fourier transform $\hat{\bar{\rho}}_{z_0}(k_x, k_y)$ is
sampled at frequencies lying in a rectangle with
vertices $(\pm k_{x\,\max}, \pm k_{y\,\max})$ where


$$k_{x\,\max} = \frac{N_x \Delta k_x}{2}, \qquad k_{y\,\max} = \frac{N_y \Delta k_y}{2} \qquad [24]$$


Figure 3 Pulse timing diagram for spin-warp imaging. During
the positive lobe of the frequency encoding gradient, the analog-to-digital
converter (ADC) collects samples of the signal
produced by the rotating transverse magnetization.


The maximum frequencies sampled effectively determine
the resolution available in the reconstructed
image. Heuristically, this resolution limit equals half
the shortest measured wavelength:


$$\Delta x \approx \frac{1}{2 k_{x\,\max}} = \frac{\mathrm{FOV}_x}{N_x}, \qquad \Delta y \approx \frac{1}{2 k_{y\,\max}} = \frac{\mathrm{FOV}_y}{N_y} \qquad [25]$$


Whether one can actually resolve objects of this size in 
the reconstructed image depends on other factors such 
as the available contrast and the signal-to-noise ratio 
(SNR). We consider these factors in the final sections. 
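The relations between sample spacing, field of view, maximum sampled frequency, and nominal pixel size in [24] and [25] can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `kspace_grid` and the dictionary keys are illustrative choices, not notation from the text, and the field of view is assumed to equal the reciprocal of the sample spacing in each direction.

```python
def kspace_grid(fov_x, fov_y, n_x, n_y):
    """Sampling parameters for a uniform k-space grid (illustrative helper).

    fov_x, fov_y : field of view in each direction (any length unit)
    n_x, n_y     : number of samples in each direction
    """
    # Sample spacing chosen so that the FOV is exactly 1 / dk.
    dk_x, dk_y = 1.0 / fov_x, 1.0 / fov_y
    # Largest sampled frequencies, as in eqn [24].
    k_x_max, k_y_max = n_x * dk_x / 2.0, n_y * dk_y / 2.0
    # Half the shortest measured wavelength, as in eqn [25].
    dx, dy = 1.0 / (2.0 * k_x_max), 1.0 / (2.0 * k_y_max)
    return {"dk_x": dk_x, "dk_y": dk_y,
            "k_x_max": k_x_max, "k_y_max": k_y_max,
            "dx": dx, "dy": dy}


# Example: a 20 cm field of view sampled on a 128 x 128 grid.
grid = kspace_grid(20.0, 20.0, 128, 128)
print(grid)
```

With these inputs the nominal pixel size comes out as FOV/N in each direction, which is the second equality in [25].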


Signal-to-Noise Ratio 


At a given spatial resolution, image quality is largely 
determined by SNR and the contrast between the 
different materials making up the imaging object. SNR 
in MRI is defined as the voxel signal amplitude divided 
by the noise standard deviation. The noise in the NMR 
signal, in general, is Gaussian distributed with zero 
mean. Ignoring contributions from quantization, for 
example, due to limitations of the analog-to-digital 
converter, the noise voltage of the signal can be 
ascribed to random thermal fluctuations in the receive 
circuit (see Edelstein (1986)). The variance is given by 


$$\sigma^2_{\mathrm{thermal}} = 4 k_B T R\,\Delta\nu \qquad [26]$$


where $k_B$ is Boltzmann’s constant, $T$ the absolute
temperature, $R$ the effective resistance (resulting from
both receive coil, $R_c$, and object, $R_o$), and $\Delta\nu$ the
receive bandwidth. Both $R_c$ and $R_o$ are frequency
dependent, with $R_c \propto \omega^{1/2}$ and $R_o \propto \omega^2$. Their relative
contributions to overall circuit resistance depend in
a complicated manner on coil geometry, and
the imaging object’s shape, size, and conductivity
(see Chen and Hoult (1989)). Hence, at high magnetic
field, and for large objects, as in most medical
applications, the resistance from the object dominates
and the noise scales linearly with frequency. Since the
signal is proportional to $\omega^2$ in MRI, the SNR increases
in proportion to the field strength.

As the reconstructed image is complex valued, it is
customary to display the magnitude rather than the
real component. This, however, has some consequences
for the noise properties. In regions where the
signal is much larger than the noise, the Gaussian
approximation is valid. However, in regions where the
signal is low, rectification causes the noise to assume a
Rayleigh distribution. Mean and standard deviation can
be calculated from the joint probability distribution:


$$P(N_r, N_i) = \frac{1}{2\pi\sigma^2}\,e^{-(N_r^2 + N_i^2)/2\sigma^2} \qquad [27]$$


where $N_r$ and $N_i$ are the noise in the real and
imaginary channels, respectively. When the signal is
large compared to noise, one finds that the variance
$\sigma_m^2 \approx \sigma^2$. In the other extreme of nearly zero signal,
one obtains for the mean:


$$\bar{S} = \sigma\sqrt{\pi/2} \approx 1.253\,\sigma \qquad [28]$$


and, for the variance:


$$\sigma_m^2 = 2\sigma^2(1 - \pi/4) \approx (0.655\,\sigma)^2 \qquad [29]$$
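The zero-signal values in [28] and [29] can be checked numerically. The sketch below evaluates the closed-form Rayleigh mean and variance and compares them with a small Monte Carlo simulation of magnitude noise; the function names, the sample count, and the seed are arbitrary choices made for illustration.

```python
import math
import random


def rayleigh_stats(sigma):
    """Closed-form mean and variance of |N_r + i N_i| at zero signal."""
    mean = sigma * math.sqrt(math.pi / 2.0)            # eqn [28]
    var = 2.0 * sigma ** 2 * (1.0 - math.pi / 4.0)     # eqn [29]
    return mean, var


def simulate_magnitude_noise(sigma, n, seed=0):
    """Sample mean and variance of the magnitude of complex Gaussian noise."""
    rng = random.Random(seed)
    mags = [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]
    mean = sum(mags) / n
    var = sum((m - mean) ** 2 for m in mags) / (n - 1)
    return mean, var


mean, var = rayleigh_stats(1.0)
sim_mean, sim_var = simulate_magnitude_noise(1.0, 200_000)
print("analytic :", mean, var, math.sqrt(var))
print("simulated:", sim_mean, sim_var)
```

The square root of the analytic variance is the factor 0.655 quoted above, so the standard deviation of the magnitude noise in the background is about two thirds of the single-channel value.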


Of particular practical significance is the SNR
dependence on the imaging parameters. The voxel
noise variance is reduced by the total number of
samples collected during the data acquisition process,
that is,


$$\sigma^2_{\mathrm{eff}} = \sigma^2_{\mathrm{thermal}}/N \qquad [30]$$


where $N = N_x N_y$ in a two-dimensional spin-warp
experiment. Incorporating the contributions to
thermal noise variance, other than bandwidth, into
a constant


$$u = 4 k_B T R \qquad [31]$$


we obtain for the noise variance:


$$\sigma^2_{\mathrm{eff}} = \frac{u\,\Delta\nu}{N_x N_y N_{\mathrm{avg}}} \qquad [32]$$


Here $N_{\mathrm{avg}}$ is the number of signal averages collected
at each phase encoding step. We obtain a simple
formula for SNR per voxel of volume $\Delta V$:


$$\mathrm{SNR} = C\rho\,\Delta V \sqrt{\frac{N_x N_y N_{\mathrm{avg}}}{u\,\Delta\nu}} = C\rho\,\Delta x\,\Delta y\,d_z \sqrt{\frac{N_x N_y N_{\mathrm{avg}}}{u\,\Delta\nu}} \qquad [33]$$


Figure 4 /,-weighted sagittal images through the midline of 
the brain: Image (b) has twice the SNR of image (a), showing 
improved conspicuity of small anatomic and low-contrast detail. 
The two images were acquired at 1.5 T field strength using
two-dimensional spin-warp acquisition and identical scan
parameters, except for $N_{\mathrm{avg}}$, which was 1 in (a) and 4 in (b).


where $\Delta x$, $\Delta y$ are defined in [25], $d_z$ is the thickness
of the slab selected by the slice-selective RF pulse,
and $\rho$ denotes the spin density weighted by effects
determined by the (spatially varying) relaxation
times $T_1$ and $T_2$ and the pulse sequence timing
parameters. Figure 4 shows two images of the
human brain obtained from the same anatomic
location but differing in SNR.
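The scaling behaviour in [33] is easy to explore numerically. The following is a minimal sketch, assuming the field of view is held fixed so that the pixel sizes are FOV/N; the function name `relative_snr`, the default constants, and the sample parameter values are illustrative only, and only ratios of the returned values are meaningful.

```python
import math


def relative_snr(n_x, n_y, n_avg, fov_x, fov_y, d_z, bandwidth,
                 u=1.0, c_rho=1.0):
    """SNR per voxel following eqn [33], up to the constants C*rho and u."""
    dx, dy = fov_x / n_x, fov_y / n_y          # pixel sizes at fixed FOV
    return c_rho * dx * dy * d_z * math.sqrt(
        n_x * n_y * n_avg / (u * bandwidth))


# Hypothetical acquisition parameters (units arbitrary but consistent).
base = relative_snr(128, 128, 1, 20.0, 20.0, 0.5, 16.0)
averaged = relative_snr(128, 128, 4, 20.0, 20.0, 0.5, 16.0)
finer = relative_snr(512, 512, 1, 20.0, 20.0, 0.5, 16.0)

print("4 averages vs 1:", averaged / base)
print("512 vs 128 matrix at fixed FOV:", finer / base)
```

Quadrupling the number of averages doubles the SNR, while quadrupling the matrix size in each direction at fixed field of view shrinks the voxel area sixteenfold and gains only a factor of four from the extra samples, for a net fourfold loss.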


Contrast and Resolution 


The single most distinctive feature of MRI is its 
extraordinarily large innate contrast. For two soft 
tissues, it can be on the order of several hundred 
percent. By comparison, contrast in X-ray imaging is 
a consequence of differences in the attenuation 
coefficients for two adjacent structures and is 
typically on the order of a few percent. 

We have seen in the preceding sections that the
physical principles underlying MRI are radically
different from those of X-ray computed tomography,
in that the signal elicited is generated by the
spins themselves in response to an external perturbation.
The contrast between two regions, A and B,
with signals $S_A$ and $S_B$, respectively, is defined as


$$C_{AB} = \frac{S_A - S_B}{S_A} \qquad [34]$$


If the only contrast mechanism were differences in
the proton spin density of various tissues, then
contrast would be on the order of 5-20%. In reality,
it can be several hundred percent. The reason for
this discrepancy is that the MR signal is acquired
under nonequilibrium conditions. At the time of
excitation, the spins have typically not recovered
from the effect of the previous cycle’s RF pulses, nor
is the signal usually detected immediately after its
creation.

Typically, in spin-warp imaging, a spin-echo is
detected as a means to alleviate spin coherence
losses from static field inhomogeneity. A spin-echo
is the result of applying an RF pulse that has the
effect of taking $(m_1, m_2, m_3)$ to $(m_1, -m_2, -m_3)$. As
such a pulse effects a 180° rotation of the $z$-axis, it is
also called a $\pi$-pulse. If, after such a pulse, the spins
continue to evolve in the same environment then,
following a certain period of time, the transverse
components of the magnetization vectors throughout
the sample become aligned. Hence a pulse of
this type is also called a refocusing pulse. The time
when all the transverse components are rephased is
called the echo time, $T_E$.

The spin-echo signal amplitude for an RF pulse
sequence $\pi/2 - \tau - \pi - \tau$, repeated every $T_R$ seconds,
is approximately given by


$$S(t = 2\tau) = \rho\,(1 - e^{-T_R/T_1})\,e^{-T_E/T_2} \qquad [35]$$


This is a good approximation as long as $T_E \ll T_R$
and $T_2 \ll T_R$, in which case the transverse magnetization
decays essentially to zero between successive
pulse sequence cycles. In eqn [35], $\rho$ is voxel spin
density and the echo time $T_E = 2\tau$. Empirically, it
is known that tissues differ in at least one of
the intrinsic quantities, $T_1$, $T_2$, or $\rho$. It, therefore,
suffices to acquire images in such a manner that
contrast is sensitive to one particular parameter. For
example, a "$T_2$-weighted" image would be acquired
with $T_E \sim T_2$ and $T_R \gg T_1$ and, similarly, a
"$T_1$-weighted" image with $T_R < T_1$ and $T_E \ll T_2$,
with $T_1$, $T_2$ representing typical tissue proton relaxation
times. Figure 5 shows two images obtained with
the same scan parameters except for $T_R$ and $T_E$,
illustrating the fundamentally different image contrasts
that are achievable.
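To see how the timing parameters in [35] control contrast, one can evaluate the contrast [34] for two tissues under different choices of $T_R$ and $T_E$. The sketch below is only illustrative: the two sets of tissue parameters are hypothetical round numbers (times in milliseconds, equal spin densities), chosen so that any contrast comes from relaxation alone, and the function names are not from the text.

```python
import math


def spin_echo_signal(rho, t1, t2, t_r, t_e):
    """Spin-echo amplitude from eqn [35]."""
    return rho * (1.0 - math.exp(-t_r / t1)) * math.exp(-t_e / t2)


def contrast(s_a, s_b):
    """Contrast between regions A and B as defined in eqn [34]."""
    return (s_a - s_b) / s_a


# Hypothetical tissues with identical spin density (times in ms).
tissue_a = {"rho": 1.0, "t1": 1000.0, "t2": 100.0}
tissue_b = {"rho": 1.0, "t1": 500.0, "t2": 50.0}


def contrast_for(t_r, t_e):
    s_a = spin_echo_signal(t_r=t_r, t_e=t_e, **tissue_a)
    s_b = spin_echo_signal(t_r=t_r, t_e=t_e, **tissue_b)
    return contrast(s_a, s_b)


c_t1w = contrast_for(t_r=500.0, t_e=10.0)    # short T_R, short T_E
c_t2w = contrast_for(t_r=4000.0, t_e=100.0)  # long T_R, T_E comparable to T_2
print("T1-weighted contrast:", c_t1w)
print("T2-weighted contrast:", c_t2w)
```

With these inputs the two settings give contrasts of opposite sign: the tissue with the shorter $T_1$ is brighter in the first case and the tissue with the longer $T_2$ is brighter in the second, even though the spin densities are equal.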

It is noteworthy that object visibility is not just 
determined by the contrast between adjacent 


(a) (b) 
Figure 5 Dependence of image contrast on pulse sequence 
timing parameters: (a) T;-weighted; (b) proton density-weighted. 


structures but is also a function of the noise. It is,
therefore, useful to define the contrast-to-noise ratio as

$$\mathrm{CNR}_{AB} = \frac{C_{AB}}{\sigma_{\mathrm{eff}}} \qquad [36]$$

where $\sigma_{\mathrm{eff}}$ is the effective standard deviation of the
signal. Finally, it may be useful to reconstruct
parametric images in which the pixel signal values
represent any one of the intrinsic parameters. A
$T_2$-image can be computed from eqn [35], for
example, either analytically from two image data
sets acquired with two different echo times, or from a
series of $T_E$ values, obtained from a Carr-Purcell
spin-echo train, using regression techniques (see Nuclear
Magnetic Resonance and Haacke et al. (1999)).
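Both routes to a $T_2$ estimate mentioned above follow from the exponential echo-time dependence in [35]. The sketch below shows one possible implementation for a single pixel: a two-point formula, and a log-linear least-squares fit over several echo times. The function names and the synthetic, noise-free data are assumptions made for illustration; in practice the fit would be applied pixel by pixel to noisy images.

```python
import math


def t2_two_point(s1, s2, t_e1, t_e2):
    """T2 from two signals at echo times t_e1 < t_e2 (same T_R)."""
    # From eqn [35]: s1 / s2 = exp((t_e2 - t_e1) / T2).
    return (t_e2 - t_e1) / math.log(s1 / s2)


def t2_log_linear_fit(echo_times, signals):
    """T2 from a least-squares line through (T_E, log S)."""
    n = len(echo_times)
    logs = [math.log(s) for s in signals]
    te_mean = sum(echo_times) / n
    log_mean = sum(logs) / n
    slope = (sum((te - te_mean) * (ls - log_mean)
                 for te, ls in zip(echo_times, logs))
             / sum((te - te_mean) ** 2 for te in echo_times))
    return -1.0 / slope   # log S = const - T_E / T2


# Synthetic noise-free echoes for a hypothetical pixel with T2 = 80 ms.
true_t2 = 80.0
echo_times = [20.0, 40.0, 60.0, 80.0]
signals = [0.9 * math.exp(-te / true_t2) for te in echo_times]

t2_a = t2_two_point(signals[0], signals[1], echo_times[0], echo_times[1])
t2_b = t2_log_linear_fit(echo_times, signals)
print(t2_a, t2_b)
```

On noise-free data both estimates reproduce the input value; with noise, the regression over an echo train is generally the more stable of the two.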

We have previously shown that the limiting
resolution is given by $k_{\max}$, the largest spatial
frequency sampled; see [25]. In reality, however,
the actual resolution is always lower. For example,
spin-spin ($T_2$) relaxation causes the signal to decay
during the acquisition. In spin-warp imaging, this
causes the high spatial frequencies to be further
attenuated.

A further consequence of finite sampling is a
ringing or Gibbs artifact that is most prominent at
sharp intensity discontinuities. In practice, these
artifacts are mitigated by applying an appropriate
apodizing filter to the data. Figure 6 shows a portion
of a brain image obtained at two different resolutions.
In Figure 6b, the total k-space area covered
was 16 times larger than for the acquisition of the
image in (a). Artifacts from finite sampling and
blurring of fine detail such as cortical blood vessels
are clearly visible in the low-resolution image. SNR,
according to eqn [33], is reduced in the latter image
by a factor of 4.


(a) (b) 


Figure 6 Effect of k-space coverage on spatial resolution in
axial image of the brain: the field of view in both images was
20 cm and all scan parameters were the same except that (a)
was acquired with $N_x = N_y = 128$ and (b) with $N_x = N_y = 512$.


See also: Nuclear Magnetic Resonance; Stochastic 
Resonance.


The Basic Modeling


Magnetohydrodynamics (MHD) is the study of the 
interaction of (electro-) magnetic fields and con- 
ducting fluids. When a conducting fluid (e.g., a 
liquid metal, a weakly ionized gas, or a plasma) is 
placed within a magnetic field, two coupling 
phenomena appear: the electric currents modify the 
magnetic field, and the Lorentz forces due to the 
magnetic field modify the motion of the fluid. At the 
mathematical level, two sets of equations, very 
different in nature, are involved. The usual descrip- 
tion of the hydrodynamics phenomena is most often 
that provided by the continuum mechanics for 
fluids, while the description of electromagnetic 
phenomena essentially proceeds from the Maxwell 
equations. 

Either category of equations can be expressed in a
variety of models. The coupling between the two
categories might also be accounted for at different
levels of accuracy. For the sake of conciseness in
such an expository survey, it is neither desirable nor
feasible to present all the possible sets of equations
and their possible couplings. The difficulty stems
from the incredibly large spectrum of physical
phenomena where MHD plays a role. A list of
such phenomena includes


• astrophysical and geophysical applications (modeling
of stars in the galactic field, of pulsars, of
solar spots, of the flows in the earth's core, ...),

• advanced “terrestrial” applications such as the
magnetic confinement of plasmas in controlled
fusion, MHD propulsion engines for rockets, and

• industrial applications in the engineering world
(electromagnetic pumping, metal forming, aluminum
electrolysis, and many other metallurgical
applications).


Due to this variety of physical situations, no
unified setting can be presented with a satisfactory
degree of detail. We therefore mostly concentrate
throughout this article on the MHD of conducting
fluids that are homogeneous, incompressible, viscous,
and Newtonian. This is often the case for
liquid metals in many industrial processes. The
equations manipulated will first be given in their
most general form and then immediately adapted to
the above context. For other contexts, the modeling
follows the same pattern, but other variants of the
general equations must be employed. The bibliography
of this article contains such general
information.


The Hydrodynamics Description


The usual description for fluids follows from
continuum mechanics. In this setting, the governing
equation is the equation for the conservation of
momentum


$$\frac{\partial(\rho u)}{\partial t} + \operatorname{div}(\rho\,u \otimes u) - \operatorname{div}\tau = f \qquad [1]$$


where $\rho$ denotes the density of the fluid, $u$ its
velocity, $\tau$ the stress tensor, and $f$ the density of
volumic (or per unit volume) body forces applied to
the fluid. For incompressible viscous Newtonian
fluids, the stress/velocity relation reads


$$\tau = \eta\,(\nabla u + (\nabla u)^T) - p\,\mathrm{Id} \qquad [2]$$

together with the constraint

$$\operatorname{div} u = 0 \qquad [3]$$


on the velocity. Here, $\eta$ denotes the viscosity of the
fluid, $p$ the pressure, and $A^T$ denotes the transpose
matrix of the matrix $A$. A third usual assumption is
that the incompressible fluid is in addition homogeneous,
that is,


$$\rho = \bar{\rho} = \text{constant} \qquad [4]$$


Equations [1]-[4] lead to the equations for
conservation of momentum in the case of an incompressible
homogeneous viscous Newtonian fluid,
that is, the incompressible Navier-Stokes equations


$$\begin{aligned} \rho\,\frac{\partial u}{\partial t} + \rho\,u\cdot\nabla u - \eta\,\Delta u + \nabla p &= f \\ \operatorname{div} u &= 0 \end{aligned} \qquad [5]$$


These equations are supplied with initial and
boundary conditions on the velocity $u$. At initial
time, the velocity is assumed to be known,
$u(t=0,\cdot) = u_0$, on the whole domain $\Omega$ occupied by
the fluid, a domain that is supposed here not to
vary in time (see, nevertheless, the section “The
industrial production of aluminum" for a different
setting). On the other hand, the boundary conditions
on the boundary $\partial\Omega$ of $\Omega$ can be of various forms.
For simplicity, the boundary is supposed regular, so
that its unit outward normal $n_{\partial\Omega}$ can be
unambiguously defined. The standard choice is to
set Dirichlet conditions on the velocity, that is, $u$ is given
on $\partial\Omega$. In the following, we will assume for simplicity that the
boundary condition is the homogeneous Dirichlet
boundary condition $u = 0$, as a superposition of the
nonpenetration condition $u \cdot n_{\partial\Omega} = 0$ and the no-slip
boundary condition $u \times n_{\partial\Omega} = 0$. One can also
impose alternative boundary conditions, for example,
involving the pressure.
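The passage from [1]-[4] to [5] can be spelled out term by term. The following worked step is a sketch that assumes, in addition to constant density, that the viscosity $\eta$ is constant in space, which the text does not state explicitly but which is needed to pull $\eta$ out of the divergence.

```latex
\begin{aligned}
\frac{\partial(\rho u)}{\partial t}
  &= \rho\,\frac{\partial u}{\partial t}
  && \text{by [4]},\\
\operatorname{div}(\rho\,u\otimes u)
  &= \rho\,(u\cdot\nabla)u + \rho\,u\,\operatorname{div}u
   = \rho\,(u\cdot\nabla)u
  && \text{by [3] and [4]},\\
\operatorname{div}\tau
  &= \eta\,\Delta u + \eta\,\nabla(\operatorname{div}u) - \nabla p
   = \eta\,\Delta u - \nabla p
  && \text{by [2] and [3]}.
\end{aligned}
```

Substituting these three identities into [1] gives the first line of [5]; the second line of [5] is the constraint [3] itself.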


The Electromagnetic Description 


Classical electromagnetism is described by the 
Maxwell equations. For the sake of consistency, we 
recall here that these are: 


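The Maxwell-Ampère equation

$$-\frac{\partial D}{\partial t} + \operatorname{curl} H = j \qquad [6]$$


The Maxwell-Coulomb equation

$$\operatorname{div} D = \rho_c \qquad [7]$$


The Maxwell-Faraday equation

$$\frac{\partial B}{\partial t} + \operatorname{curl} E = 0 \qquad [8]$$


The Maxwell-Gauss equation

$$\operatorname{div} B = 0 \qquad [9]$$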

In the above equations, the three-dimensional vector
fields $D$, $B$, $E$, $H$ denote the electric and magnetic
inductions, and the electric and magnetic fields,
respectively. On the other hand, the three-dimensional
vector field $j$ denotes the current density, and the scalar
field $\rho_c$ denotes the charge density. Inside an electrically
conducting medium, the standard assumption
of perfect medium consists in assuming the following
relations:


$$D = \varepsilon E, \qquad H = \frac{1}{\mu}B \qquad [10]$$


often called “constitutive laws,” where $\varepsilon$ and $\mu$,
respectively, denote the (electric) permittivity and
the (magnetic) permeability of the medium. In the
simple isotropic homogeneous case, both these
parameters are scalar and constant. They are often
expressed as


$$\varepsilon = \varepsilon_r\,\varepsilon_0, \qquad \mu = \mu_r\,\mu_0 \qquad [11]$$


where $\varepsilon_0$, $\mu_0$ are the permittivity and the permeability
of the vacuum (that satisfy $\varepsilon_0\mu_0 = 1/c^2$, with
$c$ denoting the speed of light), and $\varepsilon_r$, $\mu_r$ are the
permittivity and the permeability relative to vacuum,
or relative permittivity and relative permeability.
When collecting [6]-[9], together with [10], [11],
one obtains the following general system of
Maxwell equations in a continuum (dielectric)
medium:

$$\begin{aligned} -\frac{\partial(\varepsilon E)}{\partial t} + \operatorname{curl}\left(\frac{1}{\mu}B\right) &= j \\ \operatorname{div}(\varepsilon E) &= \rho_c \\ \frac{\partial B}{\partial t} + \operatorname{curl} E &= 0 \\ \operatorname{div} B &= 0 \end{aligned} \qquad [12]$$


This system is supplied with initial conditions on the
fields $B$ and $E$. On the other hand, boundary
conditions might be necessary when the equations
are restricted to a bounded domain. The latter
question, quite delicate, is postponed until the next
section.


The MHD Coupling 


For coupling systems [5] and [12], a threefold task is
in order.

On the one hand, the body force term in [5] needs
to be made precise, and this is completed by setting


$$f = j \times B + f_{\mathrm{ext}} \qquad [13]$$


The first term on the right-hand side is the Lorentz
force, a consequence of the electric current $j$ running
within the magnetic field $B$, a force that influences
the motion, along the velocity field $u$, of the
particles of the conducting fluid. The second term
is due to possible external forces. A typical case for
such forces is that of the gravity forces


$$f_{\mathrm{ext}} = \rho g \qquad [14]$$


On the other hand, in order to be a mathematically
closed system, the Maxwell system [12] needs
to be complemented by Ohm's law, another type of
constitutive relation, like [10], that now relates the
current density $j$ with the other fields. When dealing
with MHD phenomena, Ohm's law most often
reads in the form


$$j = \sigma(E + u \times B) \qquad [15]$$


where $\sigma$ denotes the electric conductivity of the
fluid. The second term of [15] explicitly accounts for
the deviation of the lines of electric current by the
hydrodynamics flow. In some oversimplified situations,
it can be neglected, leading to Ohm’s law in
the more usual form $j = \sigma E$, which is also valid for
solid media. Most of the time the term $u \times B$
contains crucial information, and thus is not
neglected.

System [5]-[12] now reads


$$\begin{aligned} \rho\,\frac{\partial u}{\partial t} + \rho\,u\cdot\nabla u - \eta\,\Delta u + \nabla p &= j \times B + f_{\mathrm{ext}} \\ \operatorname{div} u &= 0 \\ -\frac{\partial(\varepsilon E)}{\partial t} + \operatorname{curl}\left(\frac{1}{\mu}B\right) &= j \\ \operatorname{div}(\varepsilon E) &= \rho_c \\ \frac{\partial B}{\partial t} + \operatorname{curl} E &= 0 \\ \operatorname{div} B &= 0 \\ j &= \sigma(E + u \times B) \end{aligned} \qquad [16]$$


A third task is then in order.

Apart from the constitutive laws [10] and Ohm's
law [15], the specificity of the Maxwell equations for
conducting fluids, as opposed to the same equations
written, for example, in the vacuum, resides in the
possible need for supplying the system with ad hoc
boundary conditions. Indeed, in their most general
form, the Maxwell equations are valid in the whole
physical space $\mathbb{R}^3$. On the other hand, as the goal here
is to simulate an MHD fluid that most often occupies
only a bounded domain $\Omega$ in $\mathbb{R}^3$, there is the need to
adequately define the simulation domain.

A first possibility is to set the Maxwell equations
in the whole space, while solving the hydrodynamics
equation on the domain $\Omega$ occupied by the fluid.
Regarding only the Maxwell equations [12], this
seems to be the method of choice. But then there is
the need for an extension of Ohm's law [15] outside
the fluid domain. Notice indeed that $u$ appears in
[15]. In addition to this, the fact that the physical
confinement device for the fluid is then embedded in
the domain where the Maxwell equations are set
may be the source of various difficulties, as such a
device is often delicate to model and treat. Therefore,
alternative tracks may be followed.

A second possibility is to restrict the Maxwell
equations to a bounded domain. In turn, this option
divides into two: taking as the domain for the Maxwell
equations that occupied by the fluid, or choosing a
domain larger than $\Omega$. We cannot discuss this choice
without loss of generality, and refer the reader to the
literature (see, e.g., Gerbeau et al. (2005)). In either
situation, boundary conditions are needed. We only
consider the former for the sake of brevity.

A standard choice for the boundary conditions for
[12] is the following:


$$\begin{aligned} E \times n_{\partial\Omega} &= k \times n_{\partial\Omega} \\ B \cdot n_{\partial\Omega} &= q \end{aligned} \qquad [17]$$

where $k$ and $q$, respectively, are given vector and
scalar functions on the boundary.

A fact that needs to be emphasized is that it is not
easy to design accurate boundary conditions, that
is, evaluations of $k$ or $q$, because accurate
experimental measures of magnetic quantities are
often delicate to obtain, especially in industrial
environments.


A Commonly Used Simplified MHD Coupling 


For the terrestrial MHD applications that are the
focus of the present article, a commonly used
assumption is to neglect the first term $\partial(\varepsilon E)/\partial t$,
often called the displacement current, in the
Maxwell-Ampère equation [6], that is the first
equation of [12] or the third of [16] above.
Then system [16] can be reorganized, eliminating
$E$ and $j$, and leaving aside the Maxwell-Faraday
equation [8], Ohm's law [15], and the Maxwell-Coulomb
equation [7]. The latter equations
amount to defining, respectively, $E$ from $B$, $j$
from $E$ and $B$, and $\rho_c$ from $E$. One is left with
the following system with the triple of unknown
fields $(u, p, B)$


$$\begin{aligned} \rho\,\frac{\partial u}{\partial t} + \rho\,u\cdot\nabla u - \eta\,\Delta u + \nabla p &= \frac{1}{\mu}\operatorname{curl} B \times B + f_{\mathrm{ext}} \\ \operatorname{div} u &= 0 \\ \frac{\partial B}{\partial t} + \operatorname{curl}\left(\frac{1}{\sigma}\operatorname{curl}\left(\frac{1}{\mu}B\right)\right) &= \operatorname{curl}(u \times B) \\ \operatorname{div} B &= 0 \end{aligned} \qquad [18]$$

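Correspondingly, the initial conditions are now
only on the pair $(u, B)$. Regarding the boundary
conditions on $B$, they can be derived from [17]
using, for example, a homogeneous Dirichlet boundary
condition on $u$:


$$\begin{aligned} \operatorname{curl} B \times n_{\partial\Omega} &= k \times n_{\partial\Omega} \\ B \cdot n_{\partial\Omega} &= q \end{aligned} \qquad [19]$$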
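The elimination of $E$ and $j$ that produces the magnetic equation of [18] can be written out explicitly. The following is one way to organize the computation, offered as a sketch: it starts from [16] with the displacement current dropped and uses the Maxwell-Ampère equation, Ohm's law [15], and the Maxwell-Faraday equation [8] in turn.

```latex
\begin{aligned}
j &= \operatorname{curl}\!\left(\tfrac{1}{\mu}B\right)
  && \text{(Maxwell-Amp\`ere without } \partial(\varepsilon E)/\partial t\text{)},\\
E &= \tfrac{1}{\sigma}\,j - u\times B
   = \tfrac{1}{\sigma}\operatorname{curl}\!\left(\tfrac{1}{\mu}B\right) - u\times B
  && \text{(Ohm's law [15])},\\
0 &= \frac{\partial B}{\partial t} + \operatorname{curl}E
   = \frac{\partial B}{\partial t}
     + \operatorname{curl}\!\left(\tfrac{1}{\sigma}\operatorname{curl}\!\left(\tfrac{1}{\mu}B\right)\right)
     - \operatorname{curl}(u\times B)
  && \text{(Maxwell-Faraday [8])}.
\end{aligned}
```

The first line also gives the Lorentz force in terms of $B$ alone, $j \times B = \operatorname{curl}(\tfrac{1}{\mu}B) \times B$, which for constant $\mu$ is the term $\tfrac{1}{\mu}\operatorname{curl} B \times B$ on the right-hand side of the momentum equation in [18]. On a boundary where $u = 0$, the second line reduces to $E = \tfrac{1}{\sigma}\operatorname{curl}(\tfrac{1}{\mu}B)$, so the tangential condition on $E$ in [17] becomes a condition on $\operatorname{curl} B$, with the constant factors presumably absorbed into $k$.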


Other simplifications of system [16] can be
adopted, such as steady-state approximations. In
particular, it is often considered that electromagnetic
phenomena have characteristic times that are so
short in comparison with the characteristic time
of hydrodynamics phenomena that the Maxwell
equations in their stationary form may be coupled to
the time-dependent hydrodynamics equations, such
as [5]. We refer to the “Further reading” section
for further information along these lines (see, e.g.,
Gerbeau et al. (2005)).


The Mathematical Nature of the 
Equations 


With a view to understanding the mathematical
nature of systems [16] and [18], we first briefly
recall some mathematical facts concerning hydrodynamics,
before focusing on the coupling with
electromagnetics.

Regarding the incompressible Navier-Stokes 
equation, we recall that the state of the art of the 
mathematical knowledge heavily depends on the 
dimension of the ambient space. In dimension 2, 
solutions are unique and regular (they are said to be 
strong), for regular enough data of course. Unfortu- 
nately, as the focus is here on MHD and electro- 
magnetism is fundamentally a three-dimensional 
phenomenon, only the three-dimensional case for 
the Navier-Stokes equation is relevant. Now, in the 
context of the Navier-Stokes equations alone, only 
the existence of weak solutions for large times, and 
the existence and uniqueness of strong solutions for 
small times are known. Whether or not there exists a
unique strong solution for all time (of course again
for sufficiently regular data) is an open problem of
outstanding difficulty (see Temam 1995).

In the coupled setting examined here, there is no 
reason to expect a better situation. At best, one may 
hope for the same situation as that for the 
uncoupled case (Navier-Stokes equations alone). 
Regarding the existence and uniqueness of solutions, 
a commonly used strategy is that of regularization: 
the Cauchy problem is studied for regularized data, 
and then one passes to the limit in the regulariza- 
tion. In this latter step, the linear terms cause no 
difficulty, since they pass to the limit only using 
weak convergence. On the other hand, the main 
concern is always the treatment of the nonlinear 
terms, which require strong convergence. Here, for 
the Navier-Stokes equation in the MHD setting, the 
additional difficulty stems from the presence of 
the nonlinear term j x B on the right-hand side. The 
mathematical treatment of this nonlinear term calls 
for a compactness argument, which in turn requires 
obtaining some information on the fields j and B, 
and their derivatives, from the Maxwell equations. 
In this respect, the situation is radically different for 
system [16] and for system [18]. Likewise, these 
two systems behave differently regarding the other 
nonlinear term of electromagnetic nature, namely 
u x B in Ohm’s law, or curl(u x B) on the right- 
hand side of the equation in B, respectively. 


The Hyperbolic Variant 


Due to the presence of the Maxwell equations [12]
in their general form, that is a hyperbolic form,
system [16] is indeed very difficult from the
standpoint of mathematical analysis.

In order to realize this, it suffices to recall that the
first step in the proof of the existence of a solution to
such a system of equations is to write down an
a priori energy estimate. It is a simple manipulation
on [16] to show that, formally, a solution to [16]
satisfies


$$\frac{1}{2}\frac{d}{dt}\int_\Omega \rho\,|u|^2 + \eta\int_\Omega |\nabla u|^2 = \int_\Omega (j \times B)\cdot u \qquad [20]$$


multiplying the Navier-Stokes equation by $u$ and
integrating over the domain $\Omega$, while, on the other
hand,


$$\frac{1}{2}\frac{d}{dt}\int_\Omega \left[\varepsilon\,|E|^2 + \frac{1}{\mu}|B|^2\right] = -\int_\Omega j\cdot E \qquad [21]$$


multiplying the Maxwell-Ampère equation by $-E$,
the Maxwell-Faraday equation by $(1/\mu)B$, integrating
over $\Omega$, and summing up the two. Next, the
right-hand side of [21] can be modified, accounting
for Ohm's law:


$$\frac{1}{2}\frac{d}{dt}\int_\Omega \left[\varepsilon\,|E|^2 + \frac{1}{\mu}|B|^2\right] = -\int_\Omega \frac{1}{\sigma}|j|^2 - \int_\Omega (j \times B)\cdot u \qquad [22]$$


Summing up [20] and [22] yields the energy
estimate:


$$\frac{1}{2}\frac{d}{dt}\int_\Omega \left(\rho\,|u|^2 + \varepsilon\,|E|^2 + \frac{1}{\mu}|B|^2\right) + \int_\Omega \frac{1}{\sigma}|j|^2 + \eta\int_\Omega |\nabla u|^2 = 0 \qquad [23]$$


Notice that, in the above, we set the external forces
and all boundary conditions to zero, for the sake of
simplicity.

Estimate [23] clearly indicates that we have
$L^\infty([0,T], L^2(\Omega))$ bounds on the vector fields $E$ and
$B$ together with an $L^2([0,T] \times \Omega)$ bound on the
current $j$, and with the (classical)
$L^\infty([0,T], L^2(\Omega)) \cap L^2([0,T], H^1(\Omega))$ bounds on the velocity
$u$. In addition, $\operatorname{div} B$ and, when assuming $\rho_c$
bounded, $\operatorname{div} E$ are bounded in $L^\infty([0,T] \times \Omega)$.
Unfortunately, these bounds do not allow for
passing to the limit in the nonlinear term $j \times B$ on
the right-hand side of the Navier-Stokes equation.
In addition, there seems to be no way of deriving
further energy estimates on system [16] that would
provide more a priori regularity on the fields
$E$, $B$, and $j$. To date, system [16] presents an
unsolved mathematical difficulty.


The Parabolic Variant


On the other hand, system [18] is radically different
in mathematical nature, because the Maxwell
equations then reduce to a parabolic-type equation.
The same manipulations as above, in order to
establish a priori estimates on the solution of [18],
now lead to


$$\frac{1}{2}\frac{d}{dt}\int_\Omega \left(\rho\,|u|^2 + \frac{1}{\mu}|B|^2\right) + \int_\Omega \frac{1}{\sigma}\left|\operatorname{curl}\left(\frac{1}{\mu}B\right)\right|^2 + \eta\int_\Omega |\nabla u|^2 = 0 \qquad [24]$$


which, together with the divergence-free constraint
on $B$, yields $L^\infty([0,T], L^2(\Omega)) \cap L^2([0,T], H^1(\Omega))$
bounds on both the velocity $u$ and the magnetic
field $B$. These bounds now allow for passing to the
limit in the terms $\operatorname{curl} B \times B$ and $\operatorname{curl}(u \times B)$ on the
right-hand side of the equations. This being established,
the rest of the mathematical analysis is
straightforward, and a theorem of existence and
uniqueness of solutions can be proved. As in the
case of the Navier-Stokes equations alone, we have
(in dimension 3) the existence of a global-in-time
weak solution (i.e., for any $T$, $u$ and $B$ both in
$L^\infty([0,T], L^2(\Omega)) \cap L^2([0,T], H^1(\Omega))$ satisfying the
divergence-free constraint). No uniqueness of this
weak solution is known. On the other hand, for
sufficiently regular data, we have the existence of a
local-in-time strong solution (i.e., for $T$ sufficiently
small, $u$ and $B$ both in $L^\infty([0,T], H^1(\Omega)) \cap L^2([0,T], H^2(\Omega))$),
and uniqueness of this strong solution in
the class of weak solutions as long as it exists. We
refer to Sermange and Temam (1983) and Gerbeau
et al. (2005).

At this stage, it is to be remarked that there is a
formal similarity, at first sight at least, between
the parabolic form of the Maxwell equations,
namely


$$\begin{aligned} \frac{\partial B}{\partial t} + \operatorname{curl}\operatorname{curl} B &= \operatorname{curl} h \\ \operatorname{div} B &= 0 \end{aligned} \qquad [25]$$


and the incompressible Navier-Stokes equation [5]. 
Note that indeed the curl curl operator in the first 
equation of [25] can be replaced by (minus) the 
Laplacian operator $-\Delta$, since div B = 0. Actually, 
this formal similarity cannot be translated into 
mathematical arguments, simply because there is 
no pressure in [25]. In other terms, the divergence-free 
constraint div B = 0 simply propagates in time in 
[25] (note that the right-hand side curl h is also 
divergence-free by construction), while on the other 
hand div u = 0 is enforced as a constraint in [5], the 
pressure playing the role of a Lagrange multiplier 
that adjusts itself in time in order to allow for u to 
be divergence-free. 

Of course, as in the purely hydrodynamic case, 
much more can be said on the equations than simply 
establishing the existence and uniqueness of solutions. 
For instance, the long-time limit of the 
solutions can be studied. For this and other 
issues, we refer to the "Further reading" section 
(Duvaut and Lions 1972a, b, Sermange and Temam 
1983, Gerbeau et al. 2005). 


Numerical Issues 


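We concentrate again on system [18]. It is illustrative 
to mention that this system, when written in 
nondimensional variables, reads 

$$\begin{aligned} \frac{\partial u}{\partial t} + u\cdot\nabla u - \frac{1}{\mathrm{Re}}\Delta u + \nabla p &= S\,\operatorname{curl} B\times B + f_{\mathrm{ext}} \\ \operatorname{div} u &= 0 \\ \frac{\partial B}{\partial t} + \frac{1}{\mathrm{Re}_{\mathrm{mag}}}\operatorname{curl}(\operatorname{curl} B) &= \operatorname{curl}(u\times B) \\ \operatorname{div} B &= 0 \end{aligned}$$

where S is the coupling parameter, Re is the 
(hydrodynamic) Reynolds number, and $\mathrm{Re}_{\mathrm{mag}}$ 
denotes the magnetic Reynolds number. 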

As expected, the numerical simulation of a system 
such as [18] superposes the difficulties of the 
hydrodynamics simulation of incompressible viscous 
fluids, and those faced when simulating the para- 
bolic form of the Maxwell equations. Therefore, the 
goal is to efficiently combine the techniques 
employed to overcome either of them. 

For incompressible fluid mechanics, the method 
of choice is the finite-element method for the 
discretization of differential operators in space. A 
typical discretization of eqn [5], called the "mixed" 
finite-element method, makes use of a pair of finite 
elements, one for the velocity, and one for the 
pressure. Other possibilities exist, which amount 
more or less to eliminating one unknown in a 
first stage and calculating the second one as a 
postprocessing task. The mixed formulation in the 
pair of unknowns (u, p) is however the most 
widely used method to date, at least in the present 
setting. The finite-element space for the velocity is 
taken richer than that for the pressure: a possibility 
is, for example, to take the degree of the finite 
element for the velocity equal to the degree of the 
finite element for the pressure plus one. The 
heuristic reason for this is that the velocity is 
differentiated twice in [5] while the pressure is 
differentiated only once. Of course, a mathematical ground 
for this is available, and a key issue is the "inf-sup" 
condition (also called compatibility condition, or 
stability condition) that dictates the possible choices 
of finite-element pairs, so that problem [5] is well 
posed at the discrete level. Typically, Q2 finite 
elements for the velocity can be combined with 
(continuous) Q1 finite elements for the pressure. 
An alternative choice is to ignore the inf-sup 
condition, adopting, for example, Q1 finite elements 
for both fields u and p, but this requires 
a so-called stabilized formulation of [5] at the 
discrete level. The "Further reading" section 
provides details on the broad variety of techniques 
available in the field: Quarteroni and Valli (1997), 
Gerbeau et al. (2005). 
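
For orientation, the inf-sup condition mentioned above is usually stated 
as follows. The notation is an assumption made here for illustration and 
is not taken from the text: $X_h$ denotes the discrete velocity space, 
$M_h$ the discrete pressure space, and $\beta$ a constant that must not 
depend on the mesh size $h$. 

$$\inf_{q_h\in M_h\setminus\{0\}}\ \sup_{v_h\in X_h\setminus\{0\}}\ \frac{\int_\Omega q_h\,\operatorname{div} v_h}{\|v_h\|_{H^1(\Omega)}\,\|q_h\|_{L^2(\Omega)}} \ \geq\ \beta > 0$$

Roughly speaking, the condition says that no nonzero discrete pressure is 
invisible to the discrete divergence, which is why the velocity space has 
to be rich enough compared with the pressure space. 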

On the other hand, the parabolic equation on B in 
[18] may be discretized with the same finite elements 
as those used for the velocity. The enforcement of 
the divergence constraint div B = 0 at the discrete 
level deserves some attention. Recall indeed that 
at the continuous level the divergence-free constraint 
is spontaneously propagated by the equation. At 
the discrete level, a crucial role in this respect is 
played by the weak formulation of the parabolic 
equation and an ad hoc account for the boundary 
condition [17]. 

For the sake of completeness, let us mention that 
an alternative strategy to the use of the finite 
elements that have been mentioned above (and that 
are called Lagrangian finite elements), is to use 
"edge elements." In some sense, the use of such 
elements simplifies the treatment of the boundary 
conditions [17], since they are very well adapted to 
their mathematical nature. 

Note also that, in the vein of what is done for 
purely hydrodynamic flow simulations, stabilized 
finite-element techniques have been developed for 
the MHD system [18], which allow for a discretization 
of the three unknown fields (u, p, B) over the same 
finite elements, for example, Q1. 

When coupling the two discrete formulations for 
simulating the whole system [18], two main strate- 
gies can be adopted: one can either treat each of the 
two equations separately, independently describing 
the propagation of u and B forward in time, or one 
can address directly the coupled system of equa- 
tions, describing the propagation of u and B in 
parallel. 

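The first option aims in particular at obtaining in 
the end small algebraic systems. An instance of such 
a segregated algorithm reads, formally and setting 
all constants to unity for simplicity, 

$$\begin{aligned} \frac{u^{n+1}-u^n}{\Delta t} + u^n\cdot\nabla u^{n+1} - \Delta u^{n+1} + \nabla p^{n+1} &= \operatorname{curl} B^n\times B^n + f_{\mathrm{ext}} \\ \operatorname{div} u^{n+1} &= 0 \\ \frac{B^{n+1}-B^n}{\Delta t} + \operatorname{curl}\operatorname{curl} B^{n+1} &= \operatorname{curl}(u^n\times B^n) \\ \operatorname{div} B^{n+1} &= 0 \end{aligned}$$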

At each time step, the two independent subsystems 
are solved, providing $u^{n+1}$ and $B^{n+1}$ for the 
next time step. The difficulty is that it is not 
possible, with such segregated algorithms, to reproduce 
the energy estimate [24] at the discrete level. 
Note that, at the continuous level, the estimate [24] 
is based upon a proper cancelation of the term 
$\int_\Omega (j\times B)\cdot u$ present on the two right-hand sides. 
Such a cancelation basically stems from a nonlinear 
interplay that cannot be present in a segregated 
iteration. Consequently, some spurious energy is 
created in the system simply by an inadequate 
iteration between the two equations. More precisely, 
the scheme obtained is at best only conditionally 
stable, that is, stable for small enough time steps, a 
condition that might be prohibitive when the MHD 
coupling must be simulated over long 
times. 

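On the other hand, the other option consists in 
attacking the full system [18] directly: 

$$\begin{aligned} \frac{u^{n+1}-u^n}{\Delta t} + u^n\cdot\nabla u^{n+1} - \Delta u^{n+1} + \nabla p^{n+1} &= \operatorname{curl} B^{n+1}\times B^n + f_{\mathrm{ext}} \\ \operatorname{div} u^{n+1} &= 0 \\ \frac{B^{n+1}-B^n}{\Delta t} + \operatorname{curl}\operatorname{curl} B^{n+1} &= \operatorname{curl}(u^{n+1}\times B^n) \\ \operatorname{div} B^{n+1} &= 0 \end{aligned} \qquad [27]$$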


Note that $B^{n+1}$ is present in the equation yielding 
$u^{n+1}$, while conversely $u^{n+1}$ is present in that 
yielding $B^{n+1}$. Then the coupled system admits at 
the discrete level an energy estimate analogous to 
the energy estimate [24], and the scheme is much 
more stable than the previous one, and even 
unconditionally stable. The price to pay is that the 
system is, at the algebraic level, of very large size. 
Being sparse, it may however be treated, for 
example, via a GMRES-type iterative solver. 
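
The difference between the two strategies can be seen on a deliberately 
tiny model. The sketch below is not a discretization of system [18]: it 
uses a hypothetical two-variable ODE (the names `NU`, `KAPPA`, `C` and 
the equations themselves are assumptions made for illustration) whose 
coupling terms cancel in the energy balance, mimicking the cancelation 
described above. 

```python
# Toy ODE analogue of the MHD coupling (an illustrative assumption, not system [18]):
#   u' = -NU*u + C*b*b,     b' = -KAPPA*b - C*u*b
# so that d/dt (u^2 + b^2)/2 = -NU*u^2 - KAPPA*b^2: the coupling terms cancel.

NU, KAPPA, C = 0.01, 0.01, 1.0


def energy(u, b):
    """Total energy of the toy system."""
    return 0.5 * (u * u + b * b)


def segregated_step(u, b, dt):
    """Advance each equation separately, coupling terms frozen at time n."""
    u_new = (u + dt * C * b * b) / (1.0 + dt * NU)
    b_new = (b - dt * C * u * b) / (1.0 + dt * KAPPA)
    return u_new, b_new


def coupled_step(u, b, dt):
    """Linearly implicit coupled step; solves a 2x2 linear system:

        (u_new - u)/dt + NU*u_new    - C*b*b_new = 0
        (b_new - b)/dt + KAPPA*b_new + C*b*u_new = 0

    Multiplying by u_new and b_new and summing, the coupling terms cancel,
    so the discrete energy cannot increase, whatever the time step.
    """
    a11, a12 = 1.0 / dt + NU, -C * b
    a21, a22 = C * b, 1.0 / dt + KAPPA
    r1, r2 = u / dt, b / dt
    det = a11 * a22 - a12 * a21  # strictly positive
    u_new = (r1 * a22 - a12 * r2) / det
    b_new = (a11 * r2 - a21 * r1) / det
    return u_new, b_new


if __name__ == "__main__":
    dt = 1.0  # deliberately large time step
    print("initial energy   :", energy(1.0, 1.0))
    print("segregated, 1 step:", energy(*segregated_step(1.0, 1.0, dt)))
    print("coupled, 1 step   :", energy(*coupled_step(1.0, 1.0, dt)))
```

With a large time step the segregated update creates energy in a single 
step, while the coupled update only dissipates it. The price, visible 
even here, is that the coupled step requires solving a linear system in 
both unknowns at once. 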

Let us make a final remark on these numerical 
issues. In full generality, the numerical 
simulation of viscous fluids raises the question of 
large Reynolds numbers, that is, the question of the 
difficulties encountered in the numerical approximation 
for viscosities $\eta$ small with respect to the 
other dimensional parameters of the problem 
(density, velocity, and dimension of the domain). 
For such small viscosities, the flow becomes 
turbulent rather than laminar, and the broad 
range of length and energy scales in the flow turns 
out to be too difficult to capture numerically. A 
commonly used technique in 
such difficult cases is turbulence modeling. 
Schematically, an averaged, or homogenized, model 
is derived on the basis of the Navier-Stokes 
equation, with the help of simplifying hypotheses, 
for example, in the form of closure relations. The 
quality of the simulation of the averaged model, 
and its relation to the true flow, heavily depends on 
these simplifying assumptions, which are in turn 
based upon a very deep understanding of the 
various physical phenomena at play. In the context 
of MHD flows, the situation regarding such 
assumptions is not clear. It seems that there are no 
well-established models for turbulent MHD to date, at 
least from a rigorous viewpoint. In the absence of 
those, only a direct simulation of the Navier-Stokes 
equation seems possible. 


The Industrial Production of Aluminium 


A prototypical example of an application of MHD 
to the industrial context is the production of 
aluminum in electrolysis cells. The numerical simulation 
of the process involves the simulation of the 
evolution of two layers of immiscible incompressible 
viscous fluids, separated by an interface, and 
covered by a free surface. A schematic description of 
an industrial cell is the following. An electric 
current of $10^5$ A, or more, runs through two 
horizontal layers of conducting fluids: a bath of 
aluminum oxide above, and a layer of liquid 
aluminum below. The aluminum is produced by 
the reduction of the aluminum oxide, a reaction that 
only occurs at a temperature where aluminum is 
liquid. The high magnetic field induced by such a 
huge current produces in turn high Lorentz forces 
that influence the motion of either fluid. A key issue 
in the modeling, as well as in the technological 
control of the cell, is to understand the motion of 
the interface separating the two fluids. In a rough 
picture, this interface may be seen as a mobile 
cathode, moving below a fixed anode. The equations 
describing the interior of the cell are basically 
of the type [18], with an important modification 
though: one needs to account for the presence of 
two fluids. They read: 


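$$\begin{aligned} \frac{\partial(\rho u)}{\partial t} + \operatorname{div}(\rho u\otimes u) - \operatorname{div}\left(\eta\left(\nabla u + (\nabla u)^T\right)\right) &= -\nabla p + \rho g + \frac{1}{\mu}\operatorname{curl} B\times B \\ \operatorname{div} u &= 0 \\ \frac{\partial\rho}{\partial t} + \operatorname{div}(\rho u) &= 0 \\ \frac{\partial B}{\partial t} + \operatorname{curl}\left(\frac{1}{\mu\sigma}\operatorname{curl} B\right) &= \operatorname{curl}(u\times B) \\ \operatorname{div} B &= 0 \end{aligned} \qquad [28]$$


where, we recall, g denotes the gravity field. These equations are 
supplied with the boundary conditions 


$$\begin{aligned} u &= 0 \\ \frac{1}{\mu\sigma}\operatorname{curl} B\times n &= k\times n \\ B\cdot n &= q \end{aligned} \qquad [29]$$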


As opposed to [18], the density $\rho$ in [28] is no longer 
constant, but is only piecewise constant, that is, 
constant in each (moving) subdomain occupied by 
each fluid. Likewise, the viscosity $\eta$ and the 
conductivity $\sigma$ are taken constant in each fluid, but with 
different values from one fluid to the other. While the 
density and the viscosity are only slightly different, the 
conductivity varies by many orders of magnitude, a 
discrepancy which results in some numerical stiffness 
of the equations. On the other hand, the permeability 
$\mu$ can be considered as constant throughout the 
domain, within a good level of approximation. 
Mathematically, system [28] is an order of magnitude 
more difficult than [18]. We refer to Lions 
(1996) and Gerbeau and Le Bris (1997) for some 
mathematical ingredients. A first major difficulty 
stems from the fact that the domain occupied by the 
fluids is no longer fixed. Notice that this difficulty 
already arises when simulating the MHD of one 
conducting fluid with a free surface. A second major 
difficulty is the discontinuity of the physical parameters 
at the interface, which causes a loss of 
regularity at the interface for the solution fields. The 
best result known to date is the existence of a 
global-in-time weak solution to [28]. Both mathematical 
difficulties above of course have significant numerical 
counterparts. A notable issue in such a simulation is 
how to handle the motion of the free interface, while 
ensuring that each fluid remains of constant mass (or 
volume) throughout the simulation. One of the most 
efficient methods in such a context, introduced three 
decades ago, is the arbitrary Lagrangian-Eulerian 
(ALE) method. We refer to Brackbill and Pracht 
(1973) and Gerbeau et al. (2003a, b, 2005). 

Apart from the direct numerical attack of system 
[28], which carries significant analytical and geometrical 
nonlinearities, there is the possibility, in 
particular in the industrial context, to derive a set 
of linearized equations in the vicinity of some 
equilibrium configuration of the system. This track 
has been extensively followed in the past and 
provides information that efficiently complements 
that provided by the much more satisfactory, but 
also more costly, nonlinear approach. 


See also: Compressible Flows: Mathematical Theory; 
Computational Methods in General Relativity: The Theory; 
Fluid Mechanics: Numerical Methods; Newtonian Fluids 
and Thermohydraulics; Partial Differential Equations: 
Some Examples; Stability of Flows; Symmetric 
Hyperbolic Systems and Shock Waves; Topological Knot 
Theory and Macroscopic Physics. 
Introduction 


Malliavin calculus was initiated in 1976 with the 
work by P Malliavin (1978) and is essentially an 
infinite-dimensional differential calculus on the 
Wiener space. Its initial goal was to give conditions 
ensuring that the law of a random variable has a 
density with respect to Lebesgue measure as well as 
estimates for this density and its derivatives. When 
the random variables are solutions of stochastic 
differential equations (SDEs), these densities are heat 
kernels and Malliavin used Hörmander-type 
assumptions on the corresponding operators, thus 
providing a probabilistic proof of a Hörmander-type 
theorem for hypoelliptic operators. 

The theory was much developed in the 1980s by 
Stroock, Bismut, and Watanabe, among others (the 
reader is referred to Nualart (1995) and Malliavin 
(1997)). In recent years, Malliavin calculus has had 
great success in probabilistic numerical methods, 
mainly in the field of stochastic finance (Malliavin 
and Thalmaier 2005). However, the theory has also 
been applied to other fields of mathematics and 
physics, notably in statistical mechanics and statistical 
hydrodynamics (see Stochastic Hydrodynamics). In 
addition, one should remember that Wiener measure 
can be viewed as an "imaginary time" (but well-defined) 
counterpart of Feynman's "measure" for 
quantum systems. A stochastic calculus of variations 
for Wiener functionals could not be irrelevant to the 
path-integral approach to quantum theory. 

Another field of application worth mentioning is 
the study of representations of stochastic oscillatory 
integrals with quadratic phase function and their 
stationary phase estimation. For this, complexifica- 
tion of the Wiener space must be properly defined 
(Malliavin and Taniguchi (1997)). 
In order to give a flavor of what Malliavin 
calculus is all about, let us consider a second-order 
differential operator in $\mathbb{R}^d$ of the form 

$$A = \frac{1}{2}\sum_{i,j} a^{ij}\,\partial^2_{ij} + \sum_i b^i\,\partial_i$$

with smooth bounded coefficients and such that 
the matrix $a$ is symmetric and non-negative, admitting 
a square root $\sigma$. The corresponding Cauchy 
value problem consists in finding a smooth solution 
$u(t,x)$ of 

$$\frac{\partial u}{\partial t} = Au, \qquad u(0,\cdot) = \phi(\cdot) \qquad [1]$$

Then there exists a transition probability function 
$p(t,x,\cdot)$ such that 

$$u(t,x) = \int_{\mathbb{R}^d} \phi(y)\,p(t,x,dy)$$

When $p(t,x,dy) = p(t,x,y)\,dy$, the function $p$ is the 
heat kernel associated to the operator A, and 
from eqn [1] one may deduce the Fokker-Planck 
equation for $p$. 

Since Kolmogorov we know that it is possible to 
associate with such a second-order operator a stochastic 
family of curves, just as a deterministic flow is 
associated with a vector field. This stochastic family 
is a Markov process, $\xi_x(t)$, which is adapted to the 
increasing family $P_\tau$, $\tau\in[0,1]$, of sigma-fields generated 
by the past events, that is, $\xi_x(\tau)\in P_\tau$ for every $\tau$. 

Itô calculus allows us to write the SDE 
satisfied by $\xi$: 

$$d\xi_x(t) = \sigma(\xi_x(t))\,dW(t) + b(\xi_x(t))\,dt, \qquad \xi_x(0) = x \qquad [2]$$

where $W(t)$ stands for $\mathbb{R}^d$-valued Brownian motion 
(see Stochastic Differential Equations). Then $p$ is the 
image of the Wiener measure $\mu$ (the law of 
Brownian motion), namely $p(t,x,\cdot) = (\mu\circ\xi_x(t)^{-1})(\cdot)$, 
and we have the representation 

$$u(t,x) = E_\mu\big(\phi(\xi_x(t))\big)$$
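
The probabilistic representation of the solution of the Cauchy problem 
can be tried out numerically. The following is a minimal sketch, in one 
dimension, that estimates the expectation of $\phi(\xi_x(t))$ by simulating 
the SDE with the Euler-Maruyama scheme. The function name, its parameters, 
and the choice of discretization are assumptions made for illustration; 
they do not come from the text. 

```python
import math
import random


def euler_maruyama_expectation(phi, sigma, b, x, t,
                               n_paths=20000, n_steps=20, seed=0):
    """Monte Carlo estimate of E[phi(xi(t))] for the scalar SDE
    d xi = sigma(xi) dW + b(xi) dt, xi(0) = x (Euler-Maruyama in time)."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        xi = x
        for _ in range(n_steps):
            dw = rng.gauss(0.0, sqrt_dt)  # Brownian increment over dt
            xi += sigma(xi) * dw + b(xi) * dt
        total += phi(xi)
    return total / n_paths


if __name__ == "__main__":
    # Brownian motion (sigma = 1, b = 0) and phi(y) = y^2: the heat equation
    # du/dt = (1/2) u'' with u(0, x) = x^2 has the solution u(t, x) = x^2 + t.
    estimate = euler_maruyama_expectation(
        lambda y: y * y, lambda y: 1.0, lambda y: 0.0, x=1.0, t=1.0)
    print("Monte Carlo:", estimate, " exact:", 1.0 ** 2 + 1.0)
```

The estimate carries a statistical error of order one over the square 
root of the number of paths, so agreement with the exact value is only 
expected up to that accuracy. 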
The following criterion for absolute continuity of 
measures in finite dimensions holds: 


Lemma If $\gamma$ is a probability measure on $\mathbb{R}^d$ and, 
for every $f\in C_b^\infty$, 

$$\left|\int \partial_i f\,d\gamma\right| \leq c_i\,\|f\|_\infty$$

where $c_i$, $i = 1,\ldots,d$, are constants, then $\gamma$ is absolutely 
continuous with respect to Lebesgue measure. 


Now one can think about Wiener measure as an 
infinite (actually continuous) product of finite-dimensional 
Gaussian measures. Considering the 
toy model of the above-mentioned situation in 
one dimension, we replace Wiener measure by 
$d\gamma(x) = (2\pi)^{-1/2}e^{-x^2/2}\,dx$ and look at the process at 
a fixed time as a function $g$ on $\mathbb{R}$. In order to apply 
the lemma and study the law of $g$, one would write 

$$\int f'\,d(\gamma\circ g^{-1}) = \int (f'\circ g)\,d\gamma = \int (f\circ g)'\,\frac{1}{g'}\,d\gamma$$

and then integrate by parts to obtain $\int (f\circ g)\,\rho\,d\gamma$. A 
simple computation shows that $\rho(x) = (g'' + x g')/(g')^2$, 
and, in particular, that the nondegeneracy of the 
derivative of $g$ plays a role in the existence of the 
density. 
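
The one-dimensional integration by parts can be checked numerically. The 
sketch below is an illustration under assumptions that are not in the 
text: the map is taken to be g(x) = x + x^3/3 (so that its derivative 
never vanishes), the test function is f = sin, and integrals against the 
standard Gaussian measure are approximated by a trapezoid rule on a 
truncated interval. 

```python
import math


def gauss_integral(h, lo=-8.0, hi=8.0, n=160000):
    """Trapezoid approximation of the integral of h against the N(0,1) law."""
    step = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * h(x) * math.exp(-0.5 * x * x)
    return total * step / math.sqrt(2.0 * math.pi)


# Hypothetical choice of g, with first and second derivatives written by hand.
def g(x):
    return x + x ** 3 / 3.0


def dg(x):
    return 1.0 + x * x


def d2g(x):
    return 2.0 * x


def rho(x):
    """Weight produced by the integration by parts: (g'' + x g') / (g')^2."""
    return (d2g(x) + x * dg(x)) / dg(x) ** 2


# With f = sin (so f' = cos), both sides of the identity
#   integral of f'(g) d(gamma) = integral of f(g) * rho d(gamma)
lhs = gauss_integral(lambda x: math.cos(g(x)))
rhs = gauss_integral(lambda x: math.sin(g(x)) * rho(x))

if __name__ == "__main__":
    print("lhs:", lhs, " rhs:", rhs)
```

The weight `rho` stays bounded here only because the derivative of g is 
bounded away from zero, which is the one-dimensional shadow of the 
nondegeneracy condition discussed above. 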

To work with functionals on the Wiener space, 
one needs an infinite-dimensional calculus. Of 
course, other (Gateaux, Fréchet) calculi on infinite- 
dimensional settings are already available but the 
typical functionals we are dealing with, solutions of 
SDEs, are not continuous with respect to the 
underlying topology, nor even defined at every 
point, but only almost everywhere. Malliavin calcu- 
lus, as a Sobolev differential calculus, requires very 
little regularity, given that there is no Sobolev 
imbedding theory in infinite dimensions. 


Differential Calculus on the 
Wiener Space 


We restrict ourselves to the classical Wiener space, 
although the theory may be developed in abstract 
Wiener spaces, in the sense of Gross. For a 
description of this theory as well as of Segal’s 
model developed in the 1950s for the needs of 
quantum field theory, the reader is referred to 
Malliavin (1997). 

Let $H$ be the Cameron-Martin space, 
$H = \{h : [0,1]\to\mathbb{R}^d$ such that $\dot h$ is square integrable 
and $h(t) = \int_0^t \dot h(\tau)\,d\tau\}$, which is a separable Hilbert 
space with scalar product $\langle h_1, h_2\rangle = \int_0^1 \dot h_1(\tau)\cdot\dot h_2(\tau)\,d\tau$. 
The classical Wiener measure will be 
denoted by $\mu$; it is realized on the Banach space $X$ 
of continuous paths on the time interval [0,1] 
starting from zero at time zero, a space where $H$ is 
densely imbedded. In finite dimensions, Lebesgue 
measure can be characterized by its invariance under 
the group of translations. In infinite dimensions 
there is no Lebesgue measure and this invariance 
must be replaced by quasi-invariance for translations 
of Wiener measures (Cameron-Martin admissible 
shifts). We recall that, if $h\in H$, the Cameron-Martin 
theorem states that 

$$E_\mu\big(F(\omega + h)\big) = E_\mu\left(F(\omega)\exp\left(\int_0^1 \dot h(\tau)\cdot d\omega(\tau) - \frac{1}{2}\int_0^1 |\dot h(\tau)|^2\,d\tau\right)\right)$$

where $d\omega$ denotes Itô integration. 

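For a cylindrical "test" functional $F(\omega) = f(\omega(\tau_1),\ldots,\omega(\tau_m))$, 
where $f\in C_b^\infty(\mathbb{R}^m)$ and 
$0\leq\tau_1\leq\cdots\leq\tau_m\leq 1$, the derivative operator is 
defined by 

$$D_\tau F(\omega) = \sum_{k=1}^m 1_{\tau<\tau_k}\,\partial_k f(\omega(\tau_1),\ldots,\omega(\tau_m)) \qquad [3]$$

This operator is closed in $W^2_1(X;\mathbb{R})$, the completion 
of the space of cylindrical functionals with 
respect to the Sobolev norm 

$$\|F\|^2_{W^2_1} = E_\mu|F|^2 + E_\mu\int_0^1 |D_\tau F|^2\,d\tau$$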


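Define $F$ to be $H$-differentiable at $\omega\in X$ when there 
exists a linear operator $\nabla F(\omega)$ such that, for all $h\in H$, 

$$F(\omega + h) - F(\omega) = \langle\nabla F(\omega), h\rangle + o(\|h\|_H) \quad\text{as } \|h\|\to 0$$

Then $D_\tau$ disintegrates the derivative in the sense that 

$$D_h F(\omega) \equiv \langle\nabla F(\omega), h\rangle = \int_0^1 D_\tau F(\omega)\,\dot h(\tau)\,d\tau \qquad [4]$$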

Higher (r)-order derivatives, as r-linear functionals, 
can be considered as well in suitable Sobolev spaces. 
Denote by $\delta$ the $L^2_\mu$-adjoint of the operator $\nabla$, that 
is, for a process $u : X\to H$ in the domain of $\delta$, the 
divergence $\delta(u)$ is characterized by 

$$E_\mu\big(F\,\delta(u)\big) = E_\mu\left(\int_0^1 D_\tau F\,\dot u(\tau)\,d\tau\right) \qquad [5]$$

For an elementary process $u$ of the form 
$u(\tau) = \sum_i F_i\,(\tau\wedge\tau_i)$, where the $F_i$ are smooth random 
variables and the sum is finite, the divergence is 

$$\delta(u) = \sum_i F_i\,\omega(\tau_i) - \sum_i \int_0^{\tau_i} D_\tau F_i\,d\tau$$


The characterization of the domain of $\delta$ is delicate, 
since both terms in this last expression are not 
independently closable. It can be shown that 
$W^2_1(X;H)$ is in the domain of $\delta$ and that the 
following "energy" identity holds: 

$$E_\mu\big(\delta(u)\big)^2 = E_\mu\|u\|^2_H + E_\mu\int_0^1\!\!\int_0^1 D_\tau\dot u_\sigma\cdot D_\sigma\dot u_\tau\,d\sigma\,d\tau$$

Notice that when $u$ is adapted to $P_\tau$, the 
Cameron-Martin-Girsanov theorem implies that the divergence 
coincides with the Itô stochastic integral $\int_0^1 \dot u(\tau)\,d\omega(\tau)$ 
and, in this adapted case, the last term of the energy 
identity vanishes. We recover the well-known Itô 
isometry which is at the foundation of the construction 
of this integral. When the process is not adapted, the 
divergence turns out to coincide with a generalization 
of the Itô integral, first defined by Skorohod. 

The relation [5] is an integration-by-parts formula 
with respect to the Wiener measure $\mu$, one of the 
basic ingredients of Malliavin calculus. This formula 
is easily generalized when the base measure is 
absolutely continuous with respect to $\mu$. 

Considering all functionals of the form 
$P(\omega) = Q(\omega(\tau_1),\ldots,\omega(\tau_m))$ with $Q$ a polynomial on 
$\mathbb{R}^m$, the Wiener chaos of order $n$, $C_n$, is defined as 
$C_n = P_n\cap P_{n-1}^{\perp}$, where $P_n$ denotes the polynomials 
on $X$ of degree $\leq n$. The Wiener-chaos decomposition 
$L^2_\mu(X) = \bigoplus_{n=0}^{\infty} C_n$ holds. Denoting by $\Pi_n$ the orthogonal 
projection onto the chaos of order $n$, we have 

$$\langle\nabla(\Pi_n F), h\rangle = \Pi_{n-1}\big(\langle\nabla F, h\rangle\big)$$

The derivative $D_h$ corresponds to the annihilation 
operator $A(h)$ and the divergence $\delta(h)$ to the creation 
operator $A^*(h)$ on bosonic Fock spaces. 

An important result, known as the Clark-Bismut-Ocone 
formula, states that any functional 
$F\in W^2_1(X;\mathbb{R})$ can be represented as 

$$F = E_\mu(F) + \int_0^1 E^{P_\tau}(D_\tau F)\,d\omega(\tau)$$

where $E^{P_\tau}$ denotes the conditional expectation with 
respect to the events prior to time $\tau$ (or, for short, 
the past $P_\tau$ of $\tau$). 

The Ornstein-Uhlenbeck generator (or minus 
number operator) is defined by $\mathcal{L}F = -\delta\nabla F$. On 
cylindrical functionals $F(\omega) = f(\omega(\tau_1),\ldots,\omega(\tau_m))$, it 
has the form 

$$\mathcal{L}F(\omega) = \sum_{i,j}\tau_i\wedge\tau_j\,\partial_i\partial_j f(\omega(\tau_1),\ldots,\omega(\tau_m)) - \sum_i \omega(\tau_i)\,\partial_i f(\omega(\tau_1),\ldots,\omega(\tau_m))$$

where $i,j$ denote multi-dimensional (d) indexes. 

As a multiplicative operator on the Wiener-chaos 
decomposition, $\mathcal{L}F = -\sum_n n\,\Pi_n F$. It is the generator 
of a positive self-adjoint semigroup, the Ornstein-Uhlenbeck 
semigroup, formally given by 
$T_t F = \sum_n e^{-nt}\,\Pi_n F$. Another familiar representation 
of this semigroup is the Mehler formula, 

$$T_t F(\omega) = \int_X F\left(e^{-t}\omega + \sqrt{1-e^{-2t}}\,\omega'\right)d\mu(\omega')$$

Considering the map $X\to\mathbb{R}^m$, $\omega\mapsto(\omega(\tau_1),\ldots,\omega(\tau_m))$, 
the image of this operator is the Ornstein-Uhlenbeck 
generator (corresponding to the Langevin 
equation) on $\mathbb{R}^m$ with Euclidean metric defined by 
the matrix $\tau_i\wedge\tau_j$. 
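
The two descriptions of the Ornstein-Uhlenbeck semigroup, through chaos 
multipliers and through the Mehler formula, can be compared on the 
simplest finite-dimensional analogue: the real line with the standard 
Gaussian measure, where the chaos of order n is spanned by the Hermite 
polynomial of degree n. The sketch below is such an illustration. The 
function names, the quadrature rule, and the restriction to one 
dimension are assumptions made here, not part of the text. 

```python
import math


def gauss_integral(h, lo=-10.0, hi=10.0, n=20000):
    """Trapezoid approximation of the integral of h against the N(0,1) law."""
    step = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * h(y) * math.exp(-0.5 * y * y)
    return total * step / math.sqrt(2.0 * math.pi)


def mehler(F, t, x):
    """One-dimensional Mehler formula:
    (T_t F)(x) = E[F(exp(-t) x + sqrt(1 - exp(-2t)) Y)], Y standard Gaussian."""
    a = math.exp(-t)
    s = math.sqrt(1.0 - a * a)
    return gauss_integral(lambda y: F(a * x + s * y))


def hermite2(x):
    """Hermite polynomial of degree 2 (second chaos on the line)."""
    return x * x - 1.0


def hermite3(x):
    """Hermite polynomial of degree 3 (third chaos on the line)."""
    return x ** 3 - 3.0 * x


if __name__ == "__main__":
    t, x = 0.7, 1.3
    # The chaos description predicts T_t H_n = exp(-n t) H_n.
    print(mehler(hermite2, t, x), math.exp(-2 * t) * hermite2(x))
    print(mehler(hermite3, t, x), math.exp(-3 * t) * hermite3(x))
```

Each printed pair should agree up to quadrature error, which shows the 
semigroup acting on the chaos of order n as multiplication by exp(-nt). 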

The fundamental theorem concerning existence of 
the density of laws of Wiener functionals is the following: 


Theorem Let $F$ be an $\mathbb{R}^d$-valued Wiener functional 
such that $F^i$ and $\mathcal{L}F^i$ belong to $L^4_\mu$ for every 
$i = 1,\ldots,d$. If the covariance matrix 

$$\big(\langle\nabla F^i, \nabla F^j\rangle\big)_{i,j}$$

is almost surely invertible, then the law of $F$ is 
absolutely continuous with respect to the Lebesgue 
measure on $\mathbb{R}^d$. 


Under more regularity assumptions, smoothness 
of the density is also derived. On the other hand, the 
integrability assumptions on $\mathcal{L}$ can be replaced by 
integrability of the second derivatives, due to 
Krée-Meyer inequalities on the Wiener space. 

We remark that, although equivalent, the initial 
formulation (Malliavin 1978) of Malliavin calculus 
was different, relying on the construction of the 
two-parameter process associated to $\mathcal{L}$ and on its 
properties. In the early 1980s, the theory was 
elaborated, the main applications being the study 
of heat kernels (cf., e.g., Stroock (1981), Ikeda and 
Watanabe (1989), and Bismut (1984)). Starting from 
an SDE [2], it is possible to apply these techniques to 
obtain existence and smoothness of the transition 
probability function $p(t,x,y)$ if the vector fields 
$Z_j = \sum_i \sigma^{ij}\,(\partial/\partial x_i)$ together with their Lie brackets 
generate the tangent space for "sufficiently many" 
(in terms of probability) paths. These results shed a 
new light on the Hörmander theorem for partial 
differential equations. 


Quasi-Sure Analysis 


Quasi-sure analysis is a refinement of classical 
probability theory and, generally speaking, replaces 
the fact that, due to Sobolev imbedding theorems, 
functions in finite dimensions belonging to Sobolev 
classes are in fact smooth. We work in classical 
probability up to sets of probability zero; in 
quasi-sure analysis negligible sets are smaller and are those 
of capacity zero. This is the class of sets which are 
not charged by any measure of finite energy. 

Under a nondegenerate map, Wiener measure and 
more general Gaussian measures may be disintegrated 
through a co-area formula. This principle, 
developed by Malliavin and co-authors (cf. 
Malliavin (1997) and references therein), implies 
that a property which is true quasi-surely will also 
hold true almost surely under conditioning by such 
a map. One can use this principle to study 
finer properties of SDEs. It was also used in 
M P Malliavin and P Malliavin (1990) to transfer 
properties from path to loop groups (see Measure on 
Loop Spaces). A pinned Brownian motion, for 
example, is well defined in quasi-sure analysis. It is 
possible to treat anticipative problems using 
quasi-sure analysis by solving the adapted problem after 
restriction of the solution to the finite-codimensional 
manifold which describes the anticipativity. These 
methods have also been applied to the computation 
of Lyapunov exponents of stochastic dynamical 
systems (Imkeller 1998). With a geometry of 
finite-codimensional manifolds of Wiener spaces well 
established, it is reasonable to think about applications 
to cases where such submanifolds correspond to 
level surfaces of invariant quantities for 
infinite-dimensional dynamical systems (cf. Cipriano (1999) 
for an example of such a situation in hydrodynamics). 

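The (p,r)-capacity of an open subset $O$ of the 
Wiener space is defined by 

$$\mathrm{cap}_{p,r}(O) = \inf\{\|\phi\|_{W^p_r} : \phi\geq 0,\ \phi\geq 1\ \mu\text{-a.s. on } O\}$$

and, for a general set $B$, $\mathrm{cap}_{p,r}(B) = \inf\{\mathrm{cap}_{p,r}(O) : B\subset O,\ O \text{ open}\}$. 
A set is said to be slim if all its 
(p,r)-capacities are zero. For $\Phi\in W_\infty$, the space of 
functionals with every Malliavin derivative belonging 
to all $L^p$, there exists a redefinition of $\Phi$, 
denoted by $\Phi^*$, which is smooth and defined on the 
complement of a slim set. 

Following Airault and Malliavin (1988), let 
$\Phi\in W_\infty(X;\mathbb{R}^d)$ be of maximal rank and nondegenerate 
in the sense that the inverse of 

$$(\det\Phi)(\omega) = \det\big(\langle\nabla\Phi^i(\omega), \nabla\Phi^j(\omega)\rangle\big)$$

belongs to $W_\infty$. Then for every functional $G\in W_\infty$, 
the measures $\mu\circ\Phi^{-1}$ and $(G\mu)\circ\Phi^{-1}$ are absolutely 
continuous with respect to Lebesgue measure on $\mathbb{R}^d$ 
and have $C^\infty$ Radon-Nikodym derivatives. If 

$$\rho(\lambda) = \frac{d(\mu\circ\Phi^{-1})}{d\lambda} \quad\text{and}\quad \rho_G(\lambda) = \frac{d((G\mu)\circ\Phi^{-1})}{d\lambda}$$

the function $\lambda\mapsto\rho_G(\lambda)/\rho(\lambda)$ will be smooth in the 
open set $O = \{\lambda : \rho(\lambda) > 0\}$. 

For every $\lambda\in O$, it is possible to define (up to slim 
sets) a submanifold of the Wiener space of codimension 
$d$, $S_\lambda = (\Phi^*)^{-1}(\lambda)$, as well as a measure $\mu_S$ 
satisfying 

$$\int_{S_\lambda} G^*\,d\mu_S(\omega) = E^{\Phi=\lambda}(G) = \frac{\rho_G(\lambda)}{\rho(\lambda)}$$

for every $G\in W_\infty$. This measure does not charge 
slim sets. 

The area measure $a$ on the submanifold $S_\lambda$ is 
defined by 

$$\int F^*\,da = \rho(\lambda)\int F^*(\omega)\,\det\big(\langle\nabla\Phi^i(\omega), \nabla\Phi^j(\omega)\rangle\big)^{1/2}\,d\mu_S(\omega)$$

The following co-area formula on the Wiener 
space 

$$\int f(\Phi(\omega))\,F(\omega)\,(\det\Phi)^{1/2}(\omega)\,d\mu(\omega) = \int_{\mathbb{R}^d} f(\lambda)\int_{S_\lambda} F^*(\omega)\,da(\omega)\,d\lambda$$

was proved in Airault and Malliavin (1988). 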


Calculus of Variations in a 
Non-Euclidean Setting 


Let $M$ be a d-dimensional compact Riemannian 
manifold with metric $ds^2 = \sum_{i,j} g_{ij}\,dm^i\,dm^j$. The 
Laplace-Beltrami operator is expressed in the local 
chart by 

$$\Delta_M f = g^{ij}\frac{\partial^2 f}{\partial m^i\,\partial m^j} - g^{ij}\Gamma^k_{ij}\frac{\partial f}{\partial m^k}$$

where $\Gamma^k_{ij}$ are the Christoffel symbols associated 
with the Levi-Civita connection. The corresponding 
Brownian motion $p_\omega$ is locally expressed as a 
solution of the SDE: 

$$dp^i(t) = a^{ij}(p(t))\,dW_j(t) - \frac{1}{2}g^{jk}\Gamma^i_{jk}(p(t))\,dt$$

with $p(0) = m_0\in M$ and where $a$ is the square root of the 
matrix $(g^{ij})$. Its law on 
the space of paths $P(M) = \{p : [0,1]\to M,\ p$ continuous, 
$p(0) = m_0\}$ will be denoted by $\nu$. 

How can we develop differential calculus and 
geometry on the space $P(M)$? An infinite-dimensional 
local chart approach is delicate, due to the difficulty 
of finding an atlas in which the changes of charts 
preserve the measures. A possibility, developed in 
Cruzeiro and Malliavin (1996), consists in replacing 
the local chart approach by the Cartan-like methodology 
of moving frames. The canonical moving 
frame in this framework is provided by Itô stochastic 
parallel transport. Nevertheless, a new difficulty 
arises: the parallel transport will not be differentiable 
in the Cameron-Martin sense described before. 

Recall that a frame above $m$ is a Euclidean 
isometry $r : \mathbb{R}^d\to T_m(M)$ onto the tangent space. 
$O(M)$ denotes the collection of all frames above $M$ 
and $\pi(r) = m$ the canonical projection. $O(M)$ can be 
viewed as a parallelized manifold, for there exist 
canonical differential forms $(\theta, \omega)$ realizing for every $r$ 
an isomorphism between $T_r(O(M))$ and $\mathbb{R}^d\times so(d)$. 

If $A_\alpha$, $\alpha = 1,\ldots,d$, denote the horizontal vector 
fields, which are defined by $\langle\theta, A_\alpha\rangle = \varepsilon_\alpha$, 
$\langle\omega, A_\alpha\rangle = 0$, where $\varepsilon_\alpha$ are the vectors of the canonical 
basis of $\mathbb{R}^d$, then the horizontal Laplacian in $O(M)$ 
is the operator 

$$\Delta_{O(M)} = \sum_{\alpha=1}^d A_\alpha^2$$

and we have $\Delta_{O(M)}(f\circ\pi) = (\Delta_M f)\circ\pi$. With the 
Laplacians on $M$ and on $O(M)$ inducing two 
probability measures, the canonical projection realizes 
an isomorphism between the corresponding 
probability spaces. 

The Stratonovich SDE 

$$dr_\omega = \sum_{\alpha} A_\alpha(r_\omega)\circ d\omega^\alpha, \qquad r_\omega(0) = r_0$$

with $\pi(r_0) = m_0$ defines the lifting to $O(M)$ of the Itô 
parallel transport along the Brownian curve and we 
write $t^p_{\tau\leftarrow 0}\,r_0 = r_\omega(\tau)$. The Itô map was defined by 
Malliavin as the map $I : X\to P(M)$ given by 

$$I(\omega)(\tau) = \pi(r_\omega(\tau))$$

This map is a.s. bijective and we have $\nu = \mu\circ I^{-1}$; 
therefore, it provides an isomorphism of measures 
from the curved path space to the "flat" Wiener 
space. 

For a cylindrical functional $F = f(p(\tau_1),\ldots,p(\tau_m))$ 
on $P(M)$, the derivatives are defined by 

$$D_{\tau,\alpha}F(p) = \sum_{k=1}^m 1_{\tau<\tau_k}\,\big(t^p_{0\leftarrow\tau_k}(\partial_k F)\,\big|\,\varepsilon_\alpha\big)$$

The derivative operator is closable in a suitable 
Sobolev space. 

It would be reasonable to think that the differentiable 
structure considered in the Wiener space 
would be conserved through the isomorphism $I$ and 
that the tangent space of $P(M)$ would consist of 
transported vectors from the tangent space to $X$, 
namely Cameron-Martin vectors. Let us take a map 
$Z_p(\tau)\in T_{p(\tau)}(M)$ such that $z(\tau) = t^p_{0\leftarrow\tau}\,Z_p(\tau)$ belongs 
to the Cameron-Martin space $H$. 
In order to transfer derivatives to the Wiener 
space, we need to differentiate the Itô map. We have 
(Cruzeiro and Malliavin (1996)): 


Theorem The Jacobian matrix of the flow ro ^ 
r7) is given by the linear map J...;= ÁA ood € 
GL(R x so(d)) defined by the system of Stratonovich 
SDE's 


dt, = Y t.) odas 


a-1 
df, = Enl ses )odaut?) 
a= 


where $\Omega$ denotes the curvature tensor of the underlying 
manifold read on the frame bundle. 


From this result we can deduce the behavior of the 
derivatives transferred to the Wiener space, a result 
whose origin is due to B Driver. We have, for a 
"vector field" $Z_p(\tau)$ on $P(M)$ as above, 

$$(D_Z F)\circ I = D_\xi(F\circ I)$$

with $\xi$ solving 

$$\begin{aligned} d\xi(\tau) &= \dot z(\tau)\,d\tau + \rho\circ d\omega(\tau) \\ d\rho(\tau) &= \Omega(\circ\,d\omega(\tau), z(\tau)) \end{aligned}$$

The process $\xi$ is no longer Cameron-Martin space 
valued. Nevertheless, it satisfies an SDE with an 
antisymmetric diffusion coefficient (given by the 
curvature) and therefore, by Lévy's theorem, it still 
corresponds to a transformation of the Wiener space 
that leaves the measure quasi-invariant. We extend, 
accordingly, the notion of tangent space in the 
Wiener space to include processes of the form 
$d\xi^\alpha = a^\alpha_\beta\,d\omega^\beta + c^\alpha\,d\tau$, with $a^\alpha_\beta + a^\beta_\alpha = 0$. These 
were called "tangent processes" in Cruzeiro and 
Malliavin (1996). 

Another important consequence of the last theorem
is the integration-by-parts formula in the curved
setting, initially proved by Bismut (1984):

$$E_\nu(D_Z F) = E_\mu\left( (F \circ I) \int_0^1 \left[ \dot z + \tfrac{1}{2}\mathrm{Ricci}(z) \right] dw(\tau) \right)$$

where Ricci is the Ricci tensor of M read on the
frame bundle.


Some Applications 


We already mentioned that Malliavin calculus has
been applied to various domains connected with
physics. We shall describe here some of its relations
with elementary quantum mechanics.

Feynman gave a path space formulation of
quantum theory whose fundamental tool is the
concept of transition element of a functional $F(w)$
between any two $L^2$-states $\psi_s$ and $\phi_u$, for paths $w$
defined on a time interval [s, u]:

$$\langle F \rangle_{S_L} = \langle \phi | F \psi \rangle_{S_L} = \int\!\!\int\!\!\int \psi_s(x) \exp\left( \frac{i}{\hbar} S_L[w(\cdot); u - s] \right) F(w)\, \bar\phi_u(z)\, \mathcal{D}w \, dx \, dz \qquad [6]$$

This is a shorthand for the time discretization
version along broken paths $w$ interpolating
linearly between points $x_j = w(t_j)$, $t_j = j(u - s)/N$,
$j = 0, 1, \ldots, N$. In [6] $\hbar$ is Planck's constant and
$S = S_L$ denotes the action functional with Lagrangian
L of the underlying classical system. For a
particle with mass m in a scalar potential V on the
real line,

$$S_L[w(\cdot); u - s] = \int_s^u \left( \frac{m}{2} \dot w^2(\tau) - V(w(\tau)) \right) d\tau \qquad [7]$$

The "$\mathcal{D}w$" of [6] is used as a Lebesgue measure,
although there is no such thing in infinite dimensions.
More generally, the construction of measures
or integrals on the various path spaces required for
general quantum systems is still a field of
investigation.

When $F = 1$ and $\bar\phi_u$ (the complex conjugate of $\phi_u$)
reduces to a Dirac mass at z, [6] is the path-integral
representation of the solution $\psi(z, u)$ of the initial-value
problem in $L^2$:

$$i\hbar \frac{\partial \psi}{\partial u} = H\psi, \qquad \psi(x, s) = \psi_s(x) \qquad [8]$$

where $H = -(\hbar^2/2)\Delta + V$ and when $S_L$ is as in [7].
Feynman's framework is time symmetric on J: when
$\psi_s = \delta_x$ (still for $F = 1$), [6] provides a path-integral
representation of the solution of the final-value
problem for $\bar\phi(x, s)$.

According to Feynman, "it would be possible to
use the integration-by-parts formula

$$\langle \delta F \rangle_{S_L} = -\frac{i}{\hbar} \langle F\, \delta S_L \rangle_{S_L} \qquad [9]$$

as a starting point to define the laws of quantum
mechanics" (Feynman and Hibbs 1965, p. 173). The
functional derivative $\delta F$ corresponds to variations of
the underlying paths in directions $\delta w$ and [9]
to an $L^2$ analog of [4].


Its first consequence, when $F = 1$, is the path
space counterpart of Newton's law, in the elementary
case [7],

$$\langle m \ddot w \rangle_{S_L} = -\langle \nabla V(w) \rangle_{S_L} \qquad [10]$$

where the left-hand side involves a time discretization
of the second derivative. When $F(w) = w(t)$,
Feynman obtains the path space version of the
Heisenberg commutation relation between position
and momentum observables:

$$\left\langle w(t)\, \frac{w(t) - w(t - \epsilon)}{\epsilon} - \frac{w(t + \epsilon) - w(t)}{\epsilon}\, w(t) \right\rangle_{S_L} = i\frac{\hbar}{m} \qquad [11]$$

and from this the crucial fact that "quantum
mechanical paths are very irregular. However, these
irregularities average out over a reasonable length of
time to produce a reasonable drift or average
velocity" (Feynman and Hibbs 1965, p. 177).

A probabilistic interpretation (cf. Cruzeiro
and Zambrini (1991)) of Feynman's calculus uses
(Bernstein) diffusion processes solving the SDE

$$dz(t) = \left( \frac{\hbar}{m} \right)^{1/2} dW(t) + \frac{\hbar}{m} \nabla \log \eta(z(t), t)\, dt \qquad [12]$$

where the drift stems from a positive solution of the
Euclidean version of the above final-value problem
for $\bar\phi$,

$$-\hbar \frac{\partial \eta}{\partial t} = H\eta, \qquad \eta(x, u) = \eta_u(x) \qquad [13]$$


For any regular function f, we can make sense of
the "continuous limit"

$$Df(z(t), t) = \lim_{\epsilon \downarrow 0} \frac{1}{\epsilon} E_t\left[ f(z(t + \epsilon), t + \epsilon) - f(z(t), t) \right] \qquad [14]$$

where $E_t$ denotes conditional expectation with
respect to the past $\mathcal{P}_t$, and check, indeed, that

$$Dz(t) = \frac{\hbar}{m} \nabla \log \eta(z(t), t)$$

is Feynman's "reasonable drift." Using the Feynman-Kac
formula, one shows that the diffusions [12]
have laws which are absolutely continuous with
respect to the Wiener measure of parameter $\hbar/m$,
with Radon-Nikodym density given by

$$\rho(z) = \frac{\eta(z(u), u)}{\eta(z(s), s)} \exp\left( -\frac{1}{\hbar} \int_s^u V(z(\tau))\, d\tau \right)$$

We can, therefore, use Malliavin calculus on the
path space of these diffusions and the associated
integration-by-parts formula to make sense of [9]
and all its consequences.

The probabilistic counterpart of the time symmetry
of Feynman's framework is interesting: Heisenberg's
original argument to deny the existence of
quantum trajectories (1927) was that any position
can be associated with two velocities. Feynman's
interpretation [11] and the definition [14] suggest
that this has to do with a past or future conditioning
at time t. Indeed, there is another description of
diffusions z(t) with respect to a family of future
σ-fields, using the Euclidean version of the initial-value
problem for $\psi$, underlying [6]. Another drift
built on the model of the drift in [12] results, and
Feynman's commutation relation [11] becomes
rigorous (without, of course, the factor i).
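
As a hedged numerical illustration of the real form of [11], the sketch below takes the simplest case, which is an assumption made here and not a case worked out in the text: zero potential and constant η, so that the drift in [12] vanishes and z(t) is a Brownian motion with variance parameter ℏ/m. The function name `commutator_estimate` and its parameters are hypothetical. It estimates by Monte Carlo the expectation of z(t)(z(t) − z(t − ε))/ε − ((z(t + ε) − z(t))/ε) z(t); for this driftless process the expectation equals ℏ/m for every ε > 0, since only the backward increment is correlated with z(t).

```python
import math
import random


def commutator_estimate(hbar_over_m, eps=0.1, t0=1.0, samples=200_000, seed=0):
    """Monte Carlo mean of z(t)*(backward velocity) - (forward velocity)*z(t).

    Assumption (not from the article): driftless case, z = sqrt(hbar/m) * W,
    sampled at the three times t0, t0 + eps, t0 + 2*eps.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(hbar_over_m)
    total = 0.0
    for _ in range(samples):
        z_prev = sigma * math.sqrt(t0) * rng.gauss(0.0, 1.0)            # z(t - eps)
        z_now = z_prev + sigma * math.sqrt(eps) * rng.gauss(0.0, 1.0)   # z(t)
        z_next = z_now + sigma * math.sqrt(eps) * rng.gauss(0.0, 1.0)   # z(t + eps)
        backward = (z_now - z_prev) / eps   # difference quotient seen from the past
        forward = (z_next - z_now) / eps    # difference quotient towards the future
        total += z_now * backward - forward * z_now
    return total / samples


if __name__ == "__main__":
    print(commutator_estimate(2.0))  # should be close to hbar/m = 2.0
```

Each difference quotient is of order ε^(-1/2), which is the irregularity of the paths, yet the combination has a finite mean independent of ε.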

We refer to Cruzeiro and Zambrini (1991) for a 
development of this approach using Malliavin 
calculus. 


See also: Euclidean Field Theory; Functional Integration
in Quantum Physics; Measure on Loop Spaces;
Stochastic Differential Equations; Stochastic
Hydrodynamics.


Introduction


Characteristic classes play an essential role in the 
study of global properties of vector bundles. 
Particularly important is the Euler class of real 
orientable vector bundles. A de Rham representative 
of the Euler class (for tangent bundles) first 
appeared in Chern's generalization of the Gauss- 
Bonnet theorem to higher dimensions. The repre- 
sentative is the Pfaffian of the curvature, whose 
cohomology class does not depend on the choice of 
connections. The Euler class of a vector bundle is 
also the obstruction to the existence of a nowhere- 
vanishing section. In fact, it is the Poincaré dual of 
the zero set of any section which intersects the zero 
section transversely. In the case of tangent bundles, 
it counts (algebraically) the zeros of a vector field on 
the manifold. That this is equal to the Euler 
characteristic number is known as the Hopf theo- 
rem. Also significant is the Thom class of a vector 
bundle: it is the Poincaré dual of the zero section in 
the total space. It induces, by a cup product, the 
Thom isomorphism between the cohomology of the 
base space and that of the total space with compact 
vertical support. The Thom isomorphism also exists and
plays an important role in K-theory.

Mathai and Quillen (1986) obtained a represen- 
tative of the Thom class by a differential form on 
the total space of a vector bundle. Instead of 
having a compact support, the form has a nice 
Gaussian peak near the zero section and exponen- 
tially decays along the fiber directions. The pull- 
back of Mathai-Quillen's Thom form by any 
section is a representative of the Euler class. By 
scaling the section, one obtains an interpolation 
between the Pfaffian of the curvature, which 
distributes smoothly on the manifold, and the 
Poincaré dual of the zero set, which localizes on 
the latter. This elegant construction proves to be
extremely useful in many situations, from the
study of Morse theory and analytic torsion in mathematics
to the understanding of topological (cohomological)
field theories in physics.

In this article, we begin with the construction of
Mathai-Quillen's Thom form. We also consider the
case with group actions, with a review of equivariant
cohomology and then Mathai-Quillen's construction
in this setting. Next, we show that much of
the above can be formulated as a "field theory" on a
superspace of one fermionic dimension. Finally,
we present the interpretation of topological field
theories using the Mathai-Quillen formalism.


Mathai-Quillen's Construction 
Berezin Integral and Supertrace 


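Let V be an oriented real vector space of dimension n
with a volume element $\nu \in \wedge^n V$ compatible with
the orientation. The "Berezin integral" of a form
$\omega \in \wedge^* V^*$ on V, denoted by $\int^B \omega$, is the pairing $\langle \nu, \omega \rangle$.
Clearly, only the top degree component of $\omega$
contributes. For example, if $\sigma \in \wedge^2 V^*$ is a 2-form, then

$$\int^B e^{\sigma} = \begin{cases} \dfrac{\langle \nu, \sigma^{\wedge (n/2)} \rangle}{(n/2)!}, & \text{if } n \text{ is even} \\ 0, & \text{if } n \text{ is odd} \end{cases}$$

If V has a Euclidean metric $(\cdot\,,\cdot)$, then $\nu$ is chosen to
be of unit norm. If $\Omega \in \mathrm{End}(V)$ is skew-symmetric,
then $\frac{1}{2}(\cdot\,, \Omega\, \cdot)$ is a 2-form and, if n is even, the
Pfaffian of $\Omega$ is

$$\mathrm{Pf}(\Omega) = \int^B \exp\left( \tfrac{1}{2} (\cdot\,, \Omega\, \cdot) \right)$$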

The Berezin integral can be defined on elements in
a graded tensor product $\wedge^* V^* \hat\otimes A$, where A is any
$\mathbb{Z}_2$-graded commutative algebra. For example, if we
consider the identity operator $x = \mathrm{id}_V$ as a V-valued
function on V, then dx is a 1-form on V valued in V,
and $(dx, \cdot)$ is a 1-form valued in $V^*$. Let $\{e_1, \ldots, e_n\}$
be an orthonormal basis of V and write $x = x^i e_i$,
where $x^i$ are the coordinate functions on V. We let

$$u(x) = \frac{(-1)^{n(n+1)/2}}{(2\pi)^{n/2}} \int^B \exp\left( -\tfrac{1}{2}(x, x) - (dx, \cdot) \right) \qquad [1]$$

The integrand is in $\Omega^*(V) \hat\otimes \wedge^* V^*$. The result is

$$u(x) = \frac{1}{(2\pi)^{n/2}}\, e^{-(x,x)/2}\, dx^1 \wedge \cdots \wedge dx^n$$

a Gaussian n-form whose (usual) integration on V is 1.

Let Cl(V) be the Clifford algebra of V. For any
orthonormal basis $\{e_i\}$, let $\gamma^i$ be the corresponding
generators of Cl(V) and let $\gamma = e_i \otimes \gamma^i \in V \otimes \mathrm{Cl}(V)$.
For any $\omega \in \wedge^k V^*$, we have


w(o,.... y) ji dium" vein" e CHV) 
If n is even, the Clifford algebra has a unique
$\mathbb{Z}_2$-graded irreducible spinor representation $S(V) =
S^+(V) \oplus S^-(V)$. For any element $a \in \mathrm{Cl}(V)$, the
supertrace is $\mathrm{str}\, a = \mathrm{tr}_{S^+(V)}\, a - \mathrm{tr}_{S^-(V)}\, a$. If $\Omega \in \mathrm{End}(V)$
is skew-symmetric, then


str ow( (y, 2j) = A(x) PECS) 
where 


^ 3/2 
A(X) = det| ————— 

(>) esi) 
More generally, supertrace can be defined on
$\mathrm{Cl}(V) \hat\otimes A$ for any $\mathbb{Z}_2$-graded commutative algebra
$A = A^+ \oplus A^-$. If $\Omega$ is skew-symmetric and $\sigma \in V^* \otimes
A^-$, then


"x E 


- Amy? f epZE) +a) [2] 


Representatives of the Euler and Thom Classes 


Let M be a smooth manifold and let $\pi: E \to M$
be an oriented real vector bundle of rank r. Suppose
E has a Euclidean structure $(\cdot\,,\cdot)$ and $\nabla$ is a
compatible connection. The curvature $R \in \Omega^2(M, \mathrm{End}(E))$
is skew-symmetric, and hence $(\cdot\,, R\, \cdot) \in
\Omega^2(M, \wedge^2 E^*)$. A de Rham representative of the Euler
class of E is

$$e_\nabla(E) = (2\pi)^{-r/2} \int^B \exp\left( \tfrac{1}{2}(\cdot\,, R\, \cdot) \right) = \mathrm{Pf}\left( \frac{R}{2\pi} \right) \qquad [3]$$

Here, the Berezin integration is fiberwise in E: it is
the pairing between the integrand and the unit
section $\nu$ of the trivial line bundle $\wedge^r E$ that is
consistent with the orientation of E. The de Rham
cohomology class of [3] is independent of the choice
of $(\cdot\,,\cdot)$ or $\nabla$.

Let s be a section of E. Following Berline et al.
(1992) and Zhang (2001), we consider

$$S_{\nabla,s} = \tfrac{1}{2}(s, s) + (\nabla s, \cdot) + \tfrac{1}{2}(\cdot\,, R\, \cdot) \qquad [4]$$

a differential form on M valued in $\wedge^* E^*$. Mathai-Quillen's
representative of the Euler class is

$$e_{\nabla,s}(E) = \frac{(-1)^{r(r+1)/2}}{(2\pi)^{r/2}} \int^B e^{-S_{\nabla,s}} \qquad [5]$$

One can show that $e_{\nabla,s}(E)$ is closed and that as
$\beta$ varies, the cohomology class of $e_{\nabla,\beta s}(E)$ does not
change. By taking $\beta \to 0$, the de Rham class of
$e_{\nabla,s}(E)$ is equal to that of $e_\nabla(E)$ when r is even. The
form $e_{\nabla,\beta s}(E)$ provides a continuous interpolation
between [3] and the limit as $\beta \to \infty$, when the form
is concentrated on the zero locus of the section s. In
fact, the Euler class is the Poincaré dual to the
homology class represented by $s^{-1}(0)$. Hence, if
$n = \dim M \ge r$ and if $\omega \in \Omega^{n-r}(M)$ is closed, we have

$$\int_M \omega \wedge e_{\nabla,s}(E) = \int_{s^{-1}(0)} \omega \qquad [6]$$

when s intersects the zero section transversely.

To obtain Mathai-Quillen's representative of the
Thom class, we consider the pullback of E to E itself.
The bundle $\pi^* E \to E$ has a tautological section x.
Applying [5] to this setting, we get

$$\tau_\nabla(E) = \frac{(-1)^{r(r+1)/2}}{(2\pi)^{r/2}} \int^B \exp\left( -\tfrac{1}{2}(x, x) - (\nabla x, \cdot) - \tfrac{1}{2}(\cdot\,, R\, \cdot) \right) \qquad [7]$$

where $(\cdot\,,\cdot)$, $\nabla$, and R are understood to be the
pullbacks to $\pi^* E$. This is a closed form on the total
space of E. Moreover, its restriction to each fiber
is the Gaussian form [1]. The cohomology groups
of differential forms with exponential decay along
the fibers are isomorphic to those with compact
vertical support or the relative cohomology groups
$H^*(E, E \setminus M)$. Here M is identified with its image
under the inclusion $i: M \to E$ by the zero section.
Under the above isomorphism, the cohomology
class represented by $\tau_\nabla(E)$ coincides with the
Thom class $\tau(E) = i_* 1 \in H^r(E, E \setminus M)$ defined topologically.
For any section $s \in \Gamma(E)$, we have
$e_{\nabla,s}(E) = s^* \tau_\nabla(E)$.
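
The localization behaviour can be checked numerically in the simplest possible case. The sketch below is an illustration under assumptions chosen here, not an example from the article: the trivial real line bundle over $M = \mathbb{R}$ with the flat connection, and the section $s(x) = x^3 - x$, which has transverse zeros at $-1, 0, 1$ with signs $+, -, +$. For rank 1 the pullback of the Gaussian form by $\beta s$ is $(2\pi)^{-1/2} \beta s'(x) e^{-\beta^2 s(x)^2/2} dx$. Its integral over the line is independent of $\beta$ and equals the signed count of zeros, and for large $\beta$ each zero contributes its own sign. All function names are hypothetical.

```python
import math


def s(x):
    """Section of the trivial line bundle over R; zeros at -1, 0, 1."""
    return x ** 3 - x


def ds(x):
    """Derivative of s, i.e. the covariant derivative for the flat connection."""
    return 3 * x ** 2 - 1


def mq_density(x, beta):
    """Coefficient of dx in the pullback of the Gaussian 1-form by beta * s."""
    return beta * ds(x) * math.exp(-0.5 * (beta * s(x)) ** 2) / math.sqrt(2 * math.pi)


def integrate(f, a, b, steps=200_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


if __name__ == "__main__":
    # Total integral: the same for every beta (signed number of zeros).
    for beta in (0.5, 1.0, 5.0):
        print(beta, integrate(lambda x: mq_density(x, beta), -4.0, 4.0))
    # Large beta: each zero contributes its own sign.
    for lo, hi in ((-1.5, -0.5), (-0.5, 0.5), (0.5, 1.5)):
        print((lo, hi), integrate(lambda x: mq_density(x, 50.0), lo, hi))
```

Because the base here is noncompact and the rank is odd, this example illustrates only the independence of $\beta$ and the concentration on $s^{-1}(0)$, not the comparison with the Pfaffian of the curvature.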


Character Form of the Thom Class in K-Theory 


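Let $E = E^+ \oplus E^-$ be a $\mathbb{Z}_2$-graded vector bundle over
M. The spaces $\Omega^*(M, E)$, $\Gamma(\mathrm{End}(E))$ and $\Omega^*(M) \hat\otimes
\Gamma(\mathrm{End}(E))$ are also $\mathbb{Z}_2$-graded. The action of $\alpha \otimes T \in
\Omega^*(M) \hat\otimes \Gamma(\mathrm{End}(E))$ on $\beta \otimes s \in \Omega^*(M, E)$ is

$$\alpha \otimes T : \beta \otimes s \mapsto (-1)^{|T||\beta|} (\alpha \wedge \beta) \otimes (Ts)$$

The supertrace of $A \in \Gamma(\mathrm{End}(E))$ is $\mathrm{str}\, A = \mathrm{tr}_{E^+} A -
\mathrm{tr}_{E^-} A$; it extends $\Omega^*(M)$-linearly to $\mathrm{str}: \Omega^*(M) \hat\otimes
\Gamma(\mathrm{End}(E)) \to \Omega^*(M)$. Let $\nabla$ be a connection on E
preserving the grading. $\nabla$ is an odd operator on
$\Omega^*(M, E)$. If $L \in \Gamma(\mathrm{End}(E))^-$ is odd, then $D = \nabla + L$
is called a "superconnection" on E; the "curvature"
$D^2 = R + \nabla L + L^2 \in (\Omega^*(M) \hat\otimes \Gamma(\mathrm{End}(E)))^+$ is even.
With the superconnection, the Chern character of
the virtual vector bundle $E^+ \ominus E^-$ can be represented
by

$$\mathrm{ch}_{\nabla,L}(E^+, E^-) = \mathrm{str} \exp\left( \frac{\sqrt{-1}}{2\pi} D^2 \right) \qquad [8]$$

It is a closed form on M and its de Rham
cohomology class is independent of the choice of
$\nabla$ or L. If L is invertible everywhere on M and the
eigenvalues of $\sqrt{-1}\, L^2$ are negative, then [8] is exact: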


chy, a a 
g" “afol = (V + BL) ID d8 


Now let E be an oriented real vector bundle of
rank $r = 2n$ over M with a Euclidean structure $(\cdot\,,\cdot)$.
Suppose further that E has a spin structure. The
associated spinor bundle $S(E) = S^+(E) \oplus S^-(E)$ is a
graded complex vector bundle over M. For any
section $s \in \Gamma(E)$, let $c(s) \in \Gamma(\mathrm{End}(S(E)))^-$ be the Clifford
multiplication on S(E). Then for any $s, s' \in \Gamma(E)$,
we have $\{c(s), c(s')\} = -2(s, s')$. Given a connection $\nabla$
on E preserving $(\cdot\,,\cdot)$, the induced spinor connection
$\nabla^S$ on S(E) preserves the grading. If R is the curvature
of $\nabla$, that of $\nabla^S$ is $R^S = -\frac{1}{4}(\gamma, R\gamma)$, where $\gamma$ is now
a section of $E \otimes \mathrm{Cl}(E)$. For any $s \in \Gamma(E)$, consider the
superconnection


- 1/2 
D, - v5 (7) c(s) 


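The Chern character form [8] of $S^+(E) \ominus S^-(E)$ is,
using [2],

$$\mathrm{ch}_{\nabla^S,s}(S^+(E), S^-(E)) = (-1)^n \hat A\left( \frac{R}{2\pi} \right)^{-1} e_{\nabla,s}(E) \qquad [9]$$

where $e_{\nabla,s}(E)$ is given by [5]. In cohomology groups,
[9] reduces to

$$\mathrm{ch}(S^+(E)) - \mathrm{ch}(S^-(E)) = (-1)^n \hat A(E)^{-1} e(E)$$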


If M is noncompact and the norm of s increases
rapidly away from $s^{-1}(0)$, then both sides of [9] are
differential forms that decay rapidly away from
$s^{-1}(0)$ and can represent cohomology classes of such.
As before, we take the pullback $\pi^* E$ with the
tautological section x. Then [9] becomes

$$\mathrm{ch}_{\pi^*\nabla^S, x}(\pi^* S^+(E), \pi^* S^-(E)) = (-1)^n\, \pi^* \hat A\left( \frac{R}{2\pi} \right)^{-1} \tau_\nabla(E) \qquad [10]$$

where $\tau_\nabla(E)$ is given by [7]. Both sides of [10] are
forms on E that decay exponentially in the fiber
directions; hence, it descends to an equality in
$H^*(E, E \setminus M)$. In the relative K-group $K(E, E \setminus M)$,
the pair $\pi^* S^\pm(E)$ with the isomorphism c(x) away
from the zero section is, up to a factor of $(-1)^n$, the
K-theoretic Thom class $i_! 1 \in K(E, E \setminus M)$. Therefore,
[10] reduces to the well-known formula

$$\mathrm{ch}(i_! 1) = \pi^* \hat A(E)^{-1}\, i_* 1$$

in cohomology groups $H^*(E, E \setminus M)$. The refinement
[10] as an equality of differential forms is
due to Mathai and Quillen (1986). In fact, this is
how [7] was derived originally.


Equivariant Cohomology and Equivariant 
Vector Bundles 


Equivariant Cohomology 


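Let G be a compact Lie group with Lie algebra $\mathfrak{g}$.
Fixing a basis $\{e_a\}$ of $\mathfrak{g}$, the structure constants are
given by $[e_a, e_b] = t^c_{ab} e_c$. Let $\{\vartheta^a\}$ and $\{\varphi^a\}$ be the dual
bases of $\mathfrak{g}^*$ generating the exterior algebra $\wedge(\mathfrak{g}^*)$ and
the symmetric algebra $S(\mathfrak{g}^*)$, respectively. The Weil
algebra is $W(\mathfrak{g}) = \wedge(\mathfrak{g}^*) \otimes S(\mathfrak{g}^*)$. We define a grading
on $W(\mathfrak{g})$ by specifying $\deg \vartheta^a = 1$, $\deg \varphi^a = 2$. The
contraction $\iota_a$ and the exterior derivative d are two
odd derivations on $W(\mathfrak{g})$ defined by

$$\iota_a \vartheta^b = \delta_a^b, \qquad \iota_a \varphi^b = 0$$
$$d\vartheta^a = -\tfrac{1}{2} t^a_{bc} \vartheta^b \vartheta^c + \varphi^a, \qquad d\varphi^a = -t^a_{bc} \vartheta^b \varphi^c \qquad [11]$$

The Lie derivative is $L_a = \{\iota_a, d\}$. These operators
satisfy the usual (anti-)commutation relations

$$L_a = \{\iota_a, d\}, \qquad [L_a, d] = 0 \qquad [12]$$
$$\{\iota_a, \iota_b\} = 0, \qquad [L_a, L_b] = t^c_{ab} L_c, \qquad [L_a, \iota_b] = t^c_{ab} \iota_c \qquad [13]$$

The cohomology of $(W(\mathfrak{g}), d)$ is trivial.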

If G acts smoothly on a manifold M on the left, let
$V_a$ be the vector field generated by the Lie algebra
element $-e_a \in \mathfrak{g}$. Then, $[V_a, V_b] = t^c_{ab} V_c$. Denote
$\iota_a = \iota_{V_a}$ and $L_a = L_{V_a}$, acting on $\Omega^*(M)$. In the Weil
model of equivariant cohomology, one considers the
graded tensor product $W(\mathfrak{g}) \hat\otimes \Omega^*(M)$, on which the
operators

$$\tilde\iota_a = \iota_a \otimes 1 + 1 \otimes \iota_a$$
$$\tilde d = d \otimes 1 + 1 \otimes d$$
$$\tilde L_a = L_a \otimes 1 + 1 \otimes L_a$$

act and satisfy the same relations [12] and [13].
An element $\omega \in W(\mathfrak{g}) \hat\otimes \Omega^*(M)$ is "basic" if it
satisfies $\tilde\iota_a \omega = 0$, $\tilde L_a \omega = 0$ for all indices a. Let
$\Omega_G^*(M) = (W(\mathfrak{g}) \hat\otimes \Omega^*(M))_{\mathrm{bas}}$ be the set of such.
Elements of $\Omega_G^*(M)$ are equivariant differential
forms on M. The operator $\tilde d$ preserves $\Omega_G^*(M)$
and its cohomology groups $H_G^*(M)$ are the equivariant
cohomology groups of M. They are
isomorphic to the singular cohomology groups of
$EG \times_G M$ with real coefficients.

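The BRST model of Kalkman (1993) is obtained
by applying an isomorphism $\sigma = e^{\vartheta^a \otimes \iota_a}$ of $W(\mathfrak{g}) \hat\otimes
\Omega^*(M)$. The operators become

$$\sigma \circ \tilde\iota_a \circ \sigma^{-1} = \iota_a \otimes 1$$
$$\sigma \circ \tilde d \circ \sigma^{-1} = \tilde d - \varphi^a \otimes \iota_a + \vartheta^a \otimes L_a$$
$$\sigma \circ \tilde L_a \circ \sigma^{-1} = \tilde L_a$$

The subspace of basic forms in the Weil model
becomes

$$\sigma(\Omega_G^*(M)) = (S(\mathfrak{g}^*) \otimes \Omega^*(M))^G$$

This is precisely the Cartan model of equivariant
cohomology, in which the exterior differential is

$$\tilde d' = 1 \otimes d - \varphi^a \otimes \iota_a$$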

If P is a principal G-bundle over a base space B,
we can form an associated bundle $P \times_G M \to B$.
Choose a connection on P and let $\Theta = \Theta^a e_a \in
\Omega^1(P) \otimes \mathfrak{g}$, $\Phi = \Phi^a e_a \in \Omega^2(P) \otimes \mathfrak{g}$ be the connection
and curvature forms, respectively. The components
$\Theta^a$, $\Phi^a$ satisfy the same relations [11]. Replacing
$\vartheta^a$, $\varphi^a$ by $\Theta^a$, $\Phi^a$, we have a homomorphism that
maps $\omega \in W(\mathfrak{g}) \hat\otimes \Omega^*(M)$ to $\hat\omega \in \Omega^*(P \times M)$. If $\omega$ is
basic, then so is $\hat\omega$, and the latter descends to a form
$\bar\omega$ on $P \times_G M$. Furthermore, the operator $\tilde d$ on $\Omega_G^*(M)$
descends to d on $\Omega^*(P \times_G M)$. Thus, we get the
Chern-Weil homomorphisms $\Omega_G^*(M) \to \Omega^*(P \times_G M)$
and $H_G^*(M) \to H^*(P \times_G M)$. For example, the vector
space $\mathbb{R}^r$ has an obvious SO(r) action. The Gaussian
r-form [1] is invariant under SO(r) and can be
extended to an SO(r)-equivariant closed r-form,
called the "universal Thom form." Let E be an
orientable real vector bundle of rank r with a
Euclidean structure. E determines a principal SO(r)-bundle
P; the associated bundle $P \times_{SO(r)} \mathbb{R}^r$ is E itself.
By applying the Chern-Weil homomorphism to this
setting, we get a closed r-form on E. This is another
construction of the Thom form [7] by Mathai and
Quillen (1986). Further information on equivariant
cohomology can be found there, and in Berline et al.
(1992) and Guillemin and Sternberg (1999).


Equivariant Vector Bundles 


Recall that a connection on a vector bundle $E \to M$
determines, for any $k \ge 0$, a differential operator

$$\nabla : \Omega^k(M, E) \to \Omega^{k+1}(M, E)$$

The curvature $R = \nabla^2 \in \Omega^2(M, \mathrm{End}(E))$ satisfies the
Bianchi identity $\nabla R = 0$. If the connection preserves a
Euclidean structure on E, then R is skew-symmetric.

If a Lie group G acts on M and the action can be
lifted to E, then G also acts on the spaces $\Gamma(E)$ and
$\Omega^*(M, E)$. As before, the Lie derivatives $L_a$ on these
spaces are the infinitesimal actions of $-e_a \in \mathfrak{g}$. We
choose a G-invariant connection on E. The
"moment" of the connection $\nabla$ under the G-action
is $\mu_a = L_a - \nabla_{V_a}$ acting on $\Gamma(E)$. In fact, $\mu_a$ is a
section of End(E) or $\mu \in \Gamma(\mathrm{End}(E)) \otimes \mathfrak{g}^*$. If a
Euclidean structure on E is preserved by both the
connection and the G-action, then $\mu_a$ is skew-symmetric.
On $\Omega^*(M, E)$, we have

$$\{\iota_a, \nabla\} = L_a - \mu_a, \qquad [L_a, \nabla] = 0$$
$$\iota_a R = \nabla \mu_a, \qquad L_a \mu_b = t^c_{ab} \mu_c$$
$$[\mu_a, \mu_b] = t^c_{ab} \mu_c + R_{ab}$$

where $R_{ab} = R(V_a, V_b) \in \Gamma(\mathrm{End}(E))$.

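On the graded tensor product $W(\mathfrak{g}) \hat\otimes \Omega^*(M, E)$,
the contraction $\tilde\iota_a$ and the Lie derivative $\tilde L_a$ act and
satisfy [13]. In the Weil model, equivariant differential
forms on M with values in E are the basic
elements in $W(\mathfrak{g}) \hat\otimes \Omega^*(M, E)$, which form a subspace
$\Omega_G^*(M, E) = (W(\mathfrak{g}) \hat\otimes \Omega^*(M, E))_{\mathrm{bas}}$. The "equivariant
covariant derivative" is

$$\tilde\nabla = d \otimes 1 + 1 \otimes \nabla + \vartheta^a \otimes \mu_a \qquad [14]$$

One checks that $\{\tilde\iota_a, \tilde\nabla\} = \tilde L_a$ and hence $\tilde\nabla$ preserves
the basic subspace $\Omega_G^*(M, E)$. The equivariant curvature
$\tilde R = \tilde\nabla^2$ is

$$\tilde R = R - \vartheta^a\, \nabla \mu_a + \varphi^a \mu_a + \tfrac{1}{2} \vartheta^a \vartheta^b R_{ab} \qquad [15]$$

It satisfies the equivariant Bianchi identity $\tilde\nabla \tilde R = 0$.
Equivariant characteristic forms are invariant polynomials
of $\tilde R$. They are equivariantly closed and
their equivariant cohomology classes do not depend
on the choice of the G-invariant connection. Hence,
they represent the equivariant characteristic classes
of E in $H_G^*(M)$.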

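For the BRST model, we use a similar isomorphism
$\sigma = e^{\vartheta^a \otimes \iota_a}$ on $W(\mathfrak{g}) \hat\otimes \Omega^*(M, E)$. The operators
become

$$\sigma \circ \tilde\iota_a \circ \sigma^{-1} = \iota_a \otimes 1$$
$$\sigma \circ \tilde\nabla \circ \sigma^{-1} = d \otimes 1 + 1 \otimes \nabla - \varphi^a \otimes \iota_a + \vartheta^a \otimes L_a$$
$$\sigma \circ \tilde L_a \circ \sigma^{-1} = \tilde L_a$$

and the basic subspace turns into

$$\sigma(\Omega_G^*(M, E)) = (S(\mathfrak{g}^*) \otimes \Omega^*(M, E))^G$$

This is the Cartan model, which can be found in
Berline et al. (1992). The equivariant covariant
derivative is

$$\tilde\nabla' = 1 \otimes \nabla - \varphi^a \otimes \iota_a$$

The equivariant curvature is $\tilde R' = (\tilde\nabla')^2 = R + \varphi^a \mu_a$
and the characteristic forms are defined similarly.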

Let $P \to B$ be a principal G-bundle with a
connection $\Theta$. Following [14], the bundle $P \times E \to
P \times M$ has a connection

$$\hat\nabla = d \otimes 1 + 1 \otimes \nabla + \Theta^a \otimes \mu_a$$

It descends to a connection $\bar\nabla$ on the vector bundle
$P \times_G E \to P \times_G M$. The map $\nabla \mapsto \bar\nabla$ can be considered
as the analog of the Chern-Weil homomorphism
for connections. There is also a homomorphism
$\Omega_G^*(M, E) \to \Omega^*(P \times_G M, P \times_G E)$, which commutes
with the covariant derivatives $\tilde\nabla$, $\bar\nabla$. The curvature
$\bar R = \bar\nabla^2$ is the image of the equivariant curvature $\tilde R$.
Consequently, the equivariant characteristic forms
descend to those of $P \times_G E \to P \times_G M$ by the usual
Chern-Weil homomorphism.

Now let $E = E^+ \oplus E^-$ be a graded vector bundle
over M with a G-action preserving all the structures.
We have the $\Omega_G^*(M)$-linear supertrace map $\mathrm{str}:
\Omega_G^*(M) \hat\otimes \Gamma(\mathrm{End}(E)) \to \Omega_G^*(M)$. If $\nabla$ is a G-invariant
connection on E preserving the grading and if
$L \in \Gamma(\mathrm{End}(E))^-$ is odd and G-invariant, then
$\tilde D = \tilde\nabla + L$ is an "equivariant superconnection."
The equivariant counterpart of [8] is

$$\mathrm{ch}_{G,\nabla,L}(E^+, E^-) = \mathrm{str} \exp\left( \frac{\sqrt{-1}}{2\pi} \tilde D^2 \right) \in \Omega_G^*(M)$$

representing the equivariant Chern character of
$E^+ \ominus E^-$ in $H_G^*(M)$.


Representatives of the Equivariant Euler 
and Thom Classes 


Consider an oriented real vector bundle $E \to M$ of
rank r with a Euclidean structure $(\cdot\,,\cdot)$. Choose a
connection $\nabla$ on E preserving $(\cdot\,,\cdot)$. We assume that
a Lie group G acts on M and that the action can be
lifted to E preserving all the structures on E. We
use the Weil model; the constructions in the Cartan
model are similar. For any $\alpha \in \Omega_G^k(M, E)$ and
$\beta \in \Omega_G^l(M, E)$, we obtain $(\alpha, \wedge \beta) \in \Omega_G^{k+l}(M)$ by
taking the wedge product of forms as well as the
pairing in E. The Berezin integral of $\omega \in \Omega_G^*(M, \wedge^* E^*)$
along the fibers of E is $\int^B \omega = \langle \nu, \omega \rangle \in \Omega_G^*(M)$. Here,
$\nu$ is the unit section of the canonically trivial
determinant line bundle $\wedge^r E$, compatible with the
orientation of E. The equivariant Euler form

$$e_{G,\nabla}(E) = (2\pi)^{-r/2} \int^B \exp\left( \tfrac{1}{2}(\cdot\,, \tilde R\, \cdot) \right) = \mathrm{Pf}\left( \frac{\tilde R}{2\pi} \right) \qquad [16]$$

is equivariantly closed. It represents the equivariant
Euler class $e_G(E) \in H_G^r(M)$.


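Given a G-invariant section $s \in \Gamma(E)^G$, the equivariant
counterpart of [4] is

$$S_{G,\nabla,s} = \tfrac{1}{2}(s, s) + (\tilde\nabla s, \cdot) + \tfrac{1}{2}(\cdot\,, \tilde R\, \cdot) \qquad [17]$$

and that of Mathai-Quillen's Euler form [5] is

$$e_{G,\nabla,s}(E) = \frac{(-1)^{r(r+1)/2}}{(2\pi)^{r/2}} \int^B e^{-S_{G,\nabla,s}} \qquad [18]$$

It is also equivariantly closed, and its equivariant
cohomology class is $e_G(E)$. The equivariant extension
of Mathai-Quillen's Thom form [7] is

$$\tau_{G,\nabla}(E) = \frac{(-1)^{r(r+1)/2}}{(2\pi)^{r/2}} \int^B \exp\left( -\tfrac{1}{2}(x, x) - (\tilde\nabla x, \cdot) - \tfrac{1}{2}(\cdot\,, \tilde R\, \cdot) \right) \qquad [19]$$

where x is the (G-invariant) tautological section of
$\pi^* E \to E$.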

Finally, G acts on the (graded) spinor bundle S(E). 
Using the equivariant superconnection 


Dy s (=) me) 


[9] generalizes to

$$\mathrm{ch}_{G,\nabla^S,s}(S^+(E), S^-(E)) = (-1)^n \hat A\left( \frac{\tilde R}{2\pi} \right)^{-1} e_{G,\nabla,s}(E)$$

Now apply the construction to the bundle $\pi^* E \to E$
and its tautological section x. The pair $\pi^* S^\pm(E)$ with
an odd bundle map c(x) determines, up to a factor
of $(-1)^n$, the Thom class $i_! 1$ in the equivariant
K-group $K_G(E, E \setminus M)$. The equivariant analog of
[10] descends to

$$\mathrm{ch}_G(i_! 1) = \pi^* \hat A_G(E)^{-1}\, i_* 1$$

in equivariant cohomology.


Superspace Formulation 


Mathai-Quillen Formalism and the
Superspace $\mathbb{R}^{0|1}$


Let $\mathbb{R}^{0|1}$ be the superspace with one fermionic
coordinate $\theta$ but no bosonic coordinates. The
translation on $\mathbb{R}^{0|1}$ is generated by $D = \partial/\partial\theta$,
which satisfies $\{D, D\} = 0$. We consider a sigma
model on $\mathbb{R}^{0|1}$ whose target space is an (ordinary)
smooth manifold M of dimension n. A map
$X: \mathbb{R}^{0|1} \to M$ can be written as $X(\theta) = x + \sqrt{-1}\,\theta\psi$.
Here, $x = X|_{\theta=0} \in M$ and $\psi = -\sqrt{-1}\, DX|_{\theta=0} \in T_x M$;
the latter is fermionic. Under the translation
$\theta \mapsto \theta + \epsilon$, x and $\psi$ vary according to the supersymmetry
transformations

$$\delta x = \epsilon\, DX|_{\theta=0} = \sqrt{-1}\,\epsilon\psi \quad \text{and} \quad \delta\psi = \epsilon\, D(DX)|_{\theta=0} = 0 \qquad [20]$$

Clearly, $\delta^2 = 0$, which is also a consequence of $D^2 = 0$.

For any p-form $\omega \in \Omega^p(M)$, we have an observable

$$O_\omega(x, \psi) = \frac{(\sqrt{-1})^p}{p!}\, \omega_{i_1 \cdots i_p}(x)\, \psi^{i_1} \cdots \psi^{i_p}$$

Using $C(\cdot)$ to denote the set of function(al)s on a
space, we can identify $C(\mathrm{Map}(\mathbb{R}^{0|1}, M))$ with $\Omega^*(M)$.
Under [20], $\delta O_\omega(X) = \epsilon\, O_{d\omega}(X)$. So, $O_\omega(X)$ is invariant
under supersymmetry if and only if $\omega$ is closed.
The cohomology of $\delta$ is the de Rham cohomology of
M. Consider the measure $[dX] = [dx][d\psi]$. In local
coordinates, $[dx] = dx^1 \cdots dx^n$ is the standard (bosonic)
measure and $[d\psi] = d\psi^1 \cdots d\psi^n$ is a fermionic
measure such that


/ [du (—1)""7D/2 pl... up — 1 


For any $\omega \in \Omega^n(M)$, the superfield integral
$\int [dX]\, O_\omega(X)$ is equal to the usual integral $\int_M \omega$ if
the latter exists.

Let $E \to M$ be a real vector bundle of rank r with
an inner product $(\cdot\,,\cdot)$, and let $\nabla$ be a compatible
connection whose curvature is R. Consider a theory
whose fields are $X \in \mathrm{Map}(\mathbb{R}^{0|1}, M)$ and a fermionic
section $\Xi \in \Gamma(X^* E)$. Let $\mathcal{D} = (X^* \nabla)_D$ be the covariant
derivative along D in the pullback bundle
$X^* E \to \mathbb{R}^{0|1}$. Then, $\chi = \Xi|_{\theta=0} \in E_x$ is fermionic
and $f = \mathcal{D}\Xi|_{\theta=0} \in E_x$ is bosonic.

Given a fixed section $s \in \Gamma(E)$, we write a superspace
action

$$S_{MQ}[X, \Xi] = \int_{\mathbb{R}^{0|1}} d\theta\, \left( \Xi, \tfrac{1}{2}\mathcal{D}\Xi + \sqrt{-1}\, s \circ X \right)$$
$$= \tfrac{1}{2}(f, f) + \sqrt{-1}\,(f, s) - (\nabla_\psi s, \chi) + \tfrac{1}{4}(\chi, R(\psi, \psi)\chi) \qquad [21]$$

It is automatically supersymmetric. Performing the
Gaussian integral over f and replacing $\chi$ by $-\sqrt{-1}\,\chi$,
we get
| deese = Sy [lide pa


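where

$$S_{MQ}[x, \psi, \chi] = \tfrac{1}{2}(s, s) - \sqrt{-1}\,(\chi, \nabla_\psi s) - \tfrac{1}{4}(\chi, R(\psi, \psi)\chi) \qquad [23]$$

When r is even, [22] is equal to $O_{e_{\nabla,s}(E)}(X)$, where
$e_{\nabla,s}(E)$ is given by [5]. Furthermore, for any
closed form $\omega$ on M, the expectation value

$$\langle O_\omega(X) \rangle = \int [dX][d\Xi]\, O_\omega(X)\, e^{-S_{MQ}[X, \Xi]} \qquad [24]$$

is equal to [6].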


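Equivariant Cohomology and Gauged Sigma
Model on $\mathbb{R}^{0|1}$


Suppose G is a Lie group and P is a principal G-bundle
over $\mathbb{R}^{0|1}$. Since $\theta$ is nilpotent, we can choose a
"trivialization" of P such that the connection and
curvature are $A \in \Omega^1(\mathbb{R}^{0|1}) \otimes \mathfrak{g}$ and $F \in \Omega^2(\mathbb{R}^{0|1}) \otimes
\mathfrak{g}$, respectively. ($\mathfrak{g}$ is the Lie algebra of G.) In
components, $c = \sqrt{-1}\,\iota_D A \in \mathfrak{g}$ is fermionic and $\phi =
-(\sqrt{-1}/2)\,\iota_D^2 F \in \mathfrak{g}$ is bosonic. The space of connections
$\mathcal{A}$ is the set of pairs $(c, \phi)$. Under $\theta \mapsto \theta + \epsilon$,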


óc — € ( + lc, a) 25] 
60 = V—lelc, d] 


Thus, the algebra $C(\mathcal{A})$ is isomorphic to the Weil
algebra $W(\mathfrak{g})$ and $\delta$ corresponds to the differential d
in [11]. This relation between gauge theory on a
fermionic space and the Weil algebra can be found
in Blau and Thompson (1997).

With a trivialization of P, the group of gauge
transformations $\mathcal{G}$ can be identified with
$\mathrm{Map}(\mathbb{R}^{0|1}, G)$. Any group element is of the form
$\hat g = g\, e^{\sqrt{-1}\,\theta\xi}$, with $g = \hat g|_{\theta=0} \in G$ and $\xi = \sqrt{-1}\,\iota_D\, \hat g^* \omega \in \mathfrak{g}$
(fermionic), where $\omega$ is the Maurer-Cartan form on
G. The action of $\hat g$ is $A \mapsto A' = \mathrm{Ad}_{\hat g}(A - \hat g^* \omega)$, or
$c \mapsto c' = \mathrm{Ad}_g(c - \xi)$ and $\phi \mapsto \phi' = \mathrm{Ad}_g \phi$. By choosing
$\xi = c$, we obtain a new trivialization, called the
"Wess-Zumino gauge," in which $c = 0$. The residual
gauge redundancy is G, and $\mathcal{A}/\mathcal{G} = \mathfrak{g}/\mathrm{Ad}_G$. The
Wess-Zumino gauge is not preserved by the translation
on $\mathbb{R}^{0|1}$ unless we define $\delta'$ by composing $\delta$ with
a suitable (infinitesimal) gauge transformation. If so,
then $\delta' \phi = 0$.

Suppose M is a manifold with a left G-action. As
before, let $\{e_a\}$ be a basis of $\mathfrak{g}$ and let the vector field $V_a$
be the infinitesimal action of $-e_a$. In the gauged sigma
model, we include another field $X \in \Gamma(P \times_G M)$. With
a trivialization of P, we can identify X with a map
$X: \mathbb{R}^{0|1} \to M$. The covariant derivative is given by
$\nabla X = dX - A^a V_a$, $\mathcal{D}X = \nabla_D X$. Let $x = X|_{\theta=0} \in M$
and $\psi = -\sqrt{-1}\, \mathcal{D}X|_{\theta=0} \in T_x M$. Then the supersymmetric
transformations are
óx! = V —1e(u/ — c" Vi) 
by = —e(d* Vi + V 1d Vi) 
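In the Wess-Zumino gauge, the transformations
simplify to $\delta' x = \sqrt{-1}\,\epsilon\psi$, $\delta' \psi = -\epsilon\, \phi^a V_a$.

The observables form the G-invariant part of the
space $C(\mathcal{A} \times \mathrm{Map}(\mathbb{R}^{0|1}, M))$. For any $\omega \in \Omega^p(M)$,
we have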

|26] 


DX)|9—0 


= wing 27 
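$O_\omega(X, A)$ is gauge covariant: $O_\omega(X, A) \mapsto O_{g^*\omega}(X, A)$,
and the set of gauge-invariant observables is thus
identified with $(S(\mathfrak{g}^*) \otimes \Omega^*(M))^G$. Moreover, since

$$\delta O_\omega(X, A) = \epsilon \left( O_{d\omega}(X, A) - \sqrt{-1}\, c^a O_{L_a \omega}(X, A) - \phi^a O_{\iota_a \omega}(X, A) \right)$$

$\delta$ corresponds to the differential of the BRST model.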

Let $E \to M$ be an equivariant vector bundle and
let $\nabla$ be a G-invariant connection with curvature R
and moment $\mu$. Any $s \in \Gamma(E)^G$ defines a section of
$P \times_G E \to P \times_G M$, still denoted by s. Consider a
theory with superfields $X \in \Gamma(P \times_G M)$ and
$\Xi \in \Gamma(X^*(P \times_G E))$ (fermionic). Let $\mathcal{D}$ be the covariant
derivative of the pullback connection. With a
trivialization of P, we put $\chi = \Xi|_{\theta=0} \in E_x$ (fermionic)
and $f = \mathcal{D}\Xi|_{\theta=0} \in E_x$ (bosonic). The equivariant
extension of [21] is

$$S_{MQ}[X, \Xi, A] = \int_{\mathbb{R}^{0|1}} d\theta\, \left( \Xi, \tfrac{1}{2}\mathcal{D}\Xi + \sqrt{-1}\, s \circ X \right)$$


Similar to [22], we get, in the Wess-Zumino gauge, 


J meteo 7 y Ci fixe -Smo lx, yox] [28] 
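where

$$S_{MQ}[x, \psi, \phi, \chi] = \tfrac{1}{2}(s, s) - \sqrt{-1}\,(\chi, \nabla_\psi s) - \tfrac{1}{4}(\chi, R(\psi, \psi)\chi) - \tfrac{1}{2}(\chi, \phi^a \mu_a \chi) \qquad [29]$$

When r is even, [28] is equal to $O_{e_{G,\nabla,s}(E)}(X, A)$, where
$e_{G,\nabla,s}(E)$ is given by [18].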


The Atiyah-Jeffrey Formula 


Given the G-action on M, for any $x \in M$, there is a
linear map $C_x: \mathfrak{g} \to T_x M$ defined by $C_x(e_a) = V_a(x)$.
With an invariant inner product $(\cdot\,,\cdot)$ on $\mathfrak{g}$ and an
invariant Riemannian metric on M, the adjoint of
$C_x$ is $C_x^\dagger: T_x M \to \mathfrak{g}$, that is, $C^\dagger \in \Omega^1(M) \otimes \mathfrak{g}$. If G
acts on M freely, then $C_x$ is injective and $(C^\dagger C)_x$ is
invertible for all $x \in M$. The projection $M \to
\bar M = M/G$ is a principal G-bundle. It has a connection
such that the horizontal subspace is the orthogonal
complement of the G-orbits. The connection 1-form is
$\Theta = (C^\dagger C)^{-1} C^\dagger$, whereas the curvature is $\Phi = (C^\dagger C)^{-1}
dC^\dagger$ on horizontal vectors.
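
A minimal sketch of the connection 1-form, under assumptions chosen here for illustration only: $G = SO(2)$ acting by rotations on $\mathbb{R}^2$ with the Euclidean metric, so that the single fundamental vector field is $V(x) = (-x_2, x_1)$ (the overall sign depends on the convention for $V_a$ and does not affect the check) and $C^\dagger C = |V(x)|^2$. The function names are hypothetical. The form $(C^\dagger C)^{-1} C^\dagger$ should return 1 on $V(x)$ and 0 on vectors orthogonal to the orbit, and it is undefined at the origin, where the action is not free.

```python
def fundamental_field(x):
    """V(x) for the rotation action of SO(2) on R^2 (sign convention assumed)."""
    return (-x[1], x[0])


def connection_form(x, v):
    """Theta_x(v) = (C^dagger C)^{-1} C^dagger v, with C_x(e) = V(x)."""
    V = fundamental_field(x)
    ctc = V[0] ** 2 + V[1] ** 2  # C^dagger C is the squared length of V(x)
    if ctc == 0:
        raise ValueError("the action is not free at the origin")
    return (V[0] * v[0] + V[1] * v[1]) / ctc


if __name__ == "__main__":
    x = (3.0, 4.0)
    print(connection_form(x, fundamental_field(x)))  # vertical vector
    print(connection_form(x, x))                     # radial (horizontal) vector
```

In polar coordinates this is the angular 1-form, which is the expected connection on the punctured plane viewed as a circle bundle over the half-line of radii.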

Let $\omega$ be an equivariant form on M. Suppose G
acts on M freely, then $\omega$ descends to a form $\bar\omega$ on $\bar M$.
We look for a gauge-invariant, supersymmetric
quantity $\Upsilon(X, A)$ such that

$$\frac{1}{\mathrm{vol}(G)} \int [dX][dA]\, O_\omega(X, A)\, \Upsilon(X, A) = \int [d\bar X]\, O_{\bar\omega}(\bar X) \qquad [30]$$

Mathematically, $\Upsilon$ corresponds to a closed equivariant
form $\upsilon$ on M such that

$$\frac{1}{\mathrm{vol}(G)} \int_{\mathfrak{g}} [d\phi] \int_M \omega \wedge \upsilon = \int_{\bar M} \bar\omega$$

which is [30] in the Wess-Zumino gauge. In fact, $\upsilon$ is
distribution valued in the sense of Kumar and Vergne
(1993) and can be understood as an equivariant
homology cycle, as in Austin and Braam (1995).

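Let P be a G-bundle over $\mathbb{R}^{0|1}$ with a connection
and let $\mathrm{Ad}\, P = P \times_G \mathfrak{g} \to \mathbb{R}^{0|1}$ be the adjoint bundle.
Consider a (bosonic) superfield $\Lambda \in \Gamma(\mathrm{Ad}\, P)$. Set $\lambda =
\Lambda|_{\theta=0}$ (bosonic) and $\eta = -\sqrt{-1}\, \mathcal{D}\Lambda|_{\theta=0}$ (fermionic).
Choosing a trivialization of P, $\lambda$ and $\eta$ are both in $\mathfrak{g}$.
Under $\theta \mapsto \theta + \epsilon$, they transform as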

64 = V—1e(n + lc, A]) 


31 
én = e([d, AJ m v —1[e, rl) | | 


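The superspace action

$$S_{CMR}[X, \Lambda, A] = \sqrt{-1} \int_{\mathbb{R}^{0|1}} d\theta\, (\Lambda, C^\dagger \mathcal{D}X)$$

is invariant under [25], [26], and [31] and, under the
Wess-Zumino gauge, it is

$$S_{CMR}[x, \psi, \phi, \eta, \lambda] = -\sqrt{-1}\,(\eta, C^\dagger \psi) - \sqrt{-1}\,(\lambda, dC^\dagger(\psi, \psi)) + (\lambda, C^\dagger C \phi) \qquad [32]$$

If G acts on M freely, then

$$\Upsilon(X, A) = \int [d\Lambda]\, e^{-S_{CMR}[X, \Lambda, A]} \qquad [33]$$

satisfies [30]. The factor $\Upsilon(X, A)$ in [30] is called
"projection" in Cordes et al. (1996).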

Let $E \to M$ be a G-equivariant vector bundle with
a fixed G-invariant connection $\nabla$, moment $\mu$, and
an invariant section s. Consider the superspace
action

$$S_{AJ}[X, \Xi, \Lambda, A] = S_{MQ}[X, \Xi, A] + S_{CMR}[X, \Lambda, A]$$

In the Wess-Zumino gauge and after the Gaussian
integral over f, it becomes the Atiyah-Jeffrey action

$$S_{AJ}[x, \psi, \phi, \chi, \eta, \lambda] = S_{MQ}[x, \psi, \phi, \chi] + S_{CMR}[x, \psi, \phi, \eta, \lambda] \qquad [34]$$

If s intersects the zero section transversely and G acts
on $s^{-1}(0)$ freely, then $s^{-1}(0)/G$ is smooth and

$$\int_{s^{-1}(0)/G} \bar\omega = \int [dx][d\psi][d\phi][d\chi][d\eta][d\lambda]\; O_\omega(x, \psi, \phi)\, e^{-S_{AJ}[x, \psi, \phi, \chi, \eta, \lambda]} \qquad [35]$$

for any closed equivariant form $\omega$ on M. Equation
[35] is the formula of Atiyah and Jeffrey (1990) and
of Witten (1988a) in an infinite-dimensional setting.
When $s^{-1}(0)/G$ is not smooth, the right-hand side of
[35] can be regarded as a definition of the left-hand
side.

It is often convenient to add to $S_{AJ}$ another term


AS[X, A, A] = -;/ (LAF, A], DA) 
RO! 


= VL lo, mn) +AA A) [4 


Since [36] is $\delta$-exact and no new field is added, the
integral [35] does not change if $\Delta S$ is added to $S_{AJ}$.


Applications to Cohomological 
Field Theories 


We now apply the Mathai-Quillen construction 
formally to a number of cases in which both the 
rank of the vector bundle and the dimension of the 
base space are infinite. Thus, the (bosonic and 
fermionic) integrals in [24] or [35] become path 
integrals in quantum mechanics or quantum field 
theory. 


Supersymmetric Quantum Mechanics 


Let (M, g) be a Riemannian manifold and $LM = \mathrm{Map}(S^1, M)$
the loop space. At each point $u \in LM$,
which is a map $u\colon S^1 \to M$, the tangent space is
$T_uLM = \Gamma(u^*TM)$. In particular, $\dot u = du/dt$, where t
is a parameter on $S^1$, is a tangent vector at u and
$u \mapsto \dot u$ is a vector field on LM. For any Morse
function h on M, $s(u) = \dot u + (\mathrm{grad}\, h) \circ u$ is another
vector field on LM.

Vector fields on LM can be identified as sections of
the bundle $\mathrm{ev}^*TM \to S^1 \times LM$, where $\mathrm{ev}\colon S^1 \times LM \to M$
is the evaluation map. The Levi-Civita
connection $\nabla$ on TM pulls back to a connection on
$\mathrm{ev}^*TM$ and the covariant derivatives along LM define
a natural connection $\nabla^{LM}$ on T(LM). For example,
for any tangent vector $V \in T_uLM = \Gamma(u^*TM)$, we
have $\nabla^{LM}_V s(u) = \nabla_{\dot u} V + (\nabla_V\, \mathrm{grad}\, h) \circ u$, where $\nabla_{\dot u}$ is
taken with respect to the pullback connection on $u^*TM$. The Riemann
curvature tensor R on M determines that on LM.

The (infinite-dimensional) analog of [22] is

$$\int [du][d\psi][d\chi]\; \exp\!\left(-\int_{S^1} dt\, L[u, \psi, \chi]\right) \qquad [37]$$

where $\psi, \chi \in T_uLM = \Gamma(u^*TM)$ are fermionic and

$$L[u, \psi, \chi] = \tfrac12\, g(\dot u + \mathrm{grad}\, h,\; \dot u + \mathrm{grad}\, h) - \sqrt{-1}\, g(\chi,\; \nabla_{\dot u}\psi + \nabla_\psi\, \mathrm{grad}\, h) - \tfrac14\, g(\chi, R(\psi, \psi)\chi) \qquad [38]$$

Here and below, factors of $\sqrt{-1}$ and $2\pi$ in [22] are
absorbed in the path-integral measure. Up to
a total derivative, [38] is the Lagrangian of the Euclidean
N = 2 supersymmetric quantum mechanics on M.
The partition function [37] is equal to the Euler
characteristic of LM or M, which can
be confirmed by an (exact) stationary-phase
calculation.


Topological Sigma Model 


Let $\Sigma$ be a Riemann surface with complex structure
$\epsilon$ and let $(M, \omega)$ be a symplectic manifold with a
compatible almost-complex structure J. Let $\mathcal{E}$ be a
vector bundle over $\mathrm{Map}(\Sigma, M)$ so that the fiber over
u is $\mathcal{E}_u = \Gamma(u^*TM \otimes T^*\Sigma)$. For any $u \in \mathrm{Map}(\Sigma, M)$,
$du \in \mathcal{E}_u$ and $u \mapsto du$ is a section of $\mathcal{E}$. The pullback
of the Levi-Civita connection on TM, tensored with
a connection on $T^*\Sigma$, defines a connection on $\mathcal{E}$.

The vector bundle to which we apply the Mathai-Quillen
formalism is the antiholomorphic part $\mathcal{E}^{01}$ of $\mathcal{E}$.
The fiber over $u \in \mathrm{Map}(\Sigma, M)$ is $\mathcal{E}^{01}_u = \Gamma((u^*TM \otimes T^*\Sigma)^{01})$.
The sub-bundle $\mathcal{E}^{01}$ has a connection $\nabla^{01}$ via
projection from $\mathcal{E}$. $\mathcal{E}^{01}$ has a natural section
$s\colon u \mapsto \bar\partial_J u = \tfrac12 (du + J \circ du \circ \epsilon)$. Solutions to the
equation $\bar\partial_J u = 0$ are pseudoholomorphic (or
J-holomorphic) curves; let $\mathcal{M} = s^{-1}(0)$ be the space of
such curves. Its (virtual) dimension is

$$\dim \mathcal{M} = \tfrac12\, \chi(\Sigma) \dim M + 2c_1(u^*TM) \qquad [39]$$
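
The dimension count [39] is easy to evaluate numerically. The following is a minimal sketch, assuming [39] reads $\dim\mathcal{M} = \tfrac12\chi(\Sigma)\dim M + 2c_1(u^*TM)$; the function name and the example (degree-one maps from the sphere to the complex projective plane, for which $c_1(u^*TM) = 3$) are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def jhol_virtual_dim(euler_char_sigma, dim_target, c1_number):
    """Virtual dimension (1/2) * chi(Sigma) * dim M + 2 * c_1(u^*TM)."""
    return Fraction(euler_char_sigma * dim_target, 2) + 2 * c1_number


# Sigma = S^2 (chi = 2), M = CP^2 (real dimension 4), degree-one maps (c_1 = 3).
dim_lines = jhol_virtual_dim(2, 4, 3)
print(dim_lines)  # 10
```

The value 10 is the real dimension of the space of parametrized degree-one holomorphic maps; no quotient by reparametrizations of the sphere is taken here.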
Along any $V \in T_u\mathrm{Map}(\Sigma, M) = \Gamma(u^*TM)$, the covariant
derivative of $s = \bar\partial_J$ is calculated in Wu (1995):

$$\nabla^{01}_V(\bar\partial_J u) = \tfrac12 (\nabla^u V + J \circ \nabla^u V \circ \epsilon) + \tfrac14\, \nabla_V J \circ (du \circ \epsilon + J \circ du) \qquad [40]$$

where $\nabla^u$ is the pullback connection on $u^*TM$.

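To write the Mathai-Quillen formalism for the
bundle $\mathcal{E}^{01} \to \mathrm{Map}(\Sigma, M)$, we let $\psi \in \Gamma(u^*TM)$ and
$\chi \in \Gamma((u^*TM \otimes T^*\Sigma)^{01})$ be fermionic fields. Equation
[23] becomes the Lagrangian

$$L[u, \psi, \chi] = \tfrac14 |du|^2 + \tfrac14 (du, J \circ du \circ \epsilon) - \sqrt{-1}\,\big(\chi,\; \nabla^u\psi + \tfrac12 (\nabla_\psi J) \circ du \circ \epsilon\big) - \tfrac14 \big(\chi, (R(\psi, \psi) - \tfrac14 (\nabla_\psi J)^2)\chi\big) \qquad [41]$$

It is precisely the Lagrangian of the topological
sigma model of Witten (1988b). Here, the pairing
$(\cdot,\cdot)$ is induced by the Riemannian metric $\omega(\cdot, J\cdot)$
on M and a metric on $\Sigma$ that is compatible with $\epsilon$.
The second term in [41], integrated over $\Sigma$, is equal
to $\int_\Sigma u^*\omega = \langle [\omega], u_*[\Sigma] \rangle$.

For any differential form $\alpha \in \Omega^p(M)$, let $O_\alpha(u, \psi)$
be the observable obtained from $\mathrm{ev}^*\alpha \in \Omega^p(\Sigma \times \mathrm{Map}(\Sigma, M))$
by identifying $\Omega^\bullet(\mathrm{Map}(\Sigma, M))$ with
$C^\infty(\mathrm{Map}(\mathbb{R}^{0|1}, \mathrm{Map}(\Sigma, M)))$. If $\alpha$ is closed and
$\gamma \in H_q(\Sigma)$ is a homology cycle, then $W_{\alpha,\gamma}(u, \psi) = \int_\gamma O_\alpha(u, \psi)$
is identified with a closed (p − q)-form
on $\mathrm{Map}(\Sigma, M)$. For closed $\alpha_i \in \Omega^{p_i}(M)$ and
$\gamma_i \in H_{q_i}(\Sigma)$ $(1 \le i \le r)$, the expectation values

$$\Big\langle \prod_{i=1}^r W_{\alpha_i,\gamma_i} \Big\rangle = \int [du][d\psi][d\chi]\; \prod_{i=1}^r W_{\alpha_i,\gamma_i}(u, \psi)\; e^{-S[u, \psi, \chi]} \qquad [42]$$

where S is the integral over $\Sigma$ of the Lagrangian [41],
are the Gromov-Witten invariants of $(M, \omega)$. Moreover,
[42] is nonzero only if $\sum_{i=1}^r (p_i - q_i) = \dim \mathcal{M}$.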


Topological Gauge Theory 


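Let M be a compact, oriented 4-manifold, G a
compact, semisimple Lie group, and $P \to M$ a
principal G-bundle. Denote by $\mathcal{A}$ the space of
connections on P and by $\mathcal{G}$ the group of gauge
transformations. The Lie algebra of $\mathcal{G}$ is $\mathrm{Lie}(\mathcal{G}) = \Gamma(\mathrm{ad}\,P) = \Omega^0(M, \mathrm{ad}\,P)$.
At $A \in \mathcal{A}$, the tangent space is
$T_A\mathcal{A} = \Omega^1(M, \mathrm{ad}\,P)$. Both spaces have inner products
if we choose an invariant inner product $(\cdot,\cdot)$ on the Lie
algebra $\mathfrak{g}$ of G and a Riemannian metric g on M. The
infinitesimal action of $\mathcal{G}$ on $\mathcal{A}$ is
$C = \nabla_A\colon \mathrm{Lie}(\mathcal{G}) \to T_A\mathcal{A}$.

With a Riemannian metric, any 2-form on M
decomposes into self-dual and anti-self-dual parts:
$\Omega^2(M) = \Omega^2_+(M) \oplus \Omega^2_-(M)$. We consider a trivial
vector bundle $\mathcal{E} \to \mathcal{A}$ whose fiber is $\Omega^2_+(M, \mathrm{ad}\,P)$.
$\mathcal{G}$ acts on $\mathcal{E}$ and the bundle is $\mathcal{G}$-equivariant. The
trivial connection on $\mathcal{E}$ is $\mathcal{G}$-invariant; the moment is
given by $\phi \in \Gamma(\mathrm{ad}\,P)\colon \chi \in \Omega^2_+(M, \mathrm{ad}\,P) \mapsto [\phi, \chi]$. The
bundle $\mathcal{E}$ has a natural section $s\colon A \in \mathcal{A} \mapsto F_A^+$, the
self-dual part of the curvature. Its derivative along
$V \in \Omega^1(M, \mathrm{ad}\,P) = T_A\mathcal{A}$ is $L_V s = (\nabla_A V)^+$. The section
s is $\mathcal{G}$-invariant, the zero set $s^{-1}(0)$ is the space
of anti-self-dual connections, and the quotient
$\mathcal{M} = s^{-1}(0)/\mathcal{G}$ is the instanton moduli space. Its
(virtual) dimension is

$$\dim \mathcal{M} = 4h(\mathfrak{g})\,k(P) - \tfrac12 \dim G\, (\chi(M) + \sigma(M))$$

where $h(\mathfrak{g})$ is the dual Coxeter number of $\mathfrak{g}$ and

$$k(P) = -\frac{1}{2h(\mathfrak{g})} \langle p_1(\mathrm{Ad}\,P), [M] \rangle \in \mathbb{Z}$$

is the instanton number of P.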
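
As an illustration of the dimension formula, here is a minimal sketch, assuming it reads $\dim\mathcal{M} = 4h(\mathfrak{g})k(P) - \tfrac12\dim G\,(\chi(M)+\sigma(M))$. The function name and the sample data (G = SU(2) with dual Coxeter number 2 and dimension 3, and M the 4-sphere with Euler characteristic 2 and signature 0) are choices made for this example.

```python
from fractions import Fraction


def instanton_moduli_dim(dual_coxeter, dim_group, k, euler_char, signature):
    """Virtual dimension 4*h*k - (1/2) * dim G * (chi + sigma)."""
    return 4 * dual_coxeter * k - Fraction(dim_group * (euler_char + signature), 2)


# G = SU(2): h = 2, dim G = 3.  M = S^4: chi = 2, sigma = 0.
dim_k1 = instanton_moduli_dim(2, 3, 1, 2, 0)
print(dim_k1)  # 5
```

For SU(2) on the 4-sphere the function reproduces the familiar count 8k − 3.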

We proceed with the Mathai-Quillen interpretation
of Atiyah and Jeffrey (1990). Let $\psi \in \Omega^1(M, \mathrm{ad}\,P)$,
$\chi \in \Omega^2_+(M, \mathrm{ad}\,P)$, $\eta \in \Gamma(\mathrm{ad}\,P)$ be fermionic fields and
$\phi, \lambda \in \Gamma(\mathrm{ad}\,P)$ be bosonic fields. The combination of
[34] and [36] is given by the Lagrangian


LIA, v, $, x, 1. 4] 
= 3 ERI? + (6, V4 VAA) 
- V—1(n, Vay) — v—1(x. VAV) 
— v —1(A, [v v/]) 


ES 


=A 1 " 
(6. bc x] + [m nl) - 7 Ile Al [43] 


Here, $(\cdot,\cdot)$ is the pairing induced by a Riemannian
metric on M and an invariant inner product on $\mathfrak{g}$.
With an additional topological term proportional to
$(F_A, \wedge F_A)$, [43] is the Lagrangian of topological
gauge theory of Witten (1988a).

There is a tautological connection on the
G-bundle $\mathcal{A} \times P \to \mathcal{A} \times M$. It is invariant under the
$\mathcal{G}$-action. Identifying $\Omega^\bullet(\mathcal{A})$ with $C^\infty(\mathrm{Map}(\mathbb{R}^{0|1}, \mathcal{A}))$
and using the Cartan model, the $\mathcal{G}$-equivariant
curvature is $\mathcal{F} = F_A + \sqrt{-1}\,\psi + \phi$. For any homology
cycle $\gamma \in H_q(M)$,


W. (A, Y, 0) = 


1 
abt) Je AFJ |44] 


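corresponds to a closed $\mathcal{G}$-equivariant form on $\mathcal{A}$.
For $\gamma_i \in H_{q_i}(M)$ $(1 \le i \le r)$, the expectation values

$$\Big\langle \prod_{i=1}^r W_{\gamma_i} \Big\rangle = \frac{1}{\mathrm{vol}(\mathcal{G})} \int [dA][d\psi][d\phi][d\chi][d\eta][d\lambda]\; \prod_{i=1}^r W_{\gamma_i}(A, \psi, \phi)\; e^{-S[A, \psi, \phi, \chi, \eta, \lambda]} \qquad [45]$$

where S is the integral over M of the Lagrangian [43],
are, up to a factor of $|Z(G)|$, Donaldson invariants
of M. Moreover, [45] is nonzero only if
$\sum_{i=1}^r (4 - q_i) = \dim \mathcal{M}$.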

Other cohomological field theories can also be 
understood or constructed by the Mathai-Quillen 
formalism. Of such we mention only the topological 
field theories of abelian and nonabelian monopoles 
in Labastida and Marino (1995), which are related 
to the Seiberg-Witten invariants. 


See also: Characteristic Classes; Donaldson-Witten 
Theory; Equivariant Cohomology and the Cartan Model; 
K-Theory; Topological Quantum Field Theory: Overview; 
Topological Sigma Models. 
Fundamental Concepts of the
Topological Theory of Knots and Links 


The first known discovery relating to knots as 
mathematical objects was made by Gauss around 
1833 in a note that refers to the knotting together of 
closed curves. This investigation originated in his work 
on electromagnetic theory that led him to compute 
inductance in a system of two linked circular wires. In 
this note he had given an analytic formula for the 
linking number of a pair of knotted curves. This 
number is a combinatorial topological invariant (it is 
an integer number). Moreover, one can now show that 
this number is invariant under Reidemeister moves 
(discussed in a later section). The linking coefficient 
can be generalized to the case of p- and q-dimensional
manifolds in $\mathbb{R}^{p+q+1}$. For parametrized
curves $\gamma_1(t)$ and $\gamma_2(t)$ with radius vectors $r_1(t)$, $r_2(t)$, it is
given by the following formula:


$$\mathrm{lk}(\gamma_1, \gamma_2) = \frac{1}{4\pi} \oint_{\gamma_1} \oint_{\gamma_2} \frac{(r_1 - r_2,\, dr_1,\, dr_2)}{|r_1 - r_2|^3} \qquad [1]$$
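
The Gauss integral can be checked numerically. The sketch below assumes formula [1] in the form $\frac{1}{4\pi}\oint\oint (r_1 - r_2, dr_1, dr_2)/|r_1 - r_2|^3$, with $(\cdot,\cdot,\cdot)$ read as the scalar triple product, and approximates it by a Riemann sum. The function names, the sample curves, and the step sizes are illustrative assumptions; the overall sign depends on the chosen orientations, so only the absolute value is compared.

```python
import math


def gauss_linking_number(curve1, curve2, n=200):
    """Approximate the Gauss double integral by a Riemann sum.

    curve1, curve2: functions mapping a parameter in [0, 2*pi) to a point
    (x, y, z) of a closed curve; the two curves must be disjoint.
    """
    h = 2 * math.pi / n   # parameter step of the Riemann sum
    eps = 1e-6            # step for central-difference tangent vectors

    def samples(curve):
        out = []
        for i in range(n):
            t = i * h
            a, b = curve(t + eps), curve(t - eps)
            tangent = tuple((a[k] - b[k]) / (2 * eps) for k in range(3))
            out.append((curve(t), tangent))
        return out

    s1, s2 = samples(curve1), samples(curve2)
    total = 0.0
    for r1, d1 in s1:
        for r2, d2 in s2:
            dx, dy, dz = r1[0] - r2[0], r1[1] - r2[1], r1[2] - r2[2]
            # Cross product d1 x d2, then the triple product (r1 - r2) . (d1 x d2).
            cx = d1[1] * d2[2] - d1[2] * d2[1]
            cy = d1[2] * d2[0] - d1[0] * d2[2]
            cz = d1[0] * d2[1] - d1[1] * d2[0]
            dist3 = (dx * dx + dy * dy + dz * dz) ** 1.5
            total += (dx * cx + dy * cy + dz * cz) / dist3
    return total * h * h / (4 * math.pi)


def circle_xy(t):
    """Unit circle in the xy-plane centered at the origin."""
    return (math.cos(t), math.sin(t), 0.0)


def circle_xz(t, center_x=1.0):
    """Unit circle in the xz-plane centered at (center_x, 0, 0)."""
    return (center_x + math.cos(t), 0.0, math.sin(t))


# Two circles forming a Hopf link, and the same pair moved apart.
lk_hopf = gauss_linking_number(circle_xy, circle_xz)
lk_far = gauss_linking_number(circle_xy, lambda t: circle_xz(t, center_x=5.0))
```

For the linked pair the sum is close to ±1, and for the separated pair it is close to 0, in line with the claim that the integral is an integer-valued invariant.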
The linking coefficient allows us to distinguish some
two-component links. Another approach to the linking
coefficient is that involving Seifert surfaces. (On this
subject, see the section “Isotopies, Reidemeister
moves, torus knots, and the linking number.”)

A systematic study of knots in $\mathbb{R}^3$, however, was
only begun in the second half of the nineteenth 
century by Tait and his followers. They were 
motivated by Kelvin’s theory of atoms modeled on 
knotted vortex tubes of ether. It was expected that 
physical and chemical properties of various atoms 
could be expressed in terms of properties of knots 
such as the knot invariants. Even though Kelvin’s 
theory did not work, the theory of knots grew as a 
subfield of combinatorial and algebraic topology. 
Recently, new invariants of knots have been 
discovered and they have led to the solution of 
long-standing problems in knot theory. Surprising 
connections between the theory of knots and 
statistical mechanics, quantum groups, and quantum 
field theory are emerging. Moreover, knot theory 
has been shown to be intimately connected with 
many problems in physics, chemistry, and biology. 

Tait classified the knots in terms of the crossing 
number of a regular projection. A regular projection 
of a knot on a plane is an orthogonal projection of 
the knot such that, at any crossing in the projection,
exactly two strands intersect transversely. He made
a number of observations about some general
properties of knots which have come to be known
as the “Tait conjectures.” In its simplest form, the
classification problem for knots can be stated as
follows. Given a projection of a knot, is it possible
to decide in finitely many steps whether it is equivalent to
the unknot? This question was answered affirmatively
by W Haken in 1961. (For details, see Burde
and Zieschang (1985).)


General Notions and Definitions 


Let M be a closed orientable 3-manifold. A smooth 
embedding of $S^1$ in M is called a knot in M. A link
in M is a finite collection of disjoint knots. The 
number of disjoint knots in a link is called the 
number of components of the link. Thus, a knot can 
be considered as a link with one component. Two 
links L, L' in M are said to be equivalent if there 
exists a smooth orientation-preserving automorphism
$f\colon M \to M$ such that f(L) = L’. For links with
two or more components, we require f to preserve a 
fixed given ordering of the components. Such a 
function f is called an ambient isotopy and L and L’ 
are called ambient isotopic. Here, we shall take M to
be $S^3 \cong \mathbb{R}^3 \cup \{\infty\}$ and simply write “a link” instead
of “a link in $S^3$.” The diagrams of links are drawn as
links in $\mathbb{R}^3$. A link diagram of L is a plane projection
with crossings marked as over or under. The 
simplest combinatorial invariant of a knot K is the 
crossing number c(K). It is defined as the minimum 
number of crossings in any projection of the knot K. 
The classification of knots up to crossing number 17 
is now known. The crossing numbers of some 
special families of knots are known; however, the 
question of finding the crossing number of an 
arbitrary knot is still unanswered. Another combi- 
natorial invariant of a knot K that is easy to define is 
the unknotting number u(K). It is defined as the
minimum number of crossing changes in any 
projection of the knot K which makes it into a 
projection of the unknot. Upper and lower bounds 
for u(K) are known for any knot K. An explicit 
formula for u(K) for a family of knots called torus 
knots, conjectured by Milnor nearly 40 years ago, 
has been proved recently by a number of different 
methods. The 3-manifold $S^3 \setminus K$ is called the knot
complement of K. The fundamental group $\pi_1(S^3 \setminus K)$
of the knot complement is an invariant of the knot
K. It is called the fundamental group of the knot and
is denoted by $\pi_1(K)$. Equivalent knots have homeomorphic
complements and conversely. However,
this result does not extend to links. (For details
and a proof, see Manturov (2004), chapter 4.)


The Fundamental Group of Knots and 
Its Role in Topology 


For a better understanding of the above consider- 
ations, we need to introduce briefly the important 
concept of fundamental group in topology. The 
fundamental group plays an essential role in 
topology; it is involved in the entire technical 
apparatus of the subject, and likewise in all 
applications of topological methods. In fact, for 
low-dimensional manifolds (i.e., of dimension 2 or 
3) the fundamental group underlies all nontrivial 
topological facts. 

Classical knot theory is concerned with the space
$S^3 \setminus K = M$, an open 3-manifold. There is a natural
embedding of the torus $T^2$ in M, namely as the
boundary of a small tubular neighborhood of the knot
K. Similarly, for a link we obtain a disjoint union of
2-tori in M. The principal topological invariant of a
knot K is the fundamental group $\pi_1(M)$ of the
complement M of K, with distinguished subgroup
the natural image of $\pi_1(T^2)$, $T^2 \subset M$, with the
obvious standard basis. The classical theorem of
Papakyriakopoulos of the 1950s asserts that a knot
is equivalent to the trivial one if and only if $\pi_1(M)$ is
abelian. It was shown by Haken in the early 1960s
that there is an algorithm for deciding whether or
not any knot is equivalent to the trivial knot.
However, while it appears to have been established 
(by Waldhausen and others in the 1960s and 1970s) 
that two knots are topologically equivalent if and 
only if the corresponding fundamental groups with 
labeled abelian subgroups are isomorphic, the 
existence of an appropriate algorithm for deciding 
such equivalence remains an open question. The 
complexity of the knot group $\pi_1(M)$ has led to the
search for more effectively computable invariants to 
distinguish knots and links. (On this subject, see the 
section “Polynomial invariants of knots and links.") 

Starting with the oriented diagram of the knot or
link K on the plane, one calculates in the standard
manner (see Crowell and Fox (1963) and Neuwirth
(1965)) a presentation of the group $\pi_1(M)$ of the
knot ($M = S^3 \setminus K$), obtaining one generator for each
edge of the diagram and a pair of
relations for each crossing. Since one relation of
each such pair simply equates the pair of generators
corresponding to the edges forming the upper
branch of the crossing, the presentation reduces
immediately to the standard one involving the same
number of generators and relations. The 2-complex
L with exactly one 0-cell, and with 1-cells labeled by
generators and 2-cells labeled by the relations, is
then a deformation retract of M. Lifting to the
universal cover we obtain a boundary operator on a
complex of free $\mathbb{Z}[\pi]$-modules, which takes the form
of a square matrix with entries from this group ring.
This matrix is obtained by a formal differentiation
(Fox's free differential calculus) as follows. Denoting the generators by
$a_i$ and relators by $r_j$, one defines the operator $\partial_{a_i}$ by

$$\partial_{a_i}(a_j) = \delta_{ij}, \qquad \partial_{a_i}(bc) = \partial_{a_i}(b) + b\,\partial_{a_i}(c);$$

the matrix in question then has entries $a_{ij}$ given by

$$a_{ij} = \partial_{a_i}(r_j)$$

Mapping each generator $a_i$ to t, we obtain a
complex of modules over the ring of integer Laurent
polynomials, with boundary operator the corresponding
square matrix now with Laurent polynomials
as entries. The determinant of this matrix
turns out to be zero, and the highest common factor
of its cofactors, after multiplication by a suitable
power of t, turns out to be just the Alexander
polynomial $\Delta(t)$.
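
The differentiation rules above can be turned into a few lines of code. The sketch below is an illustration under stated assumptions, not the procedure of the text: instead of the presentation read off a diagram it uses the well-known two-generator presentation ⟨a, b | aba = bab⟩ of the trefoil group, so the matrix is 1 × 2 rather than square, and each entry already equals the Alexander polynomial up to a unit $\pm t^k$. The rule $\partial_a(a^{-1}) = -a^{-1}$ used for inverse letters follows from the product rule applied to $a\,a^{-1} = 1$. Laurent polynomials are stored as dictionaries `{power: coefficient}`; all names are hypothetical.

```python
def fox_derivative(word, gen):
    """Fox derivative of `word` with respect to `gen`, with every generator sent to t.

    word: list of (generator, exponent) pairs with exponent +1 or -1.
    Returns a Laurent polynomial in t as {power: integer coefficient}.
    """
    poly = {}
    degree = 0  # exponent sum of the prefix read so far (its image is t**degree)
    for g, e in word:
        if g == gen:
            if e == 1:
                power, coeff = degree, 1        # prefix * d(a)/da = prefix
            else:
                power, coeff = degree - 1, -1   # prefix * d(a^-1)/da = -prefix * a^-1
            poly[power] = poly.get(power, 0) + coeff
        degree += e
    return {p: c for p, c in poly.items() if c != 0}


# Relator a b a b^-1 a^-1 b^-1 of the trefoil group <a, b | aba = bab>.
trefoil_relator = [("a", 1), ("b", 1), ("a", 1), ("b", -1), ("a", -1), ("b", -1)]

d_a = fox_derivative(trefoil_relator, "a")   # t^2 - t + 1
d_b = fox_derivative(trefoil_relator, "b")   # -(t^2 - t + 1)
```

Both entries agree up to sign with $t^2 - t + 1$, the Alexander polynomial of the trefoil in its usual normalization.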

Let us say a bit more on this question, using slightly different
notation. Let $\Delta_q(K)$ and $J_q(K)$ be
the Alexander polynomial and the Jones polynomial,
respectively. One of the earliest problems in knot
theory was: to what extent does the topological type
of the complementary space $X = S^3 \setminus K$ and/or the
isomorphism class of its fundamental group
$G(K) = \pi_1(X, x_0)$ suffice to classify knots? The trefoil
knot is the simplest example of a nontrivial knot, so it
seems remarkable that, not long after the discovery 
of the fundamental group of a topological space, 
Max Dehn (1914) succeeded in proving that the 
trefoil knot and its mirror image had isomorphic 
groups, but their knot types were distinct. Dehn's 
(ingenious) proof was the beginning of a long story, 
with many contributions which reduced repeatedly 
the number of distinct knot types that could have 
homeomorphic complements and/or isomorphic 
groups, until it was finally proved, quite recently, 
that (1) X determines K and (2) if K is prime, then G 
determines K up to unoriented equivalence. Thus, 
there are at most four distinct oriented prime knot 
types which have the same knot group. 

The knot group G is finitely presented; however, 
it is infinite, torsion-free, and (if K is not the unknot) 
nonabelian. Its isomorphism class is in general not 
easily understood via a direct attack on the problem. 
In such circumstances, the obvious thing to do is to 
pass to the abelianized group, but unfortunately 
$G/[G, G] \cong H_1(X; \mathbb{Z})$ is infinite cyclic for all knots,
so it is of no use in distinguishing knots. Passing to
the covering space $\tilde X$ that belongs to [G, G], we note
that there is a natural action of the cyclic group
G/[G, G] on $\tilde X$ via covering translations. The
action makes the homology group $H_1(\tilde X; \mathbb{Z})$ into a
$\mathbb{Z}[q, q^{-1}]$-module, where q is the generator of
G/[G, G]. This module turns out to be finitely
generated. It is the famous Alexander module. While
the ring $\mathbb{Z}[q, q^{-1}]$ is not a principal ideal domain
(PID), relevant aspects of the theory of modules over
a PID apply to $H_1(\tilde X; \mathbb{Z})$. In particular, it splits as a
direct sum of cyclic modules, the first nontrivial one
being $\mathbb{Z}[q, q^{-1}]/\Delta_q(K)$. Thus, $\Delta_q(K)$ is the generator
of the “order ideal,” and the smallest nontrivial
torsion coefficient in the module $H_1(\tilde X)$. In
particular, $\Delta_q(K)$ is very clearly an invariant of the
knot group.

We remark that when a knot is replaced by its
mirror image (i.e., the orientation on $S^3$ is reversed),
the Alexander and Jones polynomials $\Delta_q(K)$ and
$J_q(K)$ go over to $\Delta_{q^{-1}}(K)$ and $J_{q^{-1}}(K)$, respectively.
As noted earlier, $\Delta_q(K)$ is invariant under such a
change, but from the simplest example, the trefoil
knot, we see that $J_q(K)$ is not. Now recall that G
does not change under changes in the orientation of
$S^3$. This simple argument shows that $J_q(K)$ cannot be
a group invariant! Thus, it seems interesting indeed 
to ask about the underlying topology behind the 
Jones polynomial. 


Isotopies, Reidemeister Moves, Torus 
Knots, and the Linking Number 


Because each knot is a smooth embedding of $S^1$ in
$\mathbb{R}^3$, it can be arbitrarily closely approximated by an
embedding of a closed broken line in $\mathbb{R}^3$. Here we
mean a good approximation such that after a very 
small smoothing (in the neighborhood of all ver- 
tices) we obtain a knot from the same isotopy class. 
However, generally this might not be the case. 


Definition 1 An embedding of a disjoint union of 
n closed broken lines in $\mathbb{R}^3$ is called a polygonal
n-component link. A polygonal knot is a polygonal 
one-component link. 


Definition 2 A link is called tame if it is isotopic to 
a polygonal link and wild otherwise. 


All $C^1$-smooth knots are tame. In the sequel, all
knots are taken to be smooth, hence, tame. 


Definition 3 Two polygonal links are isotopic if
one of them can be transformed to the other by
means of an iterated sequence of elementary
isotopies and reverse transformations. The
elementary isotopy, generally, is assumed to be a
replacement of an edge with two edges provided
that the triangle spanned by these three edges has no
intersection points with other edges of the link.


It can be proved that the isotopy of smooth links
corresponds to that of polygonal links; the proof is
technically complicated. Like smooth links, polygonal
links admit planar diagrams with overcrossings
and undercrossings; given such a diagram, one
can recover the link up to isotopy.


Definition 4 By a planar isotopy of a smooth-link 
planar diagram we mean a diffeomorphism of the 
plane onto itself not changing the combinatorial 
structure of the diagram. 


Obviously, planar isotopy is an isotopy, that is, it
does not change the link isotopy type in $\mathbb{R}^3$.


Theorem 1 (Reidemeister) Two diagrams $D_1$ and
$D_2$ of smooth links generate isotopic links if and
only if $D_1$ can be transformed into $D_2$ by using a
finite sequence of planar isotopies and the three
Reidemeister moves $\Omega_1$, $\Omega_2$, $\Omega_3$.


Theorem 2 Suppose that D and D' are regular
diagrams of two knots (or links) K and K',
respectively. Then $K \approx K' \Leftrightarrow D \approx D'$.


We may conclude from the above theorems that 
the problem of equivalence of knots, in essence, is 
just a problem of the equivalence of regular 
diagrams. Therefore, a knot (or link) invariant may 
be thought of as a quantity that remains unchanged 
when we apply any one of the Reidemeister moves 
to a regular diagram. 

Knots and links embedded in $\mathbb{R}^3$ can be considered
as curves (families of curves) in 2-surfaces,
where the latter surfaces are standardly embedded in
$\mathbb{R}^3$. In this section we shall briefly show that all
knots and links can be obtained in this manner.

Consider a handle surface $S_g$ standardly embedded
in $\mathbb{R}^3$ and a curve (knot) K in it. We can now ask the
following question: which knot isotopy classes can
appear for a fixed g? First, let us note that for g = 0
there exists only one knot embeddable in $S^2$, namely
the unknot. The case g = 1 (torus, torus knots) gives
us some interesting information. Consider the torus
as a Cartesian product $S^1 \times S^1$ with coordinates
$\phi, \psi \in [0, 2\pi]$, where $2\pi$ is identified with 0. In two
dimensions, the torus can be illustrated as a square
with opposite sides identified. Let us embed this torus
standardly in $\mathbb{R}^3$; more precisely,

$$(\phi, \psi) \mapsto ((R + r\cos\psi)\cos\phi,\; (R + r\cos\psi)\sin\phi,\; r\sin\psi) \qquad [2]$$

Here R is the outer radius of the torus, r the small
radius (r < R), $\phi$ the longitude, and $\psi$ the meridian.
For the classification of torus knots we shall need
the classification of isotopy classes of nonintersecting
curves in $T^2$: obviously, two curves isotopic in
$T^2$ are isotopic in $\mathbb{R}^3$. Without loss of generality, we
can assume the considered closed curve to pass
through the point $(0, 0) \sim (2\pi, 2\pi)$. It can intersect
the edges of the square several times. In addition,
assume all these intersections to be transverse. Let us
calculate separately the algebraic number of intersections
with horizontal edges and those with
vertical edges. Here, passing through the right edge
or through the upper edge is said to be positive; that
through the left or the lower edge is negative. Thus,
for each curve of such type we obtain a pair of
integer numbers. So, each torus knot passes p times
the longitude of the torus, and q times its meridian,
where GCD(p, q) = 1. It is easy to see that for any
coprime p and q such a curve exists: one can just
take the geodesic line $\{q\phi - p\psi = 0 \pmod{2\pi}\}$. Let us
denote the torus knot by T(p, q). So, in order to
classify torus knots, one should consider pairs of
coprime numbers p, q and see which of them can be
isotopic in the ambient space $\mathbb{R}^3$. The simplest case
is when either p or q equals 1. The next simplest
example of a pair of coprime numbers is p = 3, q = 2
(or p = 2, q = 3). In each of these cases we obtain the
trefoil knot. Let us state the following important
result.


Theorem 3 For any coprime integers p and q, the
torus knots (p, q) and (q, p) are isotopic.


Proof For a proof of this theorem, see Rolfsen
(1990). Recall that $S^3$ is the union of two solid tori
glued along their common boundary torus. Note that the
(p, q) torus knot in one solid
torus is just the (q, p) torus knot in the other one.
Thus, mapping one solid torus to the other one, we
obtain an isotopy of (p, q) and (q, p) torus knots.
This homotopy of solid tori can be expressed as a
continuous process in $S^3$. Indeed, torus knots of type
(p, q) can be represented by a series of planar
diagrams. Moreover, it is possible to demonstrate a
way of coding a knot (link) as a (p-strand) braid
closure.
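
The construction of T(p, q) as the line $q\phi - p\psi = 0 \pmod{2\pi}$ on the standardly embedded torus can be sketched in code. The example assumes the embedding [2] in the form $(\phi, \psi) \mapsto ((R + r\cos\psi)\cos\phi, (R + r\cos\psi)\sin\phi, r\sin\psi)$; the function names and default radii are illustrative. The second function records the unknotting-number formula $(p-1)(q-1)/2$ for torus knots, which is the value conjectured by Milnor; it is quoted here as a known result and is not derived in the text.

```python
import math


def torus_knot_points(p, q, R=2.0, r=1.0, n=400):
    """Sample n points of the curve q*phi - p*psi = 0 (mod 2*pi) on the torus."""
    if math.gcd(p, q) != 1:
        raise ValueError("p and q must be coprime to obtain a knot")
    points = []
    for i in range(n):
        tau = 2 * math.pi * i / n
        phi, psi = p * tau, q * tau        # longitude and meridian angles
        rho = R + r * math.cos(psi)        # distance from the z-axis
        points.append((rho * math.cos(phi), rho * math.sin(phi), r * math.sin(psi)))
    return points


def torus_knot_unknotting_number(p, q):
    """Unknotting number (p - 1)(q - 1)/2 of T(p, q) for coprime p, q >= 1."""
    return (p - 1) * (q - 1) // 2


trefoil_points = torus_knot_points(2, 3)
```

With p = 2, q = 3 (or p = 3, q = 2) the sampled curve is a trefoil; taking p or q equal to 1 gives an unknotted curve, consistent with an unknotting number of 0.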

Analogously to the case of torus knots, one can
define torus links, which are links embedded into the
torus standardly embedded in $\mathbb{R}^3$. We know the
construction of torus knots. So, in order to draw a
torus link, one should take a torus knot $K \subset T$ (one
can assume that it is represented by a straight linear
curve defined by the equation $q\phi - p\psi = 0 \pmod{2\pi}$)
and add to the torus T some closed nonintersecting
simple curves; each curve should be nonintersecting
and should not intersect K. Thus, these curves
should be embedded in $T \setminus K$, that is, in the open
cylinder. Each curve on the cylinder is either
contractible or passes the longitude of the cylinder
once. So, each curve in $T \setminus K$ is either contractible
inside $T \setminus K$, or “parallel” to K inside T, that is,
isotopic to the curve given by the equation
$q\phi - p\psi = \varepsilon \pmod{2\pi}$ inside $T \setminus K$. Thus, the following
theorem holds.


Theorem 4 Each torus link is isotopic to the
disconnected sum of a trivial link and a link that is
represented by a set of parallel torus knots of the
same type (p, q).


As we already know, a link invariant is a function 
defined on links that is invariant under isotopies. We 
shall represent links by using their planar diagrams. 
According to the Reidemeister theorem, in order to 
prove the invariance of some function on links, it is 
sufficient to check this invariance under the three 
Reidemeister moves. First, let us consider the 
simplest integer-valued invariant of two-component 
links. Let L be a link consisting of two oriented 
components A and B and let L’ be the planar 
diagram of L. Consider those crossings of the 
diagram L’ where the component A goes over the 
component B. There are two possible types of such 
crossings with respect to the orientation. To each
positive crossing we assign the number (+1), and to
each negative crossing the number (−1).
Let us sum these numbers over all crossings
where the component A goes over the component B.
Thus, we obtain an integer and, in fact,
this number is invariant under Reidemeister moves.
The link invariant so obtained is called the linking
coefficient.
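
The counting rule just described is short enough to write out. In the sketch below a diagram is encoded, purely for illustration, as a list of crossings `(over_component, under_component, sign)`; this encoding and the two sample diagrams are assumptions of the example, not notation from the text.

```python
def linking_coefficient(crossings, over="A", under="B"):
    """Sum the signs of the crossings where `over` passes over `under`.

    crossings: list of (over_component, under_component, sign), sign = +1 or -1.
    """
    return sum(sign for o, u, sign in crossings if o == over and u == under)


# Standard Hopf link diagram: two crossings of the same sign,
# A over B at one of them and B over A at the other.
hopf_diagram = [("A", "B", +1), ("B", "A", +1)]

# Circle A laid on top of circle B: two crossings of opposite signs,
# A over B at both of them (the components are unlinked).
stacked_circles = [("A", "B", +1), ("A", "B", -1)]

lk_hopf_diagram = linking_coefficient(hopf_diagram)        # 1
lk_stacked = linking_coefficient(stacked_circles)          # 0
```

Reversing the orientation of one component flips every sign in the list, so the result changes sign but not absolute value.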


Polynomial Invariants of Knots and Links 


By changing a link diagram at one crossing we can
obtain three diagrams corresponding to links
$L_+$, $L_-$, and $L_0$ which are identical except for this
crossing. In the 1920s, Alexander gave an algorithm
for computing a polynomial invariant $\Delta(t)$
(a Laurent polynomial in t) of a knot K, called the
Alexander polynomial, by using its projection on a
plane. He also gave its topological interpretation as
an annihilator of a certain cohomology module
associated to the knot K. In the 1960s, Conway
defined his polynomial invariant and gave its
relation to the Alexander polynomial. This polynomial
is called the Alexander-Conway polynomial.
The Alexander-Conway polynomial of an oriented
link L is denoted by $\nabla_L(z)$ or simply by $\nabla(z)$ when L
is fixed. We denote the corresponding polynomials
of $L_+$, $L_-$, and $L_0$ by $\nabla_+$, $\nabla_-$, and $\nabla_0$, respectively.
The Alexander-Conway polynomial is uniquely
determined by the following axioms.


Axiom 1 Let L and L’ be two oriented links which
are ambient isotopic. Then

$$\nabla_{L'}(z) = \nabla_L(z) \qquad [3]$$


Axiom 2 Let $S^1$ be the standard unknotted circle
embedded in $S^3$. It is usually referred to as the
unknot and is denoted by O. Then

$$\nabla_O(z) = 1 \qquad [4]$$


Axiom 3 The polynomial satisfies the following
skein relation:

$$\nabla_+(z) - \nabla_-(z) = z\,\nabla_0(z) \qquad [5]$$


We note that the original Alexander polynomial
$\Delta_L$ is related to the Alexander-Conway polynomial
of an oriented link L by the relation

$$\Delta_L(t) = \nabla_L(t^{1/2} - t^{-1/2}) \qquad [6]$$
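
As a worked illustration of how the axioms determine the polynomial, consider the trefoil. The derivation below is a sketch that assumes the skein relation in the form $\nabla_+ - \nabla_- = z\nabla_0$ and the substitution $z = t^{1/2} - t^{-1/2}$, and uses the standard resolutions: changing one crossing of a positive trefoil diagram gives an unknot, smoothing it gives a positive Hopf link, and changing or smoothing one crossing of that Hopf link gives a two-component unlink or an unknot, respectively.

```latex
\begin{align*}
\nabla_{\text{unlink}}(z) &= 0
  && \text{(skein relation at a kink: } \nabla_O - \nabla_O = z\,\nabla_{\text{unlink}}\text{)},\\
\nabla_{\text{Hopf}}(z) &= \nabla_{\text{unlink}}(z) + z\,\nabla_O(z) = z,\\
\nabla_{\text{trefoil}}(z) &= \nabla_O(z) + z\,\nabla_{\text{Hopf}}(z) = 1 + z^2,\\
\Delta_{\text{trefoil}}(t) &= 1 + \bigl(t^{1/2} - t^{-1/2}\bigr)^2 = t - 1 + t^{-1}.
\end{align*}
```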


In the 1980s, Jones discovered his polynomial
invariant $V_L(t)$, called the Jones polynomial, while
studying von Neumann algebras and gave its
interpretation in terms of statistical mechanics. A
state model for the Jones polynomial was then
given by Kauffman (1987) using his bracket
polynomial. These new polynomial invariants have
led to the proofs of most of the Tait conjectures.
The Jones polynomial $V_K(t)$ of K is a Laurent
polynomial in t, which is uniquely determined by a
simple set of properties similar to the axioms for
the Alexander-Conway polynomials. More generally,
the Jones polynomial can be defined for any
oriented link L as a Laurent polynomial in $t^{1/2}$, so
that reversing the orientation of all components of
L leaves $V_L$ unchanged. In particular, $V_K$ does not
depend on the orientation of the knot K. For a
fixed link, we denote the Jones polynomial simply
by V. Recall that there are three standard ways to
change a link diagram at a crossing point. The
Jones polynomial is characterized by the following
properties:


1. Let L and L’ be two oriented links which are
ambient isotopic. Then

$$V_{L'}(t) = V_L(t) \qquad [7]$$

2. Let O denote the unknot. Then

$$V_O(t) = 1 \qquad [8]$$

3. The polynomial satisfies the following skein
relation:

$$t^{-1}V_+ - tV_- = (t^{1/2} - t^{-1/2})\,V_0 \qquad [9]$$
An important property of the Jones polynomial that
is not shared by the Alexander-Conway polynomial
is its ability to distinguish between a knot and its
mirror image. More precisely, we have the following
result. Let $K_m$ be the mirror image of the knot K.
Then

$$V_{K_m}(t) = V_K(t^{-1}) \qquad [10]$$

Since the Jones polynomial is not symmetric in t and
$t^{-1}$, it follows that in general

$$V_{K_m}(t) \ne V_K(t) \qquad [11]$$
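
The trefoil gives a concrete instance of [10] and [11]. The sketch below stores Laurent polynomials as dictionaries `{power: coefficient}` and uses the standard values $t - 1 + t^{-1}$ for the Alexander polynomial and $t + t^3 - t^4$ for the Jones polynomial of one of the two trefoils (which of the two mirror images carries this value depends on the sign conventions). The helper name `mirror` is hypothetical.

```python
def mirror(poly):
    """Substitute t -> 1/t in a Laurent polynomial given as {power: coefficient}."""
    return {-power: coeff for power, coeff in poly.items()}


alexander_trefoil = {-1: 1, 0: -1, 1: 1}   # t^-1 - 1 + t
jones_trefoil = {1: 1, 3: 1, 4: -1}        # t + t^3 - t^4

alexander_is_symmetric = mirror(alexander_trefoil) == alexander_trefoil   # True
jones_is_symmetric = mirror(jones_trefoil) == jones_trefoil               # False
jones_mirror_trefoil = mirror(jones_trefoil)                              # t^-1 + t^-3 - t^-4
```

The Alexander polynomial is unchanged by the substitution, while the Jones polynomial is not, so the latter detects that the trefoil is chiral.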


We note that a knot is called amphicheiral (achiral 
in biochemistry) if it is equivalent to its mirror 
image. We shall use the simpler biochemistry term. 
So, a knot that is not equivalent to its mirror image 
is called chiral. The condition expressed by [11] is 
sufficient but not necessary for chirality of a knot. 
The Jones polynomial did not resolve the following 
conjecture by Tait concerning chirality: if the cross- 
ing number of a knot is odd, then it is chiral. 
However, it has been demonstrated recently that a 
15-crossing knot provides a counterexample to the 
chirality conjecture. 


New Invariants and Their Applications 
in Mathematical Physics 


There was an interval of nearly 60 years between the 
discovery of the Alexander polynomial and the Jones 
polynomial. Since then a number of polynomials 
and other invariants of knots and links have been 
found. A particularly interesting one is the two- 
variable polynomial generalizing V, called the 
HOMFLY polynomial (a name formed from the 
initials of the authors of the article Freyd et al. (1985)) 
and denoted by P. The HOMFLY polynomial 
P(a,z) satisfies the following skein relation: 


$a^{-1}P_{L_+} - aP_{L_-} = zP_{L_0}$ [12] 


Both the Jones polynomial V and the Alexander- 
Conway polynomial $\nabla$ are special cases of the 
HOMFLY polynomial. The precise relations are 
given by the following theorem. 


Theorem 5 Let L be an oriented link. Then the 
polynomials $P_L$, $V_L$, and $\nabla_L$ satisfy the following 
relations: 


$V_L(t) = P_L(t, t^{1/2} - t^{-1/2})$ and $\nabla_L(z) = P_L(1, z)$ [13] 
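
As a consistency check on Theorem 5, one can substitute the two specializations into the HOMFLY skein relation. The sketch below assumes that [12] is normalized as $a^{-1}P_{L_+} - aP_{L_-} = zP_{L_0}$, which is the convention compatible with [13]; under that assumption the two substitutions reproduce the skein relations of the Jones and Alexander-Conway polynomials.

```latex
% Assumed convention: a^{-1} P_{L_+} - a P_{L_-} = z P_{L_0}.
\begin{align*}
(a, z) = (t,\; t^{1/2} - t^{-1/2}) &:\quad
  t^{-1} V_{L_+}(t) - t\, V_{L_-}(t) = (t^{1/2} - t^{-1/2})\, V_{L_0}(t), \\
(a, z) = (1,\; z) &:\quad
  \nabla_{L_+}(z) - \nabla_{L_-}(z) = z\, \nabla_{L_0}(z).
\end{align*}
```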


After defining his polynomial invariant, Jones also 
established the relation of some knot invariants with 
statistical mechanical models. Since then this has 
become a very active area of research. By 
constructing a typical statistical mechanics model 
(the star-triangle relations of the Yang-Baxter 
equations are an example of such a model) one 
obtains a state model for the Alexander or the Jones 
polynomial of a knot, by associating to the knot a 
statistical system, whose partition function 


$Z_K := \sum_{s} E_k(s)\,w(s)$ [14] 


gives the corresponding polynomial. (For details, see 
Jones (1989).) In the function above, $w: F(X, S) \to \mathbb{R}$ 
is a weight function and the sum is taken over all 
states $s \in F(X, S)$. The energy $E_k$ of the system (X, S) 
is a functional, 


$E_k: F(X, S) \to \mathbb{R}, \quad k \in K$ [15] 


where the subscript $k \in K$ indicates the dependence 
of energy on the set K of auxiliary parameters, such 
as temperature, pressure, etc. 

However, these statistical models did not provide 
a geometrical or topological interpretation of the 
polynomial invariant. Such an interpretation was 
provided by Witten (1989) by applying ideas from 
quantum field theory to the Chern-Simons Lagran- 
gian. In fact, Witten's model allows us to consider 
the knot and link invariants in any compact 
3-manifold M. 


Vassiliev Invariants and the Space 
of All Knots: New Generalizations 
of Knot Theory 


An entirely new collection of knot invariants, 
which arose out of techniques pioneered by Arnold 
in singularity theory, was introduced by V A 
Vassiliev in the 1990s. Knot invariants such as 
the Alexander polynomial associate a knot with 
some sort of mathematical quantity. A Vassiliev 
invariant, on the other hand, is an invariant that 
satisfies a set of conditions. In this sense, the 
invariants introduced above (the Jones polynomial, 
the HOMFLY and the Kauffman polynomial, 
the Conway polynomial, and the Alexander 
polynomial) can all be shown to be Vassiliev 
invariants. However, not all knot invariants are 
Vassiliev invariants; for instance, the signature of a 
knot is not a Vassiliev invariant. The new Vassiliev 
invariants have a solid basis in a very interesting 
new topology, where one studies not a single knot, 
but a space of all knots. Vassiliev's knot invariants 
are rational numbers. They lie in vector spaces $V_i$ of 
dimension $d_i$, $i = 1, 2, 3, \dots$, with invariants in $V_i$ 
having "order" $i$. These invariants are built from 
different families of crossing changes. 


Vassiliev's invariants require 
an important conceptual change: shifting 
our attention from the knot K, which is the 
image of $S^1$ under an embedding $\phi: S^1 \to S^3$, to the 
embedding $\phi$ itself. A knot type K thus becomes an 
equivalence class $\{\phi\}$ of embeddings of $S^1$ into $S^3$. 
The space of all such equivalence classes of embeddings 
is disconnected, with a component for each 
smooth knot type. One therefore passes from 
embeddings to smooth maps, thereby admitting 
maps which have various types of singularities. Let 
$\tilde{M}$ be the space of all smooth maps from $S^1$ to $S^3$. 
This space is connected and contains all knot types. 
Our space will remain connected and will contain all 
knot types if we place two mild restrictions on our 
maps. Let $M$ denote the collection of all $\phi \in \tilde{M}$ 
such that $\phi(S^1)$ passes through a fixed point and is 
tangent to a fixed direction at that point. The space $M$ has 
some interesting properties, the main one being that 
it can be approximated by certain affine spaces, and 
these affine spaces contain representatives of all 
knot types. The walls between distinct chambers in 
$M$ constitute the discriminant $\Sigma$, that is, $\Sigma = \{\phi \in M \mid \phi$ 
has a multiple point or a place where its 
derivative vanishes or other singularities$\}$. The space 
$M - \Sigma$ is our space of all knots. 

The additive properties of the Alexander and 
Jones polynomials have a very attractive interpretation 
in terms of Vassiliev invariants. By a result of 
Bar-Natan, all coefficients of the Alexander polynomial 
are Vassiliev invariants (see Bar-Natan 
(1995)). The same can be said of the Jones 
polynomial, by a theorem of Birman and 
Lin (1993). There is an attractive formula due to 
Kontsevich expressing all Vassiliev invariants 
analytically in terms of multiple integrals, assuming that 
the knot or link diagram comes with some generic 
Morse function (e.g., the projection of the planar 
diagram on the y-axis). Moreover, from the work of 
Kontsevich it follows that it is possible to give a 
purely combinatorial characterization of all Vassiliev 
invariants (other than the one mentioned above) 
by associating to an oriented knot K in $\mathbb{R}^3$ (given via 
coordinates $(z, t)$ with $z = z(t) = x(t) + iy(t)$) a chord diagram, 
which is just a circle with $2k$ distinct points labeled 
$P_j, Q_j$, $j = 1, 2, \dots, k$, marked on it, and by imposing 
certain relations on the free abelian group 
generated by all chord diagrams. 


Theorem 6 Let $V_K(t)$ be the Jones polynomial of a 
knot K. Let $V_K(q)$ be the infinite series obtained 
from $V_K(t)$ by substituting $e^q$ ($= 1 + q + q^2/2! + \cdots = \sum_{n \ge 0} q^n/n!$) 
for $t$. So we may write 


$V_K(q) = b_0 + b_1 q + b_2 q^2 + \cdots$ 

Then $J_m(K) = b_m$ is a Vassiliev invariant induced by 
the Jones polynomial of order (at most) m. 
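
The coefficients $b_m$ of Theorem 6 are easy to compute once the Jones polynomial is known. The following is a minimal sketch using exact rational arithmetic; it assumes the tabulated value $V(t) = t + t^3 - t^4$ for a trefoil, and the function name and data layout are illustrative only.

```python
from fractions import Fraction
from math import factorial

def exp_series_coefficients(jones, order):
    """Coefficients b_0..b_order of V(e^q), for V(t) = sum_k c_k t^k given as {k: c_k}.

    Since e^{kq} = sum_n k^n q^n / n!, one has b_n = sum_k c_k k^n / n!.
    """
    return [
        sum(Fraction(c * k**n, factorial(n)) for k, c in jones.items())
        for n in range(order + 1)
    ]

trefoil = {1: 1, 3: 1, 4: -1}   # t + t^3 - t^4, assumed tabulated value
b = exp_series_coefficients(trefoil, 3)
print([str(x) for x in b])      # ['1', '0', '-3', '-6']
```

Replacing every exponent $k$ by $-k$ (the mirror image) multiplies $b_n$ by $(-1)^n$, so in this example it is the odd-order coefficient $b_3$ that distinguishes the trefoil from its mirror image.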


The structure and significance of the HOMFLY and 
Kauffman polynomials can be interpreted in the 
language of Vassiliev invariants, which are invariants 
of finite type. The notion of finite type is of 
extraordinary significance in studying these invariants. 
One reason for this is the following basic lemma: 


Lemma 7 If a graph G (an embedded 4-valent 
graph) has exactly k nodes, then the value of a 
Vassiliev invariant $v_k$ of type k on G, $v_k(G)$, is 
independent of the embedding of G. 


Let us briefly show this important result. Suppose 
V is any invariant of oriented links taking values in 
some abelian group. This V can be extended to be 
an invariant of singular links in the following way 
(Kauffman 2001): a singular link is an immersion 
of simple closed curves in $S^3$ with finitely many 
transverse double points. These self-intersections are 
required to remain transverse in any isotopy 
demonstrating the equivalence of such singular 
links. If the definition of V has been extended over 
singular links with $n - 1$ double points, define it on 
a singular link $L_\times$ with $n$ singularities by 


$V(L_\times) = V(L_+) - V(L_-)$ 


where $L_\times$, $L_+$, and $L_-$ are identical except 
near a point where $L_\times$ has a node. Note that 
$L_+$ and $L_-$ each has $n - 1$ double points. 
Then V is called a Vassiliev invariant of order n, or 
an invariant of finite type n, if $V(L) = 0$ for every 
L with $n + 1$ or more singularities. Recall the 
Alexander-Conway polynomial invariant, $\nabla_L(z) \in \mathbb{Z}[z]$, 
of oriented links defined by $\nabla_{\mathrm{unknot}}(z) = 1$ and 


$\nabla_{L_+}(z) - \nabla_{L_-}(z) = z\nabla_{L_0}(z)$ 


Extend this over singular links by the above method. 
Then if $L_\times$ is a link with r singularities, $\nabla_{L_\times}(z) = 
z\nabla_{L_0}(z)$, where $L_0$ is a link with $r - 1$ singularities. 
Thus, by induction on r, if L has r singularities then 
$\nabla_L(z)$ has a factor of $z^r$. This implies at once that the 
coefficient of $z^n$ in the Conway polynomial of a link 
is a Vassiliev invariant of order n. Now suppose one 
considers the HOMFLY polynomial and makes the 
substitution $(l, m) = (it^{-N/2}, i(t^{-1/2} - t^{1/2}))$. The 
characterizing skein relation becomes 


$t^{-N/2}P(L_+) - t^{N/2}P(L_-) = (t^{1/2} - t^{-1/2})P(L_0)$ 


Note that this becomes the Jones polynomial when 
N = 2. Now make the further substitution t = exp x. 
Here exp x should be thought of as the classical 
power series expansion. Of course, exp(x/2) and 
exp(-x/2) have power series expansions; the power 
series can be multiplied and added to give another 
power series. Thus, P(L) has a power series 
expansion in powers of x. It follows immediately 
that $P(L_+) - P(L_-) = xS(x)$ for some power series 
S(x). Hence, the proof used for the Conway 
polynomial shows at once that the coefficient of $x^n$ 
in the power series expansion of P(L) is a Vassiliev 
invariant of order n. 

All present studies of Vassiliev invariants clearly 
indicate a major role for these invariants in the future 
development of knot theory and topological quantum 
field theories. Many questions in knot theory 
remain open; nevertheless, it will very likely 
be one of the most fruitful and beautiful subjects of 
research in mathematics and mathematical physics. 
Knot theory also attracts attention because 
it is revealing new, astounding, and profound links 
between geometry, algebra, and topology. 


See also: Finite-Type Invariants; The Jones Polynomial; 
Knot Invariants and Quantum Gravity; Knot Theory and 
Physics; Kontsevich Integral; String Topology: Homotopy 
and Geometric Perspectives; Topological Knot Theory 
and Macroscopic Physics; Topological Quantum Field 
Theory: Overview. 


Further Reading 


Alexander JW (1923) Topological invariants of knots and links. 
Transactions of the American Mathematical Society 20: 
257-306. 

Atiyah M (1990) The Geometry and Physics of Knots. Cambridge: 
Cambridge University Press. 


Matrix Product States see Finitely Correlated States 


Birman JS and Lin X-S (1993) Knot polynomials and Vassiliev’s 
invariants. Inventiones Mathematicae 111: 225-270. 

Burde G and Zieschang H (1985) Knots. Studies in Mathematics, 
vol. 5. Berlin: Walter de Gruyter. 

Crowell RH and Fox RH (1963) Introduction to Knot Theory. 
Toronto: Ginn & Company. 

De La Harpe P, Kervaire M, and Weber Cl (1986) On the Jones 
Polynomial. L'Enseignement Mathématique 32: 271-335. 

Dehn M (1914) Die beiden Kleeblattschlingen. Mathematische 
Annalen 75: 402-413. 

Freyd P, Yetter D, Hoste J, Lickorish WBR, Millett K, and 
Ocneanu A (1985) A new polynomial invariant of knots 
and links. Bulletin of the American 
Mathematical Society (NS) 12: 239-246. 

Jones VFR (1985) A polynomial invariant for knots via von 
Neumann algebras. Bulletin of the American Mathematical 
Society 12: 103-111. 

Kauffman LH (2001) Knots and Physics. Singapore: World 
Scientific. 

Kawauchi A (1996) A Survey of Knot Theory. Boston: Birkhäuser. 

Lickorish WBR and Millett K (1987) A polynomial invariant for 
knots and links. Topology 26: 107-141. 

Manturov V (2004) Knot Theory. Boca Raton, FL: Chapman and 
Hall/CRC. 

Murasugi K (1996) Knot Theory and Its Applications. Boston: 
Birkhauser. 

Neuwirth LP (1965) Knot Groups. Ann. Math. Studies, vol. 56. 
Princeton: Princeton University Press. 

Reidemeister K (1932) Knotentheorie. Berlin: Springer. 

Rolfsen D (1990) Knots and Links. Math. Lecture Series. 
Berkeley: Publish or Perish. 

Vassiliev VA (1990) Cohomology of knot spaces. In: Arnold VI 
(ed.) Theory of Singularities and Its Applications, Advances in 
Soviet Mathematics, vol. 1, pp. 23-70. Providence, RI: 
American Mathematical Society. 

Witten E (1989) Quantum field theory and the Jones polynomial. 
Communications in Mathematical Physics 121: 351-399. 


Mean Curvature Flow see Geometric Flows and the Penrose Inequality 




Mean Field Spin Glasses and Neural Networks 


A Bovier, Weierstrass Institute for Applied Analysis 
and Stochastics, Berlin, Germany 


© 2006 Elsevier Ltd. All rights reserved. 


Introduction and Models 


Rarely has a paper with a title as simple as "A solvable 
model of a spin glass" had such a tremendous impact 
on both physics and mathematics as the seminal 
paper of 1975 by Sherrington and Kirkpatrick, 
which introduced what is now known as the 
Sherrington-Kirkpatrick (SK) mean-field spin glass 
model. As solvable as it might have appeared to the 
authors, it was soon found that the heuristic 
solution, based on the so-called replica method, 
was physically unacceptable. The reason was a tacit 
assumption, now known as replica symmetry, that 
proved unfounded. Several years later, Giorgio 
Parisi provided an ingenious way out through his 
continuous replica symmetry-breaking scheme, that 
presented a solution that, through its complexity 
and intrinsic beauty, both stunned and fascinated 
the community. Unraveling the mysteries involved in 
this solution has presented a challenge and driving 
force for the last three decades of mathematical 
statistical mechanics, while the use of the method in 
theoretical physics opened the path to solving a wide 
variety of problems not only in the theory of 
disordered magnets, but also in neural networks 
and combinatorial optimization. In this article the 
focus is on the mathematical results obtained in the 
study of this and a number of related models. 


Mean-Field Models 


Mean-field models have played an important role in 
statistical mechanics by providing simple, solvable 
models in which some of the complex phenomena, 
such as phase transitions, could be studied and understood. 
For example, the Curie-Weiss model of a 
ferromagnet describes N spin variables $\sigma_i$ (taking values 
$\pm 1$) in interaction. The simplifying assumption compared 
to more realistic models, such as the Ising model, 
is to ignore the spatial structure of the model and allow 
all spins to interact with each other with equal strength. 
This leads to a Hamiltonian function of the form 


$H_N(\sigma) = -\frac{J}{2N}\sum_{i,j=1}^N \sigma_i\sigma_j + h\sum_{i=1}^N \sigma_i$ [1] 


where J is a coupling constant and h a magnetic 
field. This form of the interaction implies that the 
Hamiltonian is in fact just a function of the 
empirical magnetization $m_N(\sigma) \equiv N^{-1}\sum_{i=1}^N \sigma_i$, and 
this allows one to use methods from the theory of 
large deviations to analyze rather easily the corresponding 
Gibbs measures. 
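
The reduction to the magnetization can be made concrete on a small system. The following is a minimal sketch, assuming the normalization $-(J/2N)\sum_{i,j}\sigma_i\sigma_j + h\sum_i\sigma_i$ for the Curie-Weiss Hamiltonian and Gibbs weights $e^{-\beta H}$ (both are assumptions made for the illustration); it compares a brute-force sum over all $2^N$ configurations with a sum over the $N+1$ possible values of the magnetization.

```python
from itertools import product
from math import comb, exp, isclose

def energy_from_spins(sigma, J=1.0, h=0.0):
    """Curie-Weiss energy -(J / 2N) sum_{i,j} s_i s_j + h sum_i s_i (assumed normalization)."""
    n = len(sigma)
    pair_sum = sum(si * sj for si in sigma for sj in sigma)
    return -J / (2 * n) * pair_sum + h * sum(sigma)

def energy_from_magnetization(m, n, J=1.0, h=0.0):
    """The same energy written as a function of m = (1/N) sum_i s_i only."""
    return -J * n / 2 * m * m + h * n * m

def partition_brute_force(n, beta, J=1.0, h=0.0):
    """Sum of exp(-beta * H) over all 2**N spin configurations."""
    return sum(exp(-beta * energy_from_spins(s, J, h))
               for s in product((-1, 1), repeat=n))

def partition_by_magnetization(n, beta, J=1.0, h=0.0):
    """Group configurations by the number k of up-spins; there are C(N, k) of each."""
    return sum(
        comb(n, k) * exp(-beta * energy_from_magnetization((2 * k - n) / n, n, J, h))
        for k in range(n + 1)
    )

print(isclose(partition_brute_force(8, 0.7, 1.0, 0.3),
              partition_by_magnetization(8, 0.7, 1.0, 0.3), rel_tol=1e-9))
```

The second sum has only $N + 1$ terms, weighted by binomial coefficients, which is the starting point for a large-deviation analysis.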


The SK Model 


This model was a straightforward attempt to 
introduce a mean-field version of models with 
randomly interacting spins. The interest in such 
models arose from the discovery of certain alloys of 
ferromagnets and conductors (e.g., AuFe and 
CuMn) that had been found to exhibit very unusual 
magnetic properties. Ruderman and co-workers had 
proposed that in these models the magnetic ions 
with magnetic moments $S_i$ and $S_j$ located at the 
points $x_i$ and $x_j$ would interact via an exchange 
interaction of the form 


$\frac{\cos(k_F(x_i - x_j))}{|x_i - x_j|^3}\, S_i \cdot S_j$ [2] 


Since the positions of the magnetic ions in the alloy 
are random, the signs of their interaction would be 
oscillatory. Anderson proposed a simplified model, 
in the spirit of the Ising model, where spins taking 
values $\pm 1$ located on a regular lattice would interact 
via nearest-neighbor couplings $J_{ij}$ modeled as i.i.d. 
random variables uniformly distributed on an interval 
$[-J, J]$. In the spirit of the Curie-Weiss model, 
Sherrington and Kirkpatrick then proposed the 
mean-field model where any two spins would 
interact via i.i.d. Gaussian random variables $J_{ij}$ of 
mean zero and variance one. The SK Hamiltonian is 
thus given by 


$H_N(\sigma) = -\frac{1}{\sqrt{N}}\sum_{1 \le i < j \le N} J_{ij}\sigma_i\sigma_j + h\sum_{i=1}^N \sigma_i$ [3] 


where the normalization is chosen to ensure that the 
variance of $H_N$ is an extensive quantity. Although 
the two Hamiltonians superficially look similar, the 
main feature that allows one to solve the Curie- 
Weiss model is absent in the SK model: there is no 
way to write the Hamiltonian as a function of 
macroscopic variable(s) such as the magnetization. 
This implies that all methods known to solve the 
Curie-Weiss model fail here. The approach used 
systematically in the physics literature to overcome 
this difficulty is to try to compute the mean free 
energy $f_{\beta,N} = -(1/\beta N)\,\mathbb{E}\ln Z_{\beta,N}$ using the formal 
identity $\ln x = \lim_{q \downarrow 0} q^{-1}(x^q - 1)$. For $q \in \mathbb{N}$, one 
easily sees that (putting $h = 0$) 


$\mathbb{E}Z^q_{\beta,N} = \sum_{\sigma^1,\dots,\sigma^q} \exp\left(\frac{\beta^2}{2N}\sum_{a,b=1}^q \sum_{i<j} \sigma^a_i\sigma^a_j\sigma^b_i\sigma^b_j\right)$ 


This expression already looks more like the partition 
function of an ordinary mean-field model, and 
the computation with standard methods seemed 
feasible. However, passage to the limit $q \downarrow 0$ remains 
a highly risky enterprise, and it took the genius of 
Parisi to develop an approach that provided at least 
a physically meaningful and convincing answer. 
The replica method being dealt with elsewhere in 
this encyclopedia, this approach is not explained 
any further here, although we will explain the 
nature of the result in the light of recent rigorous 
work later on. 
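
The formal identity behind the replica method is elementary for a fixed number $x > 0$, as the following minimal numerical sketch shows (the function name and the sample value of $x$ are arbitrary choices). The difficulty described above is a different one: the moments of the partition function are computed only for integer $q$, and the continuation to $q \downarrow 0$ is not justified by this identity alone.

```python
from math import log

def replica_approximation(x, q):
    """(x**q - 1) / q, which tends to ln(x) as q decreases to 0."""
    return (x ** q - 1.0) / q

x = 7.5
for q in (1.0, 0.1, 0.01, 0.001):
    # The middle column approaches the last one as q shrinks.
    print(q, replica_approximation(x, q), log(x))
```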


Site Disordered Models 


The difficulties encountered with the random-bond 
interactions led readily to proposals of mean-field 
models that were closer to the Curie-Weiss model, 
in the sense that they allowed the 
Hamiltonian to be written as a function of macroscopic 
variables. The most important of these 
models was introduced by Figotin and Pastur. Here 
the disorder was introduced as an M-dimensional 
vector $\xi_i$ for each site i. The components of this 
vector are usually taken as i.i.d. random variables $\xi_i^\mu$ 
taking values $\pm 1$ with equal probability. One can 
then introduce M-dimensional vectors as macroscopic 
variables that generalize the magnetization, 
with components 


$m_N^\mu(\sigma) \equiv N^{-1}\sum_{i=1}^N \xi_i^\mu\sigma_i$ 


The Hamiltonian can then be written as 


$H_N(\sigma) = -\frac{N}{2}\sum_{\mu=1}^M (m_N^\mu(\sigma))^2 = -\frac{1}{2N}\sum_{\mu=1}^M\sum_{i,j=1}^N \xi_i^\mu\xi_j^\mu\sigma_i\sigma_j$ 


These models were indeed found to be solvable with 
tools similar to those used in the Curie-Weiss case; 
however, they proved disappointing in that the 
solution did not show the characteristic features 
expected in a spin glass. In fact, it turns out that 
these models behave very much like a mean-field 
ferromagnet, except that they display not just 
two equilibrium states at low temperatures, but 2M 
of them, concentrated on spin configurations $\sigma$ for 
which $m_N(\sigma)$ takes values close to one of the values 
$\pm m^*(\beta)e_\mu$, where $e_\mu$ is the $\mu$th unit vector in $\mathbb{R}^M$ and 
$m^*(\beta)$ solves the equation $m = \tanh(\beta m)$ known 
from the Curie-Weiss model. This model might 
have been forgotten, had it not been rediscovered in 
1982 by Hopfield in the context of neural networks. 
Hopfield realized that if the $\sigma_i$ are interpreted as 
the activation states ("firing" and "not firing") of 
neurons in the brain, the form of the interaction in 
this model is exactly the one proposed earlier by 
Hebb for synaptic interaction between neurons 
having "learned" the M "patterns" $\xi^\mu$ in the past. 
He went on to interpret $H_N(\sigma)$ as the Lyapunov 
function of the retrieval algorithm by which the 
brain would recognize the learned pattern. Naturally, 
the fact that the configurations $\xi^\mu$ are minima 
of $H_N$ then implies the functioning of the algorithm. 
The important observation of Hopfield was that, 
based on numerical experiments, the algorithm 
failed when M became too large. In fact, he 
observed a breakdown of the memory if $M > 0.14N$. 
This meant that the interesting asymptotics 
in this model required considering M as an 
increasing function of N. This regime was not 
covered by large-deviation-type results and an 
intensive program to investigate this model was 
initiated. Again, the replica method could be 
employed and yielded a very rich structure of the 
model, including an explanation of the findings of 
Hopfield. These models also turned out to be an 
important starting point for the rigorous analysis. 
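
The retrieval mechanism can be illustrated on a toy example. The following is a minimal sketch, not Hopfield's original experiment: it assumes Hebbian couplings $J_{ij} = N^{-1}\sum_\mu \xi_i^\mu\xi_j^\mu$ without self-coupling, zero-temperature asynchronous updates, and, to keep the outcome deterministic, three mutually orthogonal patterns on $N = 32$ sites instead of random ones. All function names are hypothetical.

```python
def rademacher_patterns(n_bits, masks):
    """Mutually orthogonal +/-1 patterns of length 2**n_bits (an illustrative choice)."""
    size = 1 << n_bits
    return [[1 - 2 * (bin(i & mask).count("1") % 2) for i in range(size)]
            for mask in masks]

def energy(patterns, sigma):
    """H_N(sigma) = -(N/2) sum_mu m_mu(sigma)^2, with m_mu the overlap with pattern mu."""
    n = len(sigma)
    overlaps = [sum(x * s for x, s in zip(p, sigma)) / n for p in patterns]
    return -0.5 * n * sum(m * m for m in overlaps)

def retrieve(patterns, sigma, max_sweeps=10):
    """Asynchronous zero-temperature dynamics: align each spin with its local field."""
    sigma = list(sigma)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(sigma)):
            # N times the local field sum_{j != i} J_ij sigma_j for Hebbian couplings.
            field = sum(
                p[i] * (sum(x * s for x, s in zip(p, sigma)) - p[i] * sigma[i])
                for p in patterns
            )
            if sigma[i] * field < 0:
                sigma[i] = -sigma[i]
                changed = True
        if not changed:
            break
    return sigma

patterns = rademacher_patterns(5, [1, 2, 4])   # N = 32, M = 3
noisy = list(patterns[0])
for i in (0, 7, 19):                           # corrupt three sites
    noisy[i] = -noisy[i]

recalled = retrieve(patterns, noisy)
print(recalled == patterns[0])                 # True
print(energy(patterns, noisy), energy(patterns, recalled))
```

The corrupted configuration is driven back to the stored pattern while the energy decreases, which is the Lyapunov property referred to above. With random patterns and $M$ growing proportionally to $N$, the interference between patterns is no longer negligible, and this is the regime in which the memory eventually breaks down.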


Gaussian Processes and Derrida’s Models 


While the models discussed so far were motivated 
from the point of view of randomly interacting 
spins, Derrida had the consequential idea to view 
the Hamiltonian of such a model simply as a 
random process indexed by the set of all spin 
configurations. In the case of the SK model, this 
process was, moreover, a Gaussian process and thus 
characterized entirely by its mean and covariance. For 
h = 0 we see that 


$\mathbb{E}H_N(\sigma)H_N(\sigma') = \frac{N}{2}\left(R_N(\sigma,\sigma')^2 - \frac{1}{N}\right)$ 


where $R_N(\sigma,\sigma') \equiv N^{-1}\sum_{i=1}^N \sigma_i\sigma'_i$ is usually called the 
overlap. This opened the view to a much larger 
class of models. In particular, the simplest model 
from this perspective corresponds to taking $H_N(\sigma)$ as 
a process of i.i.d. random variables. Derrida called 
this the random-energy model (REM). He also noted 
that it could be seen as the limit of a sequence of the 
so-called p-spin SK models corresponding to the 
covariance of the Hamiltonian being $N(R_N(\sigma,\sigma'))^p$. 
On the other hand, Derrida observed that another 
class of models could be defined that were easier to 
analyze while exhibiting many of the complex 
properties of the SK model. These are obtained by 
choosing the covariance not as a function of the 
overlap (resp. the Hamming distance), but of an 
ultra-metric distance related to $d_N(\sigma,\sigma') = N^{-1}(\inf\{i: \sigma_i \neq \sigma'_i\} - 1)$. 
These models, called generalized 
random-energy models (GREM), were analyzed by 
Derrida and Gardner in the 1980s and are now the 
only models where the full predictions of the Parisi 
theory can be rigorously justified. This is discussed 
in some detail later. 
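
That the covariance of the SK Hamiltonian depends on the two configurations only through their overlap can be checked exhaustively for small $N$. The following is a minimal sketch, assuming the normalization $H_N(\sigma) = -N^{-1/2}\sum_{i<j}J_{ij}\sigma_i\sigma_j$ with independent standard Gaussian couplings and $h = 0$; under that assumption the covariance equals $N^{-1}\sum_{i<j}\sigma_i\sigma_j\sigma'_i\sigma'_j$, and the script compares it with $(N/2)R_N^2 - 1/2$ in exact arithmetic.

```python
from fractions import Fraction
from itertools import product

def overlap(sigma, tau):
    """R_N(sigma, tau) = (1/N) sum_i sigma_i tau_i, as an exact fraction."""
    return Fraction(sum(s * t for s, t in zip(sigma, tau)), len(sigma))

def sk_covariance(sigma, tau):
    """E[H_N(sigma) H_N(tau)] for H_N = -(1/sqrt(N)) sum_{i<j} J_ij s_i s_j (h = 0).

    With independent standard Gaussian J_ij only matching pairs (i, j) contribute,
    leaving (1/N) sum_{i<j} sigma_i sigma_j tau_i tau_j.
    """
    n = len(sigma)
    total = sum(sigma[i] * sigma[j] * tau[i] * tau[j]
                for i in range(n) for j in range(i + 1, n))
    return Fraction(total, n)

N = 5
configs = list(product((-1, 1), repeat=N))
ok = all(
    sk_covariance(s, t) == Fraction(N, 2) * overlap(s, t) ** 2 - Fraction(1, 2)
    for s in configs for t in configs
)
print(ok)  # True
```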


Further Models and Applications 


There is a wealth of problems that can be 
interpreted in terms of disordered mean-field 
models, and which may be analyzed using methods 
developed here. Some of the most notable ones 
that have received more attention lately are the following. 
The perceptron, a feed-forward neural network, 
was analyzed first by Gardner using the replica 
method. Very recently, Shcherbina and Tirozzi gave 
a rigorous justification of this result. The 
p-satisfiability problem is an important problem in 
computer science that also can be analyzed with the 
replica method. Rigorous results are still very 
limited. The number partitioning problem can be 
formulated as a random-energy model. Also, the 
most famous problem in combinatorial optimization, 
the traveling salesman problem, can be solved 
heuristically with the replica method. Another 
emerging field is the application to coding theory. 


Formulation of the Problem 


Given a model, that is, a Hamiltonian function 
defined as a random process, the ultimate goal is 
to describe the asymptotic properties of the 
corresponding Gibbs measure, ideally identifying 
a (random or deterministic) limiting measure, as a 
function of the temperature, $\beta^{-1}$, and other 
parameters, such as the magnetic field $h$. 

The first steps in this direction concern global 
properties: 


• Does the ground-state energy density, 


$\lim_{N\uparrow\infty} N^{-1}\max_{\sigma\in S_N} H_N(\sigma)$ 


converge (in what sense?) and what is the limit? 

• What is the limit of the free energy 
$f_{\beta,N} \equiv -\frac{1}{\beta N}\ln Z_{\beta,N}$? 


It was noted in the mid-1990s that such 
quantities are usually self-averaging, for example, 
in the sense that 


$\lim_{N\uparrow\infty}(f_{\beta,N} - \mathbb{E}f_{\beta,N}) = 0$, a.s. 


due to the concentration of measure phenomenon. 
However, until very recently, the existence of the 
limits was considered an open problem in most of 
the models described above. Guerra and Toninelli 
(2002) discovered that a clever use of comparison 
inequalities for convex functions of Gaussian 
processes allows one to prove a priori the existence 
of limits at least in the case of models based on 
Gaussian processes (SK, GREM). The main task is then 
the computation of the values of the limit. 

If the free energy is known as a function of 
sufficiently many parameters, one can frequently 
compute a number of correlation functions that 
characterize the limiting measure as well. What one 
should compute is somewhat model dependent. 


Geometry of Gibbs Measures 
and Multi-Overlap Distributions 


The problem of satisfactorily describing the asymptotic 
geometric properties of random Gibbs 
measures on $\{-1,1\}^N$ is rendered difficult as the 
symmetries of the problem make the use of local 
topologies seem unattractive. A reasonable way of 
solving this problem is as follows. Let $D_N$ be a 
distance on $S_N$ normalized so that $\max_{\sigma,\tau\in S_N} D_N(\sigma,\tau) = 1$. 
Then consider the mass distribution 
around any fixed point $\sigma$, 


$m_\sigma(x) \equiv \mu_{\beta,N}(D_N(\sigma,\sigma') \le x)$ 


and construct the biased empirical average 


$\mathcal{K}_{\beta,N} \equiv \sum_{\sigma\in S_N} \mu_{\beta,N}(\sigma)\,\delta_{m_\sigma(\cdot)}$ 


The set of distributions of these random measures 
is compact (with respect to the weak topology) 
and thus we can expect to construct limits. The 
law of $\mathcal{K}_{\beta,N}$ is fully determined by the family of 
averaged distributions of the distances between 
independent copies of $\sigma$ drawn from the Gibbs 
measures, 


$\mathbb{E}\mu^{\otimes m}_{\beta,N}\left(D_N(\sigma^1,\sigma^2) \in \cdot, \dots, D_N(\sigma^{m-1},\sigma^m) \in \cdot\right)$ 

In the SK models, one chooses 

$D_N(\sigma,\tau) \equiv 1 - \frac{1}{N}\sum_{i=1}^N \sigma_i\tau_i$ 


so that these quantities can be expressed as distributions 
of the overlaps $(1/N)\sum_i \sigma_i\tau_i$ between m 
"replica" spin variables. In the GREM models, it is 
natural to choose as distance the lexicographic distance 
used in the construction of the models. In this case, the 
limits of $\mathcal{K}_{\beta,N}$ can be constructed explicitly and it was 
shown that they can be expressed in terms of the size- 
biased empirical family size distribution of a certain 
continuous state branching process via a model- 
dependent time change. Since this plays a key role 
not only in the GREMs but in other models as well, we 
will go into some detail to elucidate this structure. 


Neveu’s Process and Random 
Genealogies 


The random structure of the limiting Gibbs 
measures of the GREM models (and presumably 
also the SK models, even though this is not proven) 
can be traced to a continuous-state branching 
process introduced by Neveu, and an induced 
associated random genealogy on the unit interval. 
Let $Z_t$ be a time-homogeneous continuous-time 
Markov process with state space $\mathbb{R}_+$, characterized 
by the Laplace transform of its transition kernel 


$\mathbb{E}(e^{-\lambda Z_t} \mid Z_0 = a) = \exp(-a\lambda^{e^{-t}})$ 


Based on this process, construct a two-parameter 
process $Z(t,a)$ with the property that, for any $a, b > 0$, 
the processes $Z(\cdot,a)$ and $Z(\cdot,a+b) - Z(\cdot,a)$ 
are independent and have the same laws as $Z_t$ with 
initial conditions $a$, resp. $b$. It follows that $Z(t,\cdot)$ is a 
stable subordinator with exponent $e^{-t}$. Now let 
$\theta_t(a) \equiv Z(t,a)/Z(t,1)$, as a function on [0,1], $\theta_t$ 
being a random probability distribution function (of 
pure point type). Any such family $\theta_t$ of distributions 
defines in a natural way a genealogical structure on 
[0,1]. Define the ancestor of $a \in [0,1]$ at time $t < 1$ 
to be $a_t(a) \equiv \theta_t(\theta_t^{-1}(a))$, where $\theta_t^{-1}$ is the right- 
continuous inverse of the nondecreasing function $\theta_t$. 

For $a, a' \in [0,1]$, we set $q(a,a') \equiv \sup\{s: a_s(a) = a_s(a')\}$. 
It is easy to see that 
$1 - q$ defines an ultra-metric distance. We can 
associate with this the size of the offspring 
of an ancestor at time t, $M_a(t) \equiv |\{a': q(a,a') \ge t\}|$, and 
its size-biased empirical distribution 


$\mathcal{K} \equiv \int_0^1 da\,\delta_{M_a(\cdot)}$ 


In the GREM models, it can be shown that the 
quantity $\mathcal{K}_{\beta,N}$ converges (weakly in law) to the 
corresponding $\mathcal{K}$ obtained from a time change of 
the family of measures $\theta_t$, namely 

$\theta^m_t \equiv \theta_{\ln m(t) - \ln m(0)}$ 

where m is a nondecreasing function that can be 
computed explicitly. Namely, if $\mathbb{E}X_\sigma X_\tau = A(d_N(\sigma,\tau))$, 
and $\bar{a}$ denotes the right-derivative 
of the concave hull of A, then 


$m(x) = \min\left(\beta^{-1}\sqrt{2\ln 2/\bar{a}(x)},\, 1\right)$ 


As explained below, similar results are expected in 
the SK models. 


Interpolation Methods and Guerra's 
Integral Representation 


Among the very important tools for the analysis of 
Gaussian models in particular have been the interpolation 
methods that allow one to compare 
functions of processes with different covariance. 
While these methods go back to early work on 
Gaussian processes (Slepian, Kahane), they have 
been employed with remarkable success in the 
present context. Mostly, they consist in introducing 
an interpolating Hamiltonian $H^t(\sigma) \equiv \sqrt{t}\,H(\sigma) + \sqrt{1-t}\,K(\sigma)$, 
where K is a reference process that has 
certain desired properties. Given any function F of 
the process (e.g., the free energy of the model), one 
then represents 


$F(H) = F(K) + \int_0^1 dt\,\frac{d}{dt}F(H^t)$ 


Often the derivative on the right-hand side can be 
controlled rather well, for example, because of some 
obvious positivity properties. 


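Example 1 (Guerra and Toninelli). Choose 


$H^t_N(\sigma) = -\sqrt{\frac{t}{N}}\sum_{1\le i<j\le N} J_{ij}\sigma_i\sigma_j - \sqrt{\frac{1-t}{M}}\sum_{1\le i<j\le M} J'_{ij}\sigma_i\sigma_j - \sqrt{\frac{1-t}{N-M}}\sum_{M<i<j\le N} J''_{ij}\sigma_i\sigma_j$ 


with independent families of standard Gaussian couplings 
$J$, $J'$, $J''$, and consider the free energy $F(H^t_N)$, 
normalized to be extensive. Then, first, 
$F(H^0_N) = F(H_M) + F(H_{N-M})$. On the other hand, 
the derivative $\frac{d}{dt}F(H^t_N)$ is a Gibbs average of sums 
over pairs $i<j$ of the terms $J_{ij}\sigma_i\sigma_j/\sqrt{tN}$, 
$J'_{ij}\sigma_i\sigma_j/\sqrt{(1-t)M}$, and $J''_{ij}\sigma_i\sigma_j/\sqrt{(1-t)(N-M)}$. 

A key tool to be used at this stage is the so-called 
Gaussian integration by parts formula, $\mathbb{E}gf(g) = \mathbb{E}f'(g)$, 
valid for a standard Gaussian random variable g and a 
sufficiently regular function f. Applied here, it turns 
the derivative into an expression in the overlaps whose 
sign is determined by convexity. 
This proves superadditivity of $Nf_{\beta,N}$, 

$N\mathbb{E}f_{\beta,N} \ge M\mathbb{E}f_{\beta,M} + (N-M)\mathbb{E}f_{\beta,N-M}$ 


which, in turn, implies convergence of $\mathbb{E}f_{\beta,N}$ to 
a limit $f_\beta$. Moreover, standard concentration 
of measure estimates then show that $f_{\beta,N}$ also 
converges almost surely. 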
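
The Gaussian integration by parts formula used in Example 1 can be checked numerically for a concrete function. The following is a minimal sketch with the arbitrary test function $f = \tanh$ and a plain trapezoidal quadrature (both are illustrative choices, not part of the argument above).

```python
from math import exp, pi, sqrt, tanh

def gaussian_expectation(func, half_width=10.0, steps=20000):
    """Approximate E[func(g)], g ~ N(0, 1), by the trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * func(x) * exp(-0.5 * x * x)
    return total * h / sqrt(2.0 * pi)

f = tanh                                     # an arbitrary smooth test function
f_prime = lambda x: 1.0 - tanh(x) ** 2       # its derivative

lhs = gaussian_expectation(lambda x: x * f(x))   # E[g f(g)]
rhs = gaussian_expectation(f_prime)              # E[f'(g)]
print(abs(lhs - rhs) < 1e-8)
```

In the interpolation argument the same identity is applied to each coupling separately, with $f$ the Gibbs average of $\sigma_i\sigma_j$ regarded as a function of that coupling.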


Example 2 (Guerra, Aizenman-Sims-Starr). A 
more complicated application of the interpolation 
method allows one to relate the free energy to 
Parisi's solution. This was first found by Guerra 
(2003), but a different, and in some sense more 
intuitive, formulation was given later by Aizenman 
et al. (2003). It is based on the following construction. 
We consider a centered Gaussian process $H_N(\sigma)$ 
on $S_N$ with covariance given by $g(R_N(\sigma,\sigma'))$ for 
some even convex function $g: [-1,1] \to [0,1]$. Let 
us take $F(H_N) \equiv \ln E_\sigma e^{\beta\sqrt{N}H_N(\sigma)}$ (the a priori expectation 
$E_\sigma$ need not be symmetric, but may incorporate 
a magnetic field). Before using comparison, we now 
want to go to a larger space. For this, introduce some set 
$\mathcal{A}$ equipped with some positive-definite quadratic form 
$q$, normalized such that $q_{\alpha,\alpha} = 1$ and $|q_{\alpha,\alpha'}| \le 1$ 
for all $\alpha,\alpha' \in \mathcal{A}$. Let $P_\alpha$ denote some probability measure 
on $\mathcal{A}$. Now introduce a centered Gaussian process 
$\kappa_\alpha$ on $\mathcal{A}$, independent of $H_N$, whose covariance is 
given by $\mathbb{E}\kappa_\alpha\kappa_{\alpha'} = q_{\alpha,\alpha'}\,g'(q_{\alpha,\alpha'}) - g(q_{\alpha,\alpha'})$. 
Define 


$G(H_N + \kappa) \equiv \ln\left(E_\sigma E_\alpha\, e^{\beta\sqrt{N}(H_N(\sigma) + \kappa_\alpha)}\right)$ 


Obviously, $G(H_N + \kappa) = F(H_N) + F(\kappa)$, where $F(\kappa) \equiv \ln(E_\alpha e^{\beta\sqrt{N}\kappa_\alpha})$. 
The amazing idea is now to 
compare the process $(H_N + \kappa)$ with another process 
$\eta_{\sigma,\alpha}$, whose covariance is a linear function of $R_N(\sigma,\sigma')$ 
(this is in some sense a Slepian's process), and that 
otherwise is smaller than the covariance of $(H_N + \kappa)$; 
to wit 


$\mathbb{E}\eta_{\sigma,\alpha}\eta_{\sigma',\alpha'} \equiv R_N(\sigma,\sigma')\,g'(q_{\alpha,\alpha'})$ 


By these choices of covariances, one has that for $x \in [-1,1]$, 
$y \in [0,1]$, since g is even and convex, 

$g(x) + yg'(y) - g(y) \ge xg'(y)$ 


It is an immediate consequence of Kahane's theorem, 
respectively the same interpolation argument 
given above, that 


$\mathbb{E}G(H_N + \kappa) \le \mathbb{E}G(\eta)$ 

which translates into 

$\mathbb{E}F(H_N) \le \mathbb{E}G(\eta) - \mathbb{E}F(\kappa)$ 


It is clear that we can optimize this bound by 
choosing A,q, and Pa. Of course, the difficulty 
would be to find such a minimum. A first 
simplification of this optimization problem is to 
consider instead of the deterministic structure of P 
and q random-probability measures on the space of 
probability measures and quadratic forms on .A, to 
average over the preceding equation with respect to 
their laws, and then take the infimum over all such 
random structures. This gives a (still incalculable) 
bound that Aizenman et al. (2003) have shown to be 
asymptotically sharp, that is, they showed that 


lim EF(Hx) = lim int E (EG(n) — EF(x)) 


where u is short for all probability measures on the 
space of (Pa,do,a) on A (called “random overlap 
structures" (rosts) in Aizenman et al. (2003)). Guerra’s 
bound consists in restricting the infimum to a class of 
rosts where the bound is calculable ‘explicitly’. 
Maybe unsurprisingly, this is exactly the class of 
asymptotic models that have already arisen in the 
GREMs. In fact, we set A=[0, 1], Mt = {m: [0,1] — 
[0, 1], non-decreasing}, let g be the random genealo- 
gical distance associated to the family of measures 67”, 
and let P, be the probability measure on A whose 
distribution function is 07" (a). Then Guerra's bound 
states that 


$$\lim_{N\uparrow\infty} \mathbb{E}\,F(H_N) \le \lim_{N\uparrow\infty} \inf_{m\in\mathcal{M}}\, \mathbb{E}\,G(\eta) - \mathbb{E}\,F(\kappa)$$


where the expectations relate to all random quan- 
tities involved. By self-averaging, the same result 
holds almost surely. The right-hand side of this 
equation is known as (a particular formulation of) 
the famous Parisi solution. In fact, define the 
function f(q,y) as the solution of the nonlinear 
partial differential equation 


$$\partial_q f + \frac{1}{2}\left(\partial_y^2 f + m(q)\,(\partial_y f)^2\right) = 0$$

with final conditions

$$f(1,y) = \ln\cosh(\beta y)$$

These equations can be solved by elementary means
in the case when $m$ is a step function. It turns out
that, for given $m$,


2 pl 

8G (n) - EF(x)) = f(0,5,m, 8) -5 f ami) dria) 
where h = 37! cosh ! (E01). This solution was origi- 
nally obtained using the replica method. The preceding 
construction gives, at the least, a clear mathematical 
meaning to the objects involved. In particular, the 
notion of “ultra-metric zero-dimensional matrices,” 
appears now to be equivalent to ultra-metric structures 
on the unit interval. 
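
To indicate what “elementary means” could look like for a step function, here is a hedged sketch of the standard Hopf-Cole substitution. It assumes the equation reads $\partial_q f + \frac{1}{2}(\partial_y^2 f + m(q)(\partial_y f)^2) = 0$ with $f(1,y) = \ln\cosh(\beta y)$. The interval $[q_a, q_b]$, the constant value $m_a$, and the standard Gaussian variable $Z$ are notation introduced here for illustration only.

```latex
% Suppose m(q) = m_a > 0 is constant for q in [q_a, q_b], and set u = exp(m_a f).
\partial_q u + \tfrac12\,\partial_y^2 u
  = m_a\,u\left(\partial_q f + \tfrac12\,\partial_y^2 f
      + \tfrac12\,m_a\,(\partial_y f)^2\right) = 0
% Hence u solves a backward heat equation on [q_a, q_b], and
f(q_a, y) = \frac{1}{m_a}\,
  \ln \mathbb{E}\exp\!\left( m_a\, f\!\left(q_b,\; y + \sqrt{q_b - q_a}\,Z\right)\right)
% On an interval where m = 0 the equation is linear, and
f(q_a, y) = \mathbb{E}\, f\!\left(q_b,\; y + \sqrt{q_b - q_a}\,Z\right)
```

Starting from the final condition at $q = 1$ and applying these two rules interval by interval, from right to left, expresses $f(0,y)$ as a finite sequence of nested Gaussian integrals.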

In a recent paper, Talagrand (2003) proved
that the converse inequality also holds in the preceding
equation, confirming that Parisi's solution yields the
correct free energy in a large class of models of the
SK type.


Ghirlanda-Guerra Relations 


The appearance of a universal probabilistic structure
in the asymptotics of these models may appear
surprising. A partial explanation can be found in a
set of remarkable identities between multi-overlap
distributions that were first discovered by
Ghirlanda and Guerra (1998) in the context of SK
models. If $\mu_{\beta,N}^{\otimes n}$ denotes the $n$-fold product Gibbs
measure, the Ghirlanda-Guerra relations assert a
recursion relation of the form


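$$\mathbb{E}\,\mu_{\beta,N}^{\otimes n+1}\left(D_N(\sigma^k,\sigma^{n+1}) \le t \mid \mathcal{B}_n\right)
= \frac{1}{n}\sum_{l \ne k}^{n} \mathbb{E}\,\mu_{\beta,N}^{\otimes n}\left(D_N(\sigma^k,\sigma^l) \le t \mid \mathcal{B}_n\right)
+ \frac{1}{n}\,\mathbb{E}\,\mu_{\beta,N}^{\otimes 2}\left(D_N(\sigma^1,\sigma^2) \le t\right) + o(1)$$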


These relations hold generically for Gaussian mean-field
models, with $D_N$ being the distance through
which the covariance is defined. The proof of these
relations is based on Gaussian integration-by-parts
formulas and concentration-of-measure inequalities.
In the case of the GREM models, where $D_N$ is ultra-metric,
these recursions are sufficient to determine all
$n$-replica overlap distributions in terms of the 2-replica
distribution. On the other hand, the set of $n$-replica
overlap distributions determines the law of the process
$K$ and thus the geometry of the Gibbs measure. In
particular, they leave time changes of Neveu's process
as the only candidates for limit processes. In the case of
the SK models, the same does not hold a priori, since
the Hamming distance is not an ultra-metric. However,
since the Parisi solution is correct, this suggests
very strongly that asymptotically the overlap distances
are almost surely (with respect to the Gibbs measure)
ultra-metric. Then, the Ghirlanda-Guerra identities
also imply that the geometry of the Gibbs measures is
described by the same structure.


From Mean-Field to Lattice Models 


One of the widely discussed issues in the theory of spin
glasses is to what extent the results of mean-field
theory are relevant for lattice models. This issue has
been addressed elsewhere in this encyclopedia by
Newman and Stein. Here, we will only mention a
recent result of Franz and Toninelli (2004) that shows
that the free energy of the SK model can be represented
as the limit of the free energy of lattice models when
the range of the interaction tends to infinity while their
strength tends to zero in an appropriate way (the so-called
Kac models). This still leaves open many finer
questions, but hints at the fact that mean-field theory
bears at least some relevance for realistic spin glasses.


See also: Short-Range Spin Glasses: The Metastate
Approach; Spin Glasses.

Introduction


Loop spaces have been considered for their geometric
interest (Freed 1988), where the space
of based loops on a compact Lie group is endowed
with a Kählerian structure; see also the survey by
L Gross (1988). The harmonic analysis on loop
groups, developed by Pressley and Segal, is
reviewed by Hsu (1997). Loop groups also have
an impact in string theory (Bowick and Rajeev
1987). They are related to Yang-Mills theory (Levy
2003). A presentation of the history of measure on
infinite-dimensional spaces has been given by
P Malliavin (see Malliavin (1992) and references
therein). The main problem is the construction of
measures on the loop space which have the
quasi-invariance property. This has implications in
representation theory (Neretin 1994, Jones 1995).
Here we mainly concentrate on the nonlinear
stochastic point of view and its interplay with
geometry. The geometrical study of the space of
closed curves over a compact Riemannian manifold
$M$, that is, the loop space over $M$, was initiated by
Marston Morse in 1932. The loop space is itself a
manifold where one can define a Laplace-Beltrami
operator. A diffusion process can be considered on
this manifold. Wiener defined the Brownian loop
by the Fourier series


u(r) = SE g, a 


k 1 


where the $G_k$ are independent normal variables.
The time evolution of the Wiener loop and the
extension of the theory to the case of a compact
Riemannian manifold of finite dimension have been
considered by Airault and Malliavin (1996, and
references therein). The Brownian loop evolves in
the time parameter $t$ as a Brownian sheet where
the independent random variables $G_k$ are functions
of $t$.

Starting from the zero loop, one obtains at time $t$
a random loop, and the law of this loop gives a
measure on the loop space. A construction of this
measure with functional analysis on infinite-dimensional
manifolds was done by Gaveau and
Mazet (1979). The tools of stochastic analysis are
important to the subject. The loop space of
continuous maps from the circle to the multiplicative
group of complex numbers has a group
structure, hence the term “loop group.” On the loop
group, we consider the multiplicative Brownian
motion starting at one point of the circle and
conditioned to come back at this point at time $s$. It
defines a probability measure on the loop group.
One can also consider the set of continuous maps 
from the circle to the set of complex numbers of 
modulus equal to 1. The loop group is the space of 
continuous closed paths on a Lie group. More 
generally, on a Riemannian manifold M, the 
Brownian motion on M defines a Wiener measure 
on the loops over M. To go from the path space to 
the loop space, an important tool is the quasisure 
analysis in infinite dimension. The quasisure analysis 
was developed by Airault and Malliavin (1996, and 
references therein) to obtain disintegrations of the 
Wiener measure and they have used this tool in 
1992 to construct measures on the loop group. The 
main problems are: 


1. The construction of heat kernel measures and the 
existence of a Brownian motion on the loop 
space, the existence of pinned Wiener measures 
obtained as the law of Brownian motions condi- 
tioned on the loops. 

2. The quasi-invariance of these transition probability
measures under translation, or multiplication
if we have a multiplicative structure, or
under the infinitesimal action of suitable vector
fields. For the path space over the $n$-dimensional
Euclidean space $\mathbb{R}^n$, the Cameron-Martin theorem
(1944) ensures the existence of a density
which shows the quasi-invariance of the Wiener
measure under translations. For the quasi-invariance,
an important fact is the choice of
the metric on the Cameron-Martin space. In the
case of the Wiener measure, one considers the
paths of finite energy, $\int_0^1 |\dot h(s)|^2\,ds < +\infty$. This
corresponds to the metric “1.” P Malliavin
(1989, and references therein) discussed the
case of metrics $\alpha$ with $1/2 < \alpha < 1$.

3. To define the “good” Cameron subspace, that is,
find the vector fields that yield integration-by-parts
formulas. The question occurs whether
the Cameron-Martin space depends on time. For
the loop space, it has been proved by Driver
(2003) that it is not the case. A time evolution of
the tangent Cameron-Martin space could appear
eventually.

4. The determination of the support of the measures
(e.g., the Wiener measure is carried by the set of
Hölder functions of order $1/2 - \varepsilon$).

5. The absolute continuity of the measures with 
respect to each other. 
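
For orientation, the classical Cameron-Martin formula mentioned in item 2 can be written out as follows. This is a sketch in notation chosen here rather than taken from the text: $\mu$ is the Wiener measure on paths $x:[0,1]\to\mathbb{R}^n$ starting at $0$, $h$ is a path of finite energy with $h(0)=0$, and $F$ is a bounded measurable functional.

```latex
\int F(x + h)\,\mu(dx)
  = \int F(x)\,
    \exp\!\left( \int_0^1 \dot h(s)\cdot dx(s)
                 - \frac{1}{2}\int_0^1 |\dot h(s)|^2\,ds \right)\mu(dx)
```

The exponential factor is the density whose existence expresses quasi-invariance under the translation $x \mapsto x + h$; the stochastic integral in the exponent is well defined precisely because $h$ has finite energy.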
The Construction of Heat Measures
on the Loop Space and Their 
Quasi-Invariance 


The construction of measures giving a solution to 
the infinite-dimensional heat equation as well as the 
study of the quasi-invariance of the Wiener measure 
on the path space was started extensively in the 
work by Bismut, followed by Gross (1998), then by 
Aida and Elworthy (1995) where the loop group is a 
suitable manifold to extend to infinite-dimensional 
manifolds the log-Sobolev inequalities, by Malliavin 
and Malliavin (1992, and references therein) where 
the measures on the path space and the path group 
have been studied. Consider a compact Lie group $G$
with unit $e$ and let $\mathcal{G}$ be its Lie algebra. From the
$\mathcal{G}$-valued Brownian motion, one can construct a
family of measures $(\mu_t)_{t>0}$ on the path space. These
measures $\mu_t$ are the images of the Wiener measure
on $\mathcal{G}$ through the Itô map

$$dg_x(\tau) = \sqrt{t}\; g_x(\tau)\,dx(\tau) \quad\text{with } g_x(0) = e \qquad [2]$$

The convolution of two measures $\mu_t$ and $\mu_{t'}$ is equal
to $\mu_{t+t'}$. By choosing the initial value of the path
randomly distributed according to the Haar measure 
on G, it defines a family of measures (14); «9 on the 
path space with 


| fete) = f dg f fuia») 


The Laplacian on the path group is defined by 


(ef (g) = lim =| f finis) - fla) 


The heat equation is valid for the measures (p); >0 
on the paths, 


a, | fledm(de) = [ vua 


Moreover, there is a quasi-invariance density k,,(g) 
defined on the path group (go and g are paths with 
values in G) such that 


Iu (goA) = / ks, (g) ur (dg) 


where $g_0A$ is the left translate of the subset
$A$ in the path space over $G$. This is a generalization
to the path space of the classical Cameron-Martin
theorem. Then, one can consider the loop space. The
free loop space is the set of continuous maps $g$ from
$[0, 1]$ to $G$ such that $g(0) = g(1)$, and the loop space
with a base point is the set of such maps for which
the common value $g(0) = g(1)$ is fixed. One can define the pinned
Brownian motion on the group $G$ to obtain the


pinned Wiener measures (iG? hs on the loop group 
(Malliavin and  Malliavin 1992, Driver and 
Srimurthy 2001). Denote by $p_t(g)$ the solution of
the heat equation on the group $G$. Let $g$ be a map
from $[0, 1]$ to the finite-dimensional Lie group $G$. For
$\tau_1, \tau_2, \ldots, \tau_n \in [0, 1]$, consider the evaluations of the
map $g$: $g_{\tau_1}, g_{\tau_2}, \ldots, g_{\tau_n} \in G$. Let $f$ be a real function
defined on $G$ and denote by $dg$ the Haar measure on
G. The measure ut“ on the loop group is given by 


J f(£.85.-...£,) dut" (g) 


= [ fim. abs (21 )Ptin—n) (81 82) °° 


x Pi, —, xe Ebi- (Bn) dgi denn dg, 


From př“, one defines a measure ul on the free 
loops by taking the mean over G as 


f feit = i dg [fru (da) 


The quasi-invariance property for the pinned Wiener 
measure was proved by Malliavin and Malliavin 
(1992). 

When the measures (u£),29 are obtained by 
conditioning and quasisure analysis, we have heat 
kernel measures. The case of heat kernel measures 
defined on the loop group has been studied by 
Airault and Malliavin by disintegrating the measures 
on the path space and using the quasisure analysis. 
The Laplacian on the loop group is defined as it has 
been for the Laplacian on the path space, 


(Az f)(g) = lim | f f (egi)ul (dgi) — f(g) 


but now the heat equation has a Kac's potential ®, 
defined on the loops. On the loop group, the heat 
equation is 


x f fuk (dl) = f i^n +D) — (3 


where 


1 ^ d 
o, (I) E ; ^gr oP) 


| 
| dl(s)I(s) ! 
0 


- -dimg 


The case of the circle, $G = \mathbb{R}/2\pi\mathbb{Z}$, is interesting.
The law of the functional 


n dl(s)l(s) ! 
0 


is given in Airault and Malliavin (1996, and
references therein). Moreover, the study of the heat
measures over the loop group of $\mathbb{R}/2\pi\mathbb{Z}$ brings new
identities on the classical Jacobi theta function


p:(@) =1+2 S cos(n6) e! ar g=0 


nl 


Let 


d 1 
C; = ~2 7 log p:(0) = " 
The following system of differential equations is 
given by Airault-Malliavin (1996, and references 
therein): 


To pass from path space to loop space, it is
convenient to use the “tubular chart” introduced
by Gross and the quasisure analysis developed by
Airault-Malliavin. Let $\Phi: \gamma \mapsto \gamma(1)\gamma(0)^{-1}$ be the map from the
path space to the group $G$; then the free loop
space over $G$ is $\Phi^{-1}(e)$. There exists a neighborhood
$V$ of the neutral element of $G$ such that $\Phi^{-1}(V)$ is
diffeomorphic to $V \times L(G)$, the product of $V$ with
the loop space over $G$. With this diffeomorphism,
one can disintegrate the measures on the path
space and obtain the measures on the loop space.
The Cameron-Martin formula on the path space
of the group $G$ is obtained from the Cameron-Martin
formula for the Wiener space and the Itô
map. Let $\gamma$ be a differentiable path with finite
energy on $G$, that is,

l ad 2 
f |p i| < + 


it holds 


J fisdinddg) = J (Gg). (g) (dg) 


Let us denote by $(\cdot\,|\,\cdot)_{\mathcal{G}}$ the Euclidean scalar product on
the Lie algebra $\mathcal{G}$; then the density is given by


à; d 
k.(g) =exp E [ Ge Erodes) 


2 
«| 
G 


The previous approach relies on the heat equation 
on the loop space. Thus, the metric on the 
Cameron-Martin loop or path space is important. 


G 
1 1 


ak ad 
2t Jo 


Y)! IG) 
The problem of quasi-invariance for metrics $\alpha$ with
$1/2 < \alpha < 1$ relates to the random series


nth = y ts 4 


where the $G_k$ are independent normal variables.
Driver (2003) solved the problem for $1/2 < \alpha < 1$
by Riemannian geometry in infinite dimension.
The Ricci curvature appears in the integration-by-parts
formulas on the loop space. The case of the
metric $1/2$ is out of reach. Fang (1999) calculated
the Ricci curvature of the loop manifold for
metrics $\alpha > 1/2$ and showed that when $\alpha \to 1/2$,
these Ricci curvatures tend to a limit. Another
presentation of the problem is that of Pickrell
(1987), who obtains a family of quasi-invariant
measures on Grassmannians.

Given a family of measures $(\mu_t)_{t>0}$ on the path
space of a Riemannian manifold, one defines a heat
operator as a family $(\mathcal{L}_t)_{t>0}$ of operators depending
upon $t \in [0, +\infty[$ such that

$$\int \mathcal{L}_t F\, d\mu_t = \frac{d}{dt}\int F\, d\mu_t \qquad [5]$$


where $F$ is a function defined on the path space. The
heat equation with a potential as [3] gives an
example of a heat operator. Heat operators have
been constructed for the path space over $\mathbb{R}^n$ by
Airault-Malliavin, obtaining, after an integration by
parts on the path space, a heat operator of first
order. This introduces the notion of dilatation vector
fields on the path space. In the case of the flat
Wiener space, to each point $x$ in the path space is
associated the dilatation vector field $Y$ such that
$(Yf)(x) = (x \mid (\operatorname{grad} f)(x))$. This gives a rescaling of the
Wiener measure under dilatations. This idea has
been exploited by Mancino (1999), who extended
the method to free loop groups.


Integration-by-Parts Formulas 


The Cameron-Martin space plays the role of the 
tangent space to the Wiener space. The integration- 
by-parts formulas are an infinitesimal version of the 
Cameron-Martin quasi-invariance property. Let $G$
be a compact Lie group or any product of $\mathbb{R}^n$ by a
compact Lie group. For a vector field $z$, the
differentiation on the right $\partial_z^{\mathrm{right}}$ and the differentiation
on the left $\partial_z^{\mathrm{left}}$ are given by

$$\partial_z^{\mathrm{right}} F(p) = \lim_{\varepsilon\to 0} \frac{F(p\,\exp(\varepsilon z)) - F(p)}{\varepsilon}$$

and

$$\partial_z^{\mathrm{left}} F(p) = \lim_{\varepsilon\to 0} \frac{F(\exp(\varepsilon z)\,p) - F(p)}{\varepsilon}$$

The operator $\partial_z^{\mathrm{right}}$ commutes with the translation on
the left: for a left translation $\ell$, $\partial_z^{\mathrm{right}}(F \circ \ell) =
(\partial_z^{\mathrm{right}} F) \circ \ell$, and vice versa.

For the measures on the path space or loop space,
the problem is to prove the integration-by-parts
formulas. On the path space of $G$, let $\mu$ be the
Wiener measure on the set $P_e(G)$ of paths starting from $e$;
there exists a density $k_z$ such that $\mathbb{E}[\exp(c\,k_z)]$ is
finite and


[a Fg) du. = | Fk duo. (o) 
P.(G) P.(G) 


The density $k_z$ is defined on the path space by


^ 


1 
k-(g) = f « g(t)z (t)g(t) !,du(t) > 


This was proved by a number of authors (see, e.g.,
Pickrell (1987) and, in a geometrical context,
Cruzeiro and Malliavin (1996)).

The existence of a density for the differentiation
on the left is valid for any Lie group. This is not true
for the differentiation on the right. If $G$ is
noncompact or is not the product of $\mathbb{R}^n$ by a
compact Lie group, the existence of $k_z$ is not proved
on the right. This comes from the fact that the map
Ad, defined on the path group as a parallel transport,
does not preserve the Cameron-Martin subspace. In
the case where $G$ is not a product of a flat space by
a compact Lie group, the Cameron space, which is a
kind of “tangent space” to the infinite-dimensional
loop manifold, is not closed under the Lie bracket of
vector fields.

The integration-by-parts formulas are obtained
with the stochastic calculus of variations. On a group
$G$, consider $Y_1, Y_2, \ldots, Y_p$, $p$ independent left-invariant
vector fields. Let $\mathcal{G}$ be the Lie algebra of
$G$. The second-order differential operator
$\Delta = \frac{1}{2}\sum_{k=1}^{p} Y_k^2$ defines a left-invariant diffusion $g_w(t)$ on
the group $G$ with the stochastic equation
$dg_w(t) = g_w(t)\sum_{k=1}^{p} (Y_k)_e \circ dw^k(t)$, where $(w^k)$ are independent
Brownian motions on the Euclidean space
$\mathcal{G}$. In the work by Malliavin and Malliavin (1992,
and references therein), the stochastic calculus of
variations is done with the right-invariant connection
on the Lie group by setting


gh d " 
h" ght — de |«—o (g,..5)0g,;! 


where $h$ is a differentiable function of $t$ with
values in the Lie algebra $\mathcal{G}$, with finite energy
$\int_0^1 |\dot h(s)|^2\,ds < +\infty$. By taking the derivative with
respect to $\varepsilon$ in the Stratonovich equation
m, i 
g(t) odg'(t) 


and letting c — 0, it turns out that "8" is a differenti- 
able function of ¢ and its derivative is given by 


= dw(t) + eb (t) dt 


AL. us MN. diced 

et) = g OP (tgelt) ^ 

^a 
The situation is not the same for 

ag d " 
gt = Too Su (Butoh) 
€|c—0 

where dól"(r) is a stochastic differential. This 


generalizes to an arbitrary Riemannian manifold
using a coupling of connections (see Airault and
Malliavin (1996), and references therein). The
construction of the appropriate Cameron subspace,
that is, the choice of the infinitesimal action of
vector fields on the measure, is of importance. In the
commutative case of the path space over $\mathbb{R}^n$, the
classical Cameron-Martin subspace of paths $h$ such
that $\int_0^1 |\dot h(s)|^2\,ds < +\infty$ is time invariant. To define
the vector fields acting on the path (or loop) space
over $M$, it is necessary to consider the geometry of
the manifold $M$. The infinitesimal transformations
which preserve the Riemannian metric are called
Riemannian connections. In the case where $M$ is a
group, the natural connections are those defined by 
the parallelism on the group. For a Riemannian 
manifold, Driver proved the existence of integration- 
by-parts formulas for the measures on the path 
space of M when M is endowed with a torsion skew- 
symmetric connection. The Levi-Civita connection, 
since it is torsionless, is of course a Driver (2003) 
connection. If the connection is not skew-symmetric, 
then two coupled connections permit study of the 
c-variation or “reduced variation" of a path, and one 
obtains a Cameron-Martin formula on the path and 
on the loop space of the Riemannian manifold M 
(Fang 1999). The method of reduced variation can be 
used to obtain the integration-by-parts formulas over 
path and loop spaces. Another approach to the quasi- 
invariance problem, using two-parameter processes, 
has been provided by Norris (1995). 


The Support of the Measures and 
Absolute Continuity with Respect 
to Each Other 


Given a Riemannian manifold $M$, let $(\mu_t)_t$ be the heat
kernel measures on the path space of $M$ and let $(\nu_t)_t$
be heat kernel measures on the loop space of $M$; the
question arises whether $\nu_t$ is absolutely continuous
with respect to $\mu_s$. For a connected compact Lie
group $G$, consider the path and loop groups on $G$.
The pinned Wiener measure on the loop group is
defined as the law of a $G$-valued Brownian motion
starting at $e$ and conditioned to end at $e$, and the heat
kernel measure is the endpoint distribution of
Brownian motion on the loop group.

It has been shown (Driver and Srimurthy 2001) 
that the heat kernel measure is absolutely continuous 
with respect to the pinned Wiener measure, and that 
the Radon-Nikodym derivative is bounded. This 
proof relies on the heat formula with a potential 
[3], which is satisfied by the heat kernel measure. 
They give a new proof of this heat formula. When the 
group G is simply connected, Aida and Driver (2000) 
prove that the heat kernel measure over a based loop 
group, constructed by using the Brownian motion is 
equivalent to the Brownian bridge measure over a 
based loop group. When G is the circle, the Radon- 
Nikodym derivative of the heat kernel measure with 
respect to the pinned Wiener measure can be 
calculated in terms of the Jacobi theta function 
(Driver and Srimurthy 2001). On the loop space of
$\mathbb{R}^n$, at time $t$, the two measures, “heat kernel” and
“pinned Wiener,” are the same.


See also: Abelian and Nonabelian Gauge Theories Using
Differential Forms; Lie Groups: General Theory; Malliavin
Calculus; Path Integrals in Noncommutative Geometry.

Introduction


The theory of metastability studies the states of
matter which “should not be there,” but which
can still be observed, albeit only for a short time.
One example is water cooled below zero
temperature. This supercooled water can stay liquid,
but not for a long time, and it then freezes abruptly.
Such states are called metastable. They are not
equilibrium states; at negative temperatures the only
equilibrium state of water is ice. Physically, these
metastable states are produced from the equilibrium
states by slowly changing the external parameters,
such as the temperature (or magnetic field): one
takes, for example, water (extremely purified) at low
positive temperature, $T > 0$, and then lowers the
temperature slowly to negative values $T < 0$. Thus,
the family of metastable states, $s_T$, $T < 0$, should
be thought of as a continuation of the family $s_T$, $T > 0$,
of equilibrium states through the point of phase
transition $T_c = 0$, the critical temperature at which these
states cease to exist as equilibrium states.

Below we will present rigorous results which
validate the above picture for the case of the 2D
Ising model. They are contained in Schonmann and
Shlosman (1998). The relevant external parameter
in this case will be the magnetic field, $h$.

It turns out that the lifetime of metastable states is 
determined by the quantities given by the Wulff 
construction. 


Equilibrium States and Dynamics 


Let us denote the set $\{-1,+1\}^{\mathbb{Z}^2}$ of the Ising model
configurations $\sigma$ by $\Omega$. Two configurations are
specially relevant, the one with all spins $-1$ and the
one with all spins $+1$. We will use the simple
notation $-$ and $+$ to denote them.

Observables are just functions on $\Omega$. Local observables
are those which depend only on the values of
finitely many spins.

We will consider the formal Hamiltonian 


H, (o) = — * c(x)a -hX ' a(x) [1] 


x,y n.n. 
where $h \in \mathbb{R}^1$ is the external field and $\sigma \in \Omega$ is a generic
configuration. We define, for each set $\Lambda \subset\subset \mathbb{Z}^2$ and
each boundary condition $\xi \in \Omega$,


Hyep(o) — —9 . o(x = Š e(x)&(y) 
p xe S 
— hb o(x) 
xc€A 


The “grand canonical Gibbs measure” in $\Lambda$ with
boundary condition $\xi$ under external field $h$ and at
temperature $T$ is defined on $\Omega_\Lambda$ as

$$\mu_{\Lambda,\xi,T,h}(\sigma) = \frac{\exp(-\beta H_{\Lambda,\xi,h}(\sigma))}{Z_{\Lambda,\xi,T,h}}$$

where $\beta = T^{-1}$, and the partition function $Z_{\Lambda,\xi,T,h}$ is
a normalization, chosen such that $\mu_{\Lambda,\xi,T,h}(\Omega_\Lambda) = 1$.
The equilibrium states are obtained by taking the
thermodynamic limit $\lim_{\Lambda\to\mathbb{Z}^2} \mu_{\Lambda,\xi,T,h}$. We will be
interested in the states

$$\mu_{\pm,T,h} = \lim_{\Lambda\to\mathbb{Z}^2} \mu_{\Lambda,\pm,T,h}$$

corresponding to $(\pm)$-boundary conditions. If $h \neq 0$,
then $\mu_{-,T,h} = \mu_{+,T,h}$, so it will be denoted simply by
$\mu_{T,h}$. If $h = 0$, the same is true if the temperature
is larger than or equal to a critical value $T = T_c$, and
is false for $T < T_c$, in which case one says that there is
phase coexistence. The measure $\mu_{+,T,0} = \mu_{+,T}$ is
called the $(+)$-phase, and $\mu_{-,T}$ the $(-)$-phase.

For an observable $f$ we will denote by $\langle f\rangle_{\mu}$ its
expected value in the state $\mu$, that is, the integral
$\int f\,d\mu$. In particular, the spontaneous magnetization
$m^*(T)$ equals by definition $\langle\sigma(0)\rangle_{+,T}$.

Next, we need to supply the Ising model with the 
time evolution. For this we will use the Glauber 
dynamics. It is a Markov process on Q, whose 
generator, L, acts on a generic local observable f as 


$$(Lf)(\sigma) = \sum_{x\in\mathbb{Z}^2} c(x,\sigma)\left(f(\sigma^x) - f(\sigma)\right)$$


where $\sigma^x$ is the configuration obtained from $\sigma$ by
flipping the spin at the site $x$ to the opposite value,
and $c(x,\sigma)$ is the rate of the flip of the spin at the site
$x$ when the system is in the state $\sigma$. In words, one
can say that the dynamics proceeds as follows: at
every site $x$ the spin $\sigma(x)$ is flipped randomly,
independently of all others, with the rate $c(x,\sigma)$,
where $\sigma$ is the current configuration. Common
examples are “Metropolis dynamics”:


$$c(x,\sigma) = \exp\left(-\beta\,(\Delta_x H_h(\sigma))^+\right)$$

or “heat bath dynamics”:

$$c(x,\sigma) = \left[1 + \exp\left(\beta\,\Delta_x H_h(\sigma)\right)\right]^{-1}$$


Here $(a)^+ = \max\{a, 0\}$, and $\Delta_x H_h(\sigma) = H_h(\sigma^x) - H_h(\sigma)$.
The spin flip system thus obtained will be
denoted by $(\sigma^{\xi}_{T,h;t})_{t\ge 0}$, where $\xi$ is the initial configuration
at time $t = 0$. If this initial configuration
is selected at random according to a probability
measure $\nu$, then the resulting process is denoted by
$(\sigma^{\nu}_{T,h;t})_{t\ge 0}$. It is known that the Gibbs measures are
invariant with respect to the stochastic Ising models.
Moreover,


= t 
OT bt 7 H-T hi OP py 7 U4,Tpbs as k= 60d 
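
The invariance of the Gibbs measure under both rate choices can be checked numerically on a tiny system. The sketch below is an illustration under stated assumptions, not the construction used in the text: it works on a 3 x 3 periodic lattice instead of the infinite one, it assumes the convention H = -(sum over nearest-neighbour bonds) - h * (sum of spins), and all function names are hypothetical. It verifies detailed balance, which implies invariance of the finite-volume Gibbs weights.

```python
import itertools
import math

L = 3  # side of a small periodic lattice (illustrative choice)


def energy(spins, h):
    """Assumed convention: H = -sum over n.n. bonds s(x)s(y) - h * sum_x s(x)."""
    total = 0.0
    for i in range(L):
        for j in range(L):
            s = spins[i][j]
            # Each bond is counted once via the right and lower neighbours.
            total -= s * spins[i][(j + 1) % L]
            total -= s * spins[(i + 1) % L][j]
            total -= h * s
    return total


def flipped(spins, i, j):
    """Return a copy of the configuration with the spin at (i, j) reversed."""
    new = [list(row) for row in spins]
    new[i][j] = -new[i][j]
    return new


def metropolis_rate(spins, i, j, beta, h):
    delta = energy(flipped(spins, i, j), h) - energy(spins, h)
    return math.exp(-beta * max(delta, 0.0))


def heat_bath_rate(spins, i, j, beta, h):
    delta = energy(flipped(spins, i, j), h) - energy(spins, h)
    return 1.0 / (1.0 + math.exp(beta * delta))


def max_detailed_balance_defect(rate, beta, h):
    """Largest relative violation of w(s) c(x, s) = w(s^x) c(x, s^x)."""
    worst = 0.0
    for bits in itertools.product((-1, 1), repeat=L * L):
        spins = [list(bits[r * L:(r + 1) * L]) for r in range(L)]
        w = math.exp(-beta * energy(spins, h))
        for i in range(L):
            for j in range(L):
                other = flipped(spins, i, j)
                w_other = math.exp(-beta * energy(other, h))
                lhs = w * rate(spins, i, j, beta, h)
                rhs = w_other * rate(other, i, j, beta, h)
                worst = max(worst, abs(lhs - rhs) / max(lhs, rhs))
    return worst


if __name__ == "__main__":
    for name, rate in (("Metropolis", metropolis_rate),
                       ("heat bath", heat_bath_rate)):
        print(name, max_detailed_balance_defect(rate, beta=0.6, h=0.1))
```

Both defects come out at the level of floating-point rounding, for any choice of `beta` and `h`; the check enumerates all 512 configurations, so it is exhaustive for this lattice size only.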


We will be interested in the case when $h$ is
positive, though small. Then there is only one
invariant state, $\mu_{+,T,h}$, so the state $\mu_{-,T,h}$ is equal
to $\mu_{+,T,h}$, and $\sigma^{-}_{T,h;t} \to \mu_{+,T,h}$ as $t \to \infty$. (One
should intuitively think about the state $\sigma^{-}_{T,h;t}$ for
$t$ small as the supercooled but liquid water,
thinking about the state $\mu_{+,T,h}$ as ice.) We
want to control the convergence of the temporal
state $\sigma^{-}_{T,h;t}$ to the equilibrium, $\mu_{+,T,h}$, and to see, if
possible, that during some (long) initial time the
state $\sigma^{-}_{T,h;t}$ looks very similar to the $(-)$-phase
$\mu_{-,T}$, while after some time threshold it changes
suddenly and looks quite similar to the state
$\mu_{+,T,h}$. It turns out that all the above features
can indeed be established rigorously.


If one simulates the above dynamics
on a computer, the picture observed is
the following: one sees droplets of
the $(+)$-phase being created in the midst of the $(-)$-phase;
they persist for a while and then
disappear. This goes on until
a big enough $(+)$-droplet is born; this one then
starts to grow and eventually fills up the whole
display.
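
A minimal simulation of this picture can be sketched as follows. It is a discrete-time surrogate for the continuous-time dynamics, with assumptions that are not fixed by the text: a finite torus, the convention H = -(sum over nearest-neighbour bonds) - h * (sum of spins), one random-order pass over all sites per sweep, and illustrative parameter values. The function names are hypothetical.

```python
import math
import random


def local_field(spins, i, j):
    """Sum of the four nearest-neighbour spins on an n x n torus."""
    n = len(spins)
    return (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
            + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])


def delta_energy(spins, i, j, h):
    """Energy change caused by flipping the spin at (i, j).

    Assumed convention: H = -sum_{bonds} s(x)s(y) - h * sum_x s(x).
    """
    return 2.0 * spins[i][j] * (local_field(spins, i, j) + h)


def metropolis_sweep(spins, beta, h, rng):
    """Visit every site once, in random order, applying the Metropolis rule."""
    n = len(spins)
    sites = [(i, j) for i in range(n) for j in range(n)]
    rng.shuffle(sites)
    for i, j in sites:
        d = delta_energy(spins, i, j, h)
        if d <= 0.0 or rng.random() < math.exp(-beta * d):
            spins[i][j] = -spins[i][j]


def magnetization(spins):
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)


def run(n=32, beta=0.6, h=0.3, sweeps=200, seed=0):
    """Start from the all-minus configuration and record the magnetization."""
    rng = random.Random(seed)
    spins = [[-1] * n for _ in range(n)]
    history = []
    for _ in range(sweeps):
        metropolis_sweep(spins, beta, h, rng)
        history.append(magnetization(spins))
    return history


if __name__ == "__main__":
    for sweep, m in enumerate(run(), start=1):
        if sweep % 20 == 0:
            print(sweep, round(m, 3))
```

With small positive `h` and `beta` above the critical value of this convention, the magnetization trace typically stays near a negative plateau for some time before crossing over to positive values; how long the plateau lasts depends strongly on `h`, `beta`, the lattice size, and the seed.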


The Life Span of Metastable States 
Let us define the “critical time exponent” $\lambda_c = \lambda_c(T)$ by


W 


e = Tipe (TT " 


where w,=w,, is the value of the surface energy 
of the Wulff curve of our 2D Ising model at the 
temperature T: 


w, = W,(25,) 


Suppose now that $T < T_c$, $h > 0$. Let $\nu$ be either the
$(-)$-phase $\mu_{-,T}$ or $\delta_{-}$. (In fact, any $\nu$ “between”
these two states would do.) Then the following
happens.


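1. If $0 < \lambda < \lambda_c$, then for each $n \in \{1, 2, \ldots\}$ and for
each local observable $f$,

$$\mathbb{E}\left(f\left(\sigma^{\nu}_{T,h;\,t=\exp\{\lambda/h\}}\right)\right) = \sum_{j=0}^{n-1} b_j(f)\,h^j + O(h^n) \qquad [3]$$

where

$$b_j(f) = \frac{1}{j!}\,\frac{d^j}{dh^j}\langle f\rangle_{T,h}\Big|_{h=0^-}$$

(We stress that in the last relation we are using
the Gibbs states corresponding to the negative
values of the magnetic field.) In particular,

$$\mathbb{E}\left(\sigma^{\nu}_{T,h;\,t=\exp\{\lambda/h\}}(0)\right) = -m^*(T) + O(h) \qquad [4]$$

2. If $\lambda > \lambda_c$, then for any finite positive $C$ there is a
finite positive $C_1$ such that for every local
observable $f$,

$$\left|\mathbb{E}\left(f\left(\sigma^{\nu}_{T,h;\,t=\exp\{\lambda/h\}}\right)\right) - \langle f\rangle_{T,h}\right| \le C_1\,\|f\|\exp\{-C/h\} \qquad [5]$$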
The relation [3] implies that the family of
nonequilibrium states $\langle\cdot\rangle^{\nu,\lambda}_{T,h}$, $h > 0$, defined for
every local observable $f$ by

$$\langle f\rangle^{\nu,\lambda}_{T,h} = \mathbb{E}\left(f\left(\sigma^{\nu}_{T,h;\,t=\exp\{\lambda/h\}}\right)\right)$$

is a $C^\infty$-continuation of the curve $\{\langle\cdot\rangle_{T,h},\ h < 0\}$ of
equilibrium states. This is true for every $0 < \lambda < \lambda_c$
and every $\nu$ as above. The states $\langle\cdot\rangle^{\nu,\lambda}_{T,h}$ are the
“metastable states” we are looking for. The relations
[3] and [4] should be interpreted in the sense that
before the time $\exp\{\lambda_c/h\}$ our temporal state is still
“liquid,” while [5] means that after the time
$\exp\{\lambda_c/h\}$ freezing happens. So one can think about
the quantity $\exp\{\lambda_c/h\}$ as being the life span of the
metastable state.

This theorem was obtained in Schonmann and 
Shlosman (1998). Let us explain the heuristics 
behind it. It has two ingredients. The first one is 
that the transition to the equilibrium is going via 
creation of droplets of the (+)-phase. The second 
one is that once such a droplet is created by a 
thermal fluctuation, with the size exceeding a certain 
critical value, it does not die out, but grows further, 
with a speed v of the order of h. (This second belief 
can be expected to be correct only in dimension 2.) 
Let us see how these two hypotheses can give us the 
right answer. To get to the equilibrium we have to 
overcome the energy barrier, by creating a large 
droplet of the (+)-phase. Subcritical droplets 
are constantly created by thermal fluctuations in the 
metastable phase, but they tend to shrink. On the 
other hand, once a supercritical droplet is created 
due to a larger fluctuation, it will grow and drive the 
system to the stable phase. Indeed, the energy $\Phi(m)$
of an $m$-shaped droplet of the $(+)$-phase in the sea of
$(-)$-phase equals $W_T(m) - 2m^*(T)\,h\,\mathrm{vol}(m)$. For
small $m$ the functional $\Phi(m)$ decreases as $m$ shrinks,
while for large $m$ the functional $\Phi(m)$ decreases as $m$
grows. Its saddle point $m_c$ is precisely the Wulff
shape. Since the minimal height of the barrier is
$\Phi(m_c)$, one predicts the rate of creation of a critical
droplet with center at a given place to be


Comparing with [2], we see that we miss the 
correct answer 


by a factor of 1/3. The reason for that is the
following. Note that we are concerned with an
infinite system, and we are observing it through a
local function $f$, which depends on the spins in a
finite set $\mathrm{supp}(f)$. For us, the system will have
relaxed to equilibrium once $\mathrm{supp}(f)$ is covered by
a big droplet of the $(+)$-phase, which appeared
spontaneously somewhere and then grew, as
discussed above. We want to estimate how long
we have to wait for the probability of such an
event to be close to 1. If we suppose that the
radius of the supercritical droplet grows with a
speed $v$, then we can see that the region in
spacetime, where a droplet which covers $\mathrm{supp}(f)$
at time $t$ could have appeared, is, roughly speaking,
a cone with vertex in $\mathrm{supp}(f)$ and which has
as base the set of points which have time
coordinate 0 and are at most at distance $tv$ from
$\mathrm{supp}(f)$. The volume of such a cone is of the order
of $(vt)^2 t$. The order of magnitude of the relaxation
time, $t_{rel}$, at which the region $\mathrm{supp}(f)$ starts to be

Introduction


Soap films, soap bubbles, and surface tension were 
extensively studied by the Belgian physicist and 
inventor (the inventor of the stroboscope) Joseph 
Plateau in the first half of the nineteenth century. At
least since his studies, it has been known that the
right mathematical model for a soap film is a minimal
surface: the soap film is in a state of minimum
energy when it covers the least possible amount
of area. Minimal surfaces and equations like the
minimal surface equation have served as mathemat- 
ical models for many physical problems. 

The field of minimal surfaces dates back to the 
publication in 1762 of Lagrange’s famous memoir 
“Essai d'une nouvelle méthode pour déterminer les
maxima et les minima des formules intégrales
indéfinies.” Euler had already, in a paper published
in 1744, discussed minimizing properties of the
surface now known as the catenoid, but he only
considered variations within a certain class of
surfaces. In the almost one-quarter of a millennium
that has passed since Lagrange's memoir, the subject of
minimal surfaces has remained a vibrant area of
research, and there are many reasons why. The study
of minimal surfaces was the birthplace of regularity
theory. It lies at the intersection of nonlinear elliptic
PDE, geometry, topology, and general relativity.


covered by a large droplet can now be obtained by
solving the equation

$$(v\,t_{\mathrm{rel}})^{2}\; t_{\mathrm{rel}}\, \exp\{-\Phi(m_{\mathrm{crit}})/T\} \sim 1$$

This gives us what we want:
See also: Dynamical Systems in Mathematical Physics:
An Illustration from Water Waves; Large Deviations in 
Equilibrium Statistical Mechanics; Wulff Droplets. 


Further Reading 


Schonmann RH and Shlosman S (1998) Wulff droplets and the
metastable relaxation of the kinetic Ising models.
Communications in Mathematical Physics 194: 389-462.


In what follows we give a quick tour through 
many of the classical results in the field of minimal 
submanifolds, starting at the definition. 

The field of minimal surfaces remains extremely 
active and has very recently seen major develop- 
ments that have solved many longstanding open 
problems and conjectures; for more on this, see the 
expanded version of this survey (Colding and 
Minicozzi II, 2005). See also the recent surveys 
(Meeks III and Perez 2004, Perez 2005), and the 
expository article (Colding and Minicozzi II 2003). 

Throughout this survey, we refer to Colding and 
Minicozzi II (1999) for references unless otherwise 
noted. 


Part 1. Classical and Almost 
Classical Results 


Let $\Sigma \subset \mathbb{R}^n$ be a smooth k-dimensional submanifold
(possibly with boundary) and $C_0^\infty(N\Sigma)$ the space of
all infinitely differentiable, compactly supported,
normal vector fields on $\Sigma$. Given $\Phi$ in $C_0^\infty(N\Sigma)$,
consider the one-parameter variation

$$\Sigma_{t,\Phi} = \{\, x + t\,\Phi(x) \mid x \in \Sigma \,\} \qquad [1]$$

The so-called first variation formula of volume is the
equation (integration is with respect to d(vol))

$$\frac{d}{dt}\Big|_{t=0} \mathrm{Vol}(\Sigma_{t,\Phi}) = \int_\Sigma \langle \Phi, H \rangle \qquad [2]$$

where H is the mean curvature (vector) of $\Sigma$. (When
$\Sigma$ is noncompact, then $\Sigma_{t,\Phi}$ in [2] is replaced by
$\Gamma_{t,\Phi}$, where $\Gamma$ is any compact set containing the
support of $\Phi$.) The submanifold $\Sigma$ is said to be a
“minimal” submanifold (or just minimal) if

$$\frac{d}{dt}\Big|_{t=0} \mathrm{Vol}(\Sigma_{t,\Phi}) = 0 \quad \text{for all } \Phi \in C_0^\infty(N\Sigma) \qquad [3]$$

or, equivalently by [2], if the mean curvature H is
identically zero. Thus, $\Sigma$ is minimal if and only if it
is a critical point for the volume functional. (Since a
critical point is not necessarily a minimum, the term
“minimal” is misleading, but it is time honored. The
equation for a critical point is also sometimes called
the Euler-Lagrange equation.)

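Suppose now, for simplicity, that $\Sigma$ is an oriented
hypersurface with unit normal $n_\Sigma$. We can then
write a normal vector field $\Phi \in C_0^\infty(N\Sigma)$ as $\Phi = \phi\, n_\Sigma$,
where the function $\phi$ is in the space $C_0^\infty(\Sigma)$ of infinitely
differentiable, compactly supported functions on $\Sigma$.

Using this, a computation shows that if $\Sigma$ is
minimal, then

$$\frac{d^2}{dt^2}\Big|_{t=0} \mathrm{Vol}(\Sigma_{t,\phi n_\Sigma}) = -\int_\Sigma \phi\, L_\Sigma \phi \qquad [4]$$

where

$$L_\Sigma \phi = \Delta_\Sigma \phi + |A|^2 \phi \qquad [5]$$

is the second variational (or Jacobi) operator. Here,
$\Delta_\Sigma$ is the Laplacian on $\Sigma$ and A is the second
fundamental form. So $|A|^2 = \kappa_1^2 + \kappa_2^2 + \cdots + \kappa_{n-1}^2$,
where $\kappa_1, \ldots, \kappa_{n-1}$ are the principal curvatures of
$\Sigma$ and $H = (\kappa_1 + \cdots + \kappa_{n-1})\, n_\Sigma$. A minimal
submanifold $\Sigma$ is said to be stable if

$$\frac{d^2}{dt^2}\Big|_{t=0} \mathrm{Vol}(\Sigma_{t,\Phi}) \ge 0 \quad \text{for all } \Phi \in C_0^\infty(N\Sigma) \qquad [6]$$

Integrating by parts in [4], we see that stability is
equivalent to the so-called stability inequality

$$\int_\Sigma |A|^2 \phi^2 \le \int_\Sigma |\nabla_\Sigma \phi|^2 \qquad [7]$$

More generally, the “Morse index” of a minimal
submanifold is defined to be the number of negative
eigenvalues of the operator L. Thus, a stable
submanifold has Morse index zero.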


The Gauss Map 


Let $\Sigma^2 \subset \mathbb{R}^3$ be a surface (not necessarily
minimal). The Gauss map is a continuous choice of a
unit normal $n: \Sigma \to S^2 \subset \mathbb{R}^3$. Observe that there
are two choices of such a map, $n$ and $-n$,
corresponding to a choice of orientation of $\Sigma$. If
$\Sigma$ is minimal, then the Gauss map is an (anti)conformal
map since the eigenvalues of the
Weingarten map are $\kappa_1$ and $\kappa_2 = -\kappa_1$. Moreover,
for a minimal surface

$$|A|^2 = \kappa_1^2 + \kappa_2^2 = -2\,\kappa_1 \kappa_2 = -2\,K_\Sigma \qquad [8]$$

where $K_\Sigma$ is the Gauss curvature. It follows that the
area of the Gauss map is a multiple of the total
curvature.


Minimal Graphs 


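Suppose that $u: \Omega \subset \mathbb{R}^2 \to \mathbb{R}$ is a $C^2$ function. The
graph of u

$$\mathrm{Graph}_u = \{ (x, y, u(x,y)) \mid (x,y) \in \Omega \} \qquad [9]$$

has area

$$\mathrm{Area}(\mathrm{Graph}_u) = \int_\Omega |(1, 0, u_x) \times (0, 1, u_y)| = \int_\Omega \sqrt{1 + u_x^2 + u_y^2} = \int_\Omega \sqrt{1 + |\nabla u|^2} \qquad [10]$$

and the (upward pointing) unit normal is

$$\frac{(1, 0, u_x) \times (0, 1, u_y)}{|(1, 0, u_x) \times (0, 1, u_y)|} = \frac{(-u_x, -u_y, 1)}{\sqrt{1 + |\nabla u|^2}} \qquad [11]$$

Therefore, for the graphs $\mathrm{Graph}_{u+t\eta}$, where $\eta|_{\partial\Omega} = 0$,
we get that

$$\mathrm{Area}(\mathrm{Graph}_{u+t\eta}) = \int_\Omega \sqrt{1 + |\nabla u + t\,\nabla \eta|^2} \qquad [12]$$

Hence

$$\frac{d}{dt}\Big|_{t=0} \mathrm{Area}(\mathrm{Graph}_{u+t\eta}) = \int_\Omega \frac{\langle \nabla u, \nabla \eta \rangle}{\sqrt{1 + |\nabla u|^2}} = -\int_\Omega \eta\, \operatorname{div}\!\left( \frac{\nabla u}{\sqrt{1 + |\nabla u|^2}} \right) \qquad [13]$$

It follows that the graph of u is a critical point for
the area functional if and only if u satisfies the
divergence form equation

$$\operatorname{div}\!\left( \frac{\nabla u}{\sqrt{1 + |\nabla u|^2}} \right) = 0 \qquad [14]$$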
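As a quick numerical illustration of [14], the sketch below evaluates the divergence of the flux $\nabla u / \sqrt{1+|\nabla u|^2}$ by central differences. The test function is Scherk's surface $u(x,y) = \log\cos y - \log\cos x$ on $|x|, |y| < \pi/2$, a classical solution that is not discussed in the text and is used here only as an example; the function names and the step size `h` are likewise illustrative choices. A paraboloid is included as a non-solution for contrast.

```python
import math

def scherk_gradient(x, y):
    """Gradient of u(x, y) = log(cos y) - log(cos x), for |x|, |y| < pi/2."""
    return math.tan(x), -math.tan(y)

def paraboloid_gradient(x, y):
    """Gradient of u(x, y) = x^2 + y^2 (not a minimal graph)."""
    return 2.0 * x, 2.0 * y

def flux(grad, x, y):
    """The vector field grad(u) / sqrt(1 + |grad(u)|^2) appearing in [14]."""
    ux, uy = grad(x, y)
    w = math.sqrt(1.0 + ux * ux + uy * uy)
    return ux / w, uy / w

def mse_residual(grad, x, y, h=1e-4):
    """Central-difference approximation of the divergence in [14] at (x, y)."""
    fx_plus, _ = flux(grad, x + h, y)
    fx_minus, _ = flux(grad, x - h, y)
    _, fy_plus = flux(grad, x, y + h)
    _, fy_minus = flux(grad, x, y - h)
    return (fx_plus - fx_minus) / (2.0 * h) + (fy_plus - fy_minus) / (2.0 * h)

scherk_residual = mse_residual(scherk_gradient, 0.3, -0.7)
paraboloid_residual = mse_residual(paraboloid_gradient, 0.3, -0.7)
```

The residual for Scherk's surface vanishes up to discretization error, while the paraboloid's does not, so its graph is not a critical point of area.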


Next we want to show that the graph of a
function on $\Omega$ satisfying the minimal surface
equation, that is, satisfying [14], is not just a critical
point for the area functional but is actually
area minimizing amongst surfaces in the cylinder
$\Omega \times \mathbb{R} \subset \mathbb{R}^3$. To show this, extend first the unit
normal n of the graph in [11] to a vector field, still
denoted by n, on the entire cylinder $\Omega \times \mathbb{R}$ (independent
of the vertical coordinate). Let $\omega$ be
the 2-form on $\Omega \times \mathbb{R}$ given for $X, Y \in \mathbb{R}^3$ by

$$\omega(X, Y) = \det(X, Y, n) \qquad [15]$$

An easy calculation shows that

$$d\omega = \left[ \frac{\partial}{\partial x}\!\left( \frac{-u_x}{\sqrt{1 + |\nabla u|^2}} \right) + \frac{\partial}{\partial y}\!\left( \frac{-u_y}{\sqrt{1 + |\nabla u|^2}} \right) \right] dx \wedge dy \wedge dz = 0 \qquad [16]$$

since u satisfies the minimal surface equation. In
sum, the form $\omega$ is closed and, given any X and Y at
a point (x, y, z),

$$|\omega(X, Y)| \le |X \times Y| \qquad [17]$$

where equality holds if and only if

$$X, Y \subset T_{(x, y, u(x,y))} \mathrm{Graph}_u \qquad [18]$$

Such a form $\omega$ is called a “calibration.” From this,
we have that if $\Sigma \subset \Omega \times \mathbb{R}$ is any other surface with
$\partial\Sigma = \partial\, \mathrm{Graph}_u$, then by Stokes' theorem, since $\omega$ is
closed,

$$\mathrm{Area}(\mathrm{Graph}_u) = \int_{\mathrm{Graph}_u} \omega = \int_\Sigma \omega \le \mathrm{Area}(\Sigma) \qquad [19]$$

This shows that $\mathrm{Graph}_u$ is area minimizing among
all surfaces in the cylinder and with the same
boundary. If the domain $\Omega$ is convex, the minimal
graph is absolutely area minimizing. To see this,
observe first that if $\Omega$ is convex, then so is $\Omega \times \mathbb{R}$ and
hence the nearest point projection $P: \mathbb{R}^3 \to \Omega \times \mathbb{R}$ is
a distance nonincreasing Lipschitz map that is equal
to the identity on $\Omega \times \mathbb{R}$. If $\Sigma \subset \mathbb{R}^3$ is any other
surface with $\partial\Sigma = \partial\, \mathrm{Graph}_u$, then $\Sigma' = P(\Sigma)$ has
$\mathrm{Area}(\Sigma') \le \mathrm{Area}(\Sigma)$. Applying [19] to $\Sigma'$, we see
that $\mathrm{Area}(\mathrm{Graph}_u) \le \mathrm{Area}(\Sigma')$ and the claim
follows.

If $\Omega \subset \mathbb{R}^2$ contains a ball of radius r, then, since
$\partial B_r \cap \mathrm{Graph}_u$ divides $\partial B_r$ into two components, at
least one of which has area at most
$(\mathrm{Area}(S^2)/2)\, r^2$, we get from [19] the crude estimate

$$\mathrm{Area}(B_r \cap \mathrm{Graph}_u) \le \frac{\mathrm{Area}(S^2)}{2}\, r^2 \qquad [20]$$

When the domain $\Omega$ is convex, it is not hard to see
that the minimal graph is absolutely area minimizing.

Very similar calculations to the ones above show
that if $\Omega \subset \mathbb{R}^{n-1}$ and $u: \Omega \to \mathbb{R}$ is a $C^2$ function, then
the graph of u is a critical point for the area
functional if and only if u satisfies [14]. Moreover,
as in [19], the graph of u is actually area
minimizing. Consequently, as in [20], if $\Omega$ contains
a ball of radius r, then

$$\mathrm{Vol}(B_r \cap \mathrm{Graph}_u) \le \frac{\mathrm{Vol}(S^{n-1})}{2}\, r^{n-1} \qquad [21]$$


The Maximum Principle 


The first variation formula, [2], showed that a smooth
submanifold is a critical point for area if and only if
the mean curvature vanishes. We will next derive the
weak form of the first variation formula, which is the
basic tool for working with “weak solutions” (typically,
stationary varifolds). Let X be a vector field on
$\mathbb{R}^n$. We can write the divergence $\operatorname{div}_\Sigma X$ of X on $\Sigma$ as

$$\operatorname{div}_\Sigma X = \operatorname{div}_\Sigma X^T + \operatorname{div}_\Sigma X^N = \operatorname{div}_\Sigma X^T + \langle X, H \rangle \qquad [22]$$

where $X^T$ and $X^N$ are the tangential and normal
projections of X. In particular, we get that, for a
minimal submanifold,

$$\operatorname{div}_\Sigma X = \operatorname{div}_\Sigma X^T \qquad [23]$$

Moreover, from [22] and Stokes' theorem, we see that
$\Sigma$ is minimal if and only if for all vector fields X with
compact support and vanishing on the boundary of $\Sigma$,

$$\int_\Sigma \operatorname{div}_\Sigma X = 0 \qquad [24]$$

The key point is that [24] makes sense as long as we
can define the divergence on $\Sigma$. As a consequence of
[24], we will show the following proposition:

Proposition 1 $\Sigma^k \subset \mathbb{R}^n$ is minimal if and only if the
restrictions of the coordinate functions of $\mathbb{R}^n$ to $\Sigma$
are harmonic functions.

Proof Let $\eta$ be a smooth function on $\Sigma$ with
compact support and $\eta|_{\partial\Sigma} = 0$; then

$$\int_\Sigma \langle \nabla_\Sigma \eta, \nabla_\Sigma x_i \rangle = \int_\Sigma \langle \nabla_\Sigma \eta, e_i \rangle = \int_\Sigma \operatorname{div}_\Sigma (\eta\, e_i) \qquad [25]$$

From this, the claim follows easily. □


Recall that if $\Xi \subset \mathbb{R}^n$ is a compact subset, then the
smallest convex set containing $\Xi$ (the convex hull,
$\mathrm{Conv}(\Xi)$) is the intersection of all half-spaces
containing $\Xi$. The maximum principle forces a
compact minimal submanifold to lie in the convex
hull of its boundary (this is the “convex hull
property”):

Proposition 2 If $\Sigma^k \subset \mathbb{R}^n$ is a compact minimal
submanifold, then $\Sigma \subset \mathrm{Conv}(\partial\Sigma)$.

Proof A half-space $H \subset \mathbb{R}^n$ can be written as

$$H = \{ x \in \mathbb{R}^n \mid \langle x, e \rangle \le a \} \qquad [26]$$

for a vector $e \in S^{n-1}$ and constant $a \in \mathbb{R}$. By
Proposition 1, the function $u(x) = \langle e, x \rangle$ is harmonic
on $\Sigma$ and hence attains its maximum on $\partial\Sigma$ by the
maximum principle. □


Another application of [23], with a different
choice of vector field X, gives that for a
k-dimensional minimal submanifold $\Sigma$

$$\Delta_\Sigma |x - x_0|^2 = 2 \operatorname{div}_\Sigma (x - x_0) = 2k \qquad [27]$$


Later, we will see that this formula plays a crucial 
role in the monotonicity formula for minimal 
submanifolds. 
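The identity [27] is a short computation once a frame is fixed. The following sketch spells it out; the orthonormal frame $e_1, \ldots, e_k$ of $T_x\Sigma$ is introduced here only for illustration and does not appear in the original text.

```latex
\begin{align*}
\nabla_\Sigma |x - x_0|^2 &= 2\,(x - x_0)^T, \\
\Delta_\Sigma |x - x_0|^2 &= 2 \operatorname{div}_\Sigma (x - x_0)^T
   = 2 \operatorname{div}_\Sigma (x - x_0)
   \quad \text{(by [23], since $\Sigma$ is minimal)}, \\
\operatorname{div}_\Sigma (x - x_0)
   &= \sum_{i=1}^{k} \langle \nabla_{e_i} (x - x_0),\, e_i \rangle
   = \sum_{i=1}^{k} \langle e_i, e_i \rangle = k .
\end{align*}
```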

The argument in the proof of the convex hull 
property can be rephrased as saying that as we 
translate a hyperplane towards a minimal surface, 
the first point of contact must be on the boundary. 
When X is a hypersurface, this is a special case of 
the strong maximum principle for minimal surfaces: 


Lemma 1 Let $\Omega \subset \mathbb{R}^{n-1}$ be an open connected
neighborhood of the origin. If $u_1, u_2: \Omega \to \mathbb{R}$ are
solutions of the minimal surface equation with $u_1 \le u_2$
and $u_1(0) = u_2(0)$, then $u_1 \equiv u_2$.


Since any smooth hypersurface is locally a graph 
over a hyperplane, Lemma 1 gives a maximum 
principle for smooth minimal hypersurfaces. 

Thus far, the examples of minimal submanifolds 
have all been smooth. The simplest nonsmooth 
example is given by a pair of planes intersecting 
transversely along a line. To get an example that is
not even immersed, one can take three half-planes
meeting along a line with an angle of $2\pi/3$ between
each adjacent pair.


Monotonicity and the Mean-Value 
Inequality 


Monotonicity formulas and mean-value inequalities 
play a fundamental role in many areas of geometric 
analysis. 


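Proposition 3 Suppose that $\Sigma^k \subset \mathbb{R}^n$ is a minimal
submanifold and $x_0 \in \mathbb{R}^n$; then for all $0 < s < t$,

$$t^{-k}\, \mathrm{Vol}(B_t(x_0) \cap \Sigma) - s^{-k}\, \mathrm{Vol}(B_s(x_0) \cap \Sigma) = \int_{(B_t(x_0) \setminus B_s(x_0)) \cap \Sigma} \frac{|(x - x_0)^N|^2}{|x - x_0|^{k+2}} \qquad [28]$$

Notice that $(x - x_0)^N$ vanishes precisely when $\Sigma$ is
conical about $x_0$, that is, when $\Sigma$ is invariant under
dilations about $x_0$. As a corollary, we get the
following:

Corollary 1 Suppose that $\Sigma^k \subset \mathbb{R}^n$ is a minimal
submanifold and $x_0 \in \mathbb{R}^n$; then the function

$$\Theta_{x_0}(s) = \frac{\mathrm{Vol}(B_s(x_0) \cap \Sigma)}{\mathrm{Vol}(B_s \subset \mathbb{R}^k)} \qquad [29]$$

is a nondecreasing function of s. Moreover,
$\Theta_{x_0}(s)$ is constant in s if and only if $\Sigma$ is conical
about $x_0$.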
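Corollary 1 can be checked by hand on the catenoid, parametrized as in [49] by $(\cosh s \cos t, \cosh s \sin t, s)$, with $x_0$ the origin (which does not lie on the surface). The sketch below relies on two facts computed for this illustration rather than taken from the text: the area element is $\cosh^2 s \, ds\, dt$, and the ball $B_R(0)$ with $R^2 = \cosh^2 a + a^2$ cuts out exactly the heights $|s| \le a$. The function name and the sampling grid are illustrative choices.

```python
import math

def catenoid_density_ratio(a):
    """Return (R, Theta) for the catenoid and the ball B_R(0), where
    R^2 = cosh(a)^2 + a^2, so the ball contains the heights |s| <= a."""
    radius_sq = math.cosh(a) ** 2 + a ** 2
    # Area = 2*pi * integral_{-a}^{a} cosh(s)^2 ds = 2*pi * (a + sinh(a)*cosh(a)).
    area = 2.0 * math.pi * (a + math.sinh(a) * math.cosh(a))
    # Normalize by the area pi * R^2 of a flat disk of radius R, as in [29] with k = 2.
    theta = area / (math.pi * radius_sq)
    return math.sqrt(radius_sq), theta

# Sample the density ratio for a = 0, 0.05, ..., 6 (R increases with a).
ratios = [catenoid_density_ratio(0.05 * j)[1] for j in range(121)]
is_nondecreasing = all(x <= y for x, y in zip(ratios, ratios[1:]))
```

On this grid the ratio starts at 0, increases with R, and stays below 2, the value for a pair of parallel planes, which the catenoid resembles at large scales.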


Of course, if $x_0$ is a smooth point of $\Sigma$, then
$\lim_{s \to 0} \Theta_{x_0}(s) = 1$. We will later see that the converse
is also true; this will be a consequence of the Allard
regularity theorem.

The monotonicity of area is a very useful tool in 
the regularity theory for minimal surfaces — at least 
when there is some a priori area bound. For 
instance, this monotonicity and a compactness 
argument allow one to reduce many regularity 
questions to questions about minimal cones (this 
was a key observation of W Fleming in his work on 
the Bernstein problem; see the section “The 
theorems of Bernstein and Bers"). 

Arguing as in Proposition 3, we get a weighted 
monotonicity: 


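Proposition 4 If $\Sigma^k \subset \mathbb{R}^n$ is a minimal submanifold,
$x_0 \in \mathbb{R}^n$, and f is a function on $\Sigma$, then

$$t^{-k} \int_{B_t(x_0) \cap \Sigma} f - s^{-k} \int_{B_s(x_0) \cap \Sigma} f = \int_{(B_t(x_0) \setminus B_s(x_0)) \cap \Sigma} f\, \frac{|(x - x_0)^N|^2}{|x - x_0|^{k+2}} + \frac{1}{2} \int_s^t \tau^{-k-1} \int_{B_\tau(x_0) \cap \Sigma} (\tau^2 - |x - x_0|^2)\, \Delta_\Sigma f \, d\tau \qquad [30]$$

We get immediately the following mean-value
inequality for the special case of non-negative
subharmonic functions:

Corollary 2 Suppose that $\Sigma^k \subset \mathbb{R}^n$ is a minimal
submanifold, $x_0 \in \mathbb{R}^n$, and f is a non-negative
subharmonic function on $\Sigma$; then

$$s^{-k} \int_{B_s(x_0) \cap \Sigma} f \qquad [31]$$

is a nondecreasing function of s. In particular, if
$x_0 \in \Sigma$, then for all $s > 0$,

$$f(x_0) \le \frac{\int_{B_s(x_0) \cap \Sigma} f}{\mathrm{Vol}(B_s \subset \mathbb{R}^k)} \qquad [32]$$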
Rado's Theorem

One of the most basic questions is what the
boundary $\partial\Sigma$ tells us about a compact minimal
submanifold $\Sigma$. We have already seen that $\Sigma$ must
lie in the convex hull of $\partial\Sigma$, but there are many
other theorems of this nature. One of the first
theorems is a beautiful result of Rado which says
that if $\partial\Sigma$ is a graph over the boundary of a convex
set in $\mathbb{R}^2$, then $\Sigma$ is also a graph (and hence
embedded). The proof of this uses basic properties
of nodal lines for harmonic functions.

Theorem 1 Suppose that $\Omega \subset \mathbb{R}^2$ is a convex subset
and $\sigma \subset \mathbb{R}^3$ is a simple closed curve which is
graphical over $\partial\Omega$. Then any minimal disk $\Sigma \subset \mathbb{R}^3$
with $\partial\Sigma = \sigma$ must be graphical over $\Omega$ and hence
unique by the maximum principle.


Proof (Sketch). The proof is by contradiction, so
suppose that $\Sigma$ is such a minimal disk and $x \in \Sigma$ is a
point where the tangent plane to $\Sigma$ is vertical.
Consequently, there exists $(a, b) \ne (0, 0)$ such that

$$\nabla_\Sigma (a x_1 + b x_2)(x) = 0 \qquad [33]$$

By Proposition 1, $a x_1 + b x_2$ is harmonic on $\Sigma$ (since
it is a linear combination of coordinate functions).
The local structure of nodal sets of harmonic
functions (see, e.g., Colding and Minicozzi II
(1999)) then gives that the level set

$$\{ y \in \Sigma \mid a x_1(y) + b x_2(y) = a x_1(x) + b x_2(x) \} \qquad [34]$$

has a singularity at x where at least four different
curves meet. If two of these nodal curves were to
meet again, then there would be a closed nodal
curve which must bound a disk (since $\Sigma$ is a disk).
By the maximum principle, $a x_1 + b x_2$ would have
to be constant on this disk and hence constant on $\Sigma$
by unique continuation. This would imply that
$\sigma = \partial\Sigma$ is contained in the plane given by [34].
Since this is impossible, we conclude that all of
these curves go to the boundary without intersecting
again.

In other words, the plane in $\mathbb{R}^3$ given by [34]
intersects $\sigma$ in at least four points. However, since
$\Omega \subset \mathbb{R}^2$ is convex, $\partial\Omega$ intersects the line given by
[34] in exactly two points. Finally, since $\sigma$ is
graphical over $\partial\Omega$, $\sigma$ intersects the plane in $\mathbb{R}^3$
given by [34] in exactly two points, which gives
the desired contradiction. □


The Theorems of Bernstein and Bers 


A classical theorem of S Bernstein from 1916 says
that entire (i.e., defined over all of $\mathbb{R}^2$) minimal
graphs are planes. This remarkable theorem of
Bernstein was one of the first illustrations of the
fact that the solutions to a nonlinear PDE, like the
minimal surface equation, can behave quite differently
from solutions to a linear equation.

Theorem 2 If $u: \mathbb{R}^2 \to \mathbb{R}$ is an entire solution to the
minimal surface equation, then u is an affine
function.

Proof (Sketch). We will show that the curvature of
the graph vanishes identically; this implies that the
unit normal is constant and, hence, the graph must
be a plane. The proof follows by combining two
facts. First, the area estimate for graphs [20] gives

$$\mathrm{Area}(B_r \cap \mathrm{Graph}_u) \le 2\pi r^2 \qquad [35]$$

This quadratic area growth allows one to construct
a sequence of non-negative logarithmic cutoff functions
$\phi_j$ defined on the graph with $\phi_j \to 1$ everywhere
and

$$\lim_{j \to \infty} \int_{\mathrm{Graph}_u} |\nabla \phi_j|^2 = 0 \qquad [36]$$

Moreover, since graphs are area minimizing, they
must be stable. We can therefore use $\phi_j$ in the
stability inequality [7] to get

$$\int_{\mathrm{Graph}_u} \phi_j^2\, |A|^2 \le \int_{\mathrm{Graph}_u} |\nabla \phi_j|^2 \qquad [37]$$

Combining these gives that $|A|^2$ is zero, as
desired. □
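The logarithmic cutoff used in this sketch can be written down explicitly. One standard choice is shown below; the parameter $R > 1$, the assumption that $\log R$ is an integer, and the use of the Euclidean distance $r = |x|$ to a point of the graph (placed at the origin) are conventions adopted for this illustration and are not taken from the text.

```latex
% Cutoff on Graph_u, with r = |x| restricted to the graph, so |\nabla_\Sigma r| \le 1.
\[
\phi_R(x) =
\begin{cases}
1 & r \le R, \\[2pt]
2 - \dfrac{\log r}{\log R} & R < r \le R^2, \\[6pt]
0 & r > R^2,
\end{cases}
\qquad
|\nabla_\Sigma \phi_R| \le \frac{1}{r \log R}
\quad \text{on } B_{R^2} \setminus B_R .
\]

% Split the annulus into shells e^{l} < r \le e^{l+1} and use [35] on each shell.
\[
\int_{\mathrm{Graph}_u} |\nabla_\Sigma \phi_R|^2
\le \frac{1}{(\log R)^2} \sum_{l = \log R}^{2 \log R - 1}
    e^{-2l}\, \mathrm{Area}\bigl( B_{e^{l+1}} \cap \mathrm{Graph}_u \bigr)
\le \frac{1}{(\log R)^2} \sum_{l = \log R}^{2 \log R - 1} 2\pi e^{2}
= \frac{2\pi e^{2}}{\log R} \longrightarrow 0
\quad (R \to \infty).
\]
```

Since $\phi_R \to 1$ pointwise as $R \to \infty$, inserting $\phi_R$ into [37] forces $\int_{\mathrm{Graph}_u} |A|^2 = 0$.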


Rather surprisingly, this result very much
depended on the dimension. The combined efforts
of E De Giorgi, F J Almgren Jr., and J Simons finally
gave:

Theorem 3 If $u: \mathbb{R}^{n-1} \to \mathbb{R}$ is an entire solution to
the minimal surface equation and $n \le 8$, then u is an
affine function.

However, in 1969, E Bombieri, De Giorgi, and
E Giusti constructed entire nonaffine solutions to
the minimal surface equation on $\mathbb{R}^8$ and an
area-minimizing singular cone in $\mathbb{R}^8$. In fact, they showed
that for $m \ge 4$, the cones

$$C_m = \{ (x_1, \ldots, x_{2m}) \mid x_1^2 + \cdots + x_m^2 = x_{m+1}^2 + \cdots + x_{2m}^2 \} \subset \mathbb{R}^{2m} \qquad [38]$$

are area minimizing (and obviously singular at the
origin).

In contrast to the entire case, exterior solutions
of the minimal graph equation, that is, solutions
on $\mathbb{R}^2 \setminus B_1$, are much more plentiful. In this case, L
Bers proved that $\nabla u$ actually has an asymptotic
limit:

Theorem 4 If u is a $C^2$ solution to the minimal
surface equation on $\mathbb{R}^2 \setminus B_1$, then $\nabla u$ has a limit at
infinity (i.e., there is an asymptotic tangent plane).

Bers' theorem was extended to higher dimensions
by L Simon:

Theorem 5 If u is a $C^2$ solution to the minimal
surface equation on $\mathbb{R}^n \setminus B_1$, then either

(i) $|\nabla u|$ is bounded and $\nabla u$ has a limit at infinity or
(ii) all tangent cones at infinity are of the form $\Sigma \times \mathbb{R}$
where $\Sigma$ is singular.


Bernstein’s theorem has had many other interest- 
ing generalizations, some of which will be discussed 
later. 


Simons Inequality 


In this section, we recall a very useful differential
inequality for the Laplacian of the norm squared of
the second fundamental form of a minimal hypersurface
$\Sigma$ in $\mathbb{R}^n$ and illustrate its role in a priori
estimates. This inequality, originally due to J
Simons, is:

Lemma 2 If $\Sigma^{n-1} \subset \mathbb{R}^n$ is a minimal hypersurface,
then

$$\Delta_\Sigma |A|^2 = -2\,|A|^4 + 2\,|\nabla_\Sigma A|^2 \ge -2\,|A|^4 \qquad [39]$$

An inequality of the type [39] on its own does not
lead to pointwise bounds on $|A|^2$ because of the
nonlinearity. However, it does lead to estimates if a
“scale-invariant energy” is small. For example,
H Choi and Schoen used [39] to prove:

Theorem 6 There exists $\epsilon > 0$ so that if
$0 \in \Sigma \subset B_r(0)$ with $\partial\Sigma \subset \partial B_r(0)$ is a minimal surface with

$$\int |A|^2 \le \epsilon \qquad [40]$$

then

$$|A|^2(0) \le r^{-2} \qquad [41]$$


Heinz’s Curvature Estimate for Graphs 


One of the key themes in minimal surface theory is 
the usefulness of a priori estimates. A basic example 
is the curvature estimate of E Heinz for graphs. 
Heinz's estimate gives an effective version of
Bernstein's theorem; namely, letting the radius $r_0$ go
to infinity in [42] implies that |A| vanishes, thus
giving Bernstein's theorem.

Theorem 7 If $D_{r_0} \subset \mathbb{R}^2$ and $u: D_{r_0} \to \mathbb{R}$ satisfies
the minimal surface equation, then for $\Sigma = \mathrm{Graph}_u$
and $0 < \sigma \le r_0$

$$\sigma^2 \sup_{D_{r_0 - \sigma}} |A|^2 \le C \qquad [42]$$

Proof (Sketch). Observe first that it suffices to
prove the estimate for $\sigma = r_0$, that is, to show that

$$|A|^2(0, u(0)) \le C\, r_0^{-2} \qquad [43]$$

Recall that minimal graphs are automatically stable.
As in the proof of Theorem 2, the area estimate for
graphs [20] allows us to use a logarithmic cutoff
function in the stability inequality [7] to get that

$$\int_{B_{r_1} \cap \mathrm{Graph}_u} |A|^2 \le \frac{C}{\log(r_0 / r_1)} \qquad [44]$$

Taking $r_0 / r_1$ sufficiently large, we can then apply
Theorem 6 to get [43]. □


Embedded Minimal Disks 
with Area Bounds 


In the early 1980s, Schoen and Simon extended the 
theorem of Bernstein to complete simply connected 
embedded minimal surfaces in R? with quadratic 
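area growth. A surface $\Sigma$ is said to have quadratic
area growth if for all $r > 0$, the area of the intersection
of the surface with the ball in $\mathbb{R}^3$ of radius r and center at
the origin is bounded by $C r^2$ for a fixed constant C
independent of r.

Theorem 8 Let $0 \in \Sigma^2 \subset B_{r_0} = B_{r_0}(0) \subset \mathbb{R}^3$ be an
embedded simply connected minimal surface with
$\partial\Sigma \subset \partial B_{r_0}$. If $\mu > 0$ and either

$$\mathrm{Area}(\Sigma) \le \mu\, r_0^2 \quad \text{or} \quad \int_\Sigma |A|^2 \le \mu \qquad [45]$$

then for the connected component $\Sigma'$ of $B_{r_0/2}(0) \cap \Sigma$
with $0 \in \Sigma'$ we have

$$\sup_{\Sigma'} |A|^2 \le C\, r_0^{-2} \qquad [46]$$

for some $C = C(\mu)$.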


The result of Schoen-Simon was generalized by 
Colding-Minicozzi to quadratic area growth for 
intrinsic balls (this generalization played an impor- 
tant role in analyzing the local structure of 
embedded minimal surfaces): 


Theorem 9 Given a constant $C_I$, there exists $C_P$ so
that if $\mathcal{B}_{2s} \subset \Sigma \subset \mathbb{R}^3$ is an embedded minimal disk
satisfying either

$$\mathrm{Area}(\mathcal{B}_{2s}) \le C_I\, s^2 \quad \text{or} \quad \int_{\mathcal{B}_{2s}} |A|^2 \le C_I \qquad [47]$$

then

$$\sup_{\mathcal{B}_s} |A|^2 \le C_P\, s^{-2} \qquad [48]$$

As an immediate consequence, letting $s \to \infty$
gives Bernstein-type theorems for embedded simply
connected minimal surfaces with either bounded
density or finite total curvature. Note that Enneper's
surface is simply connected but neither flat nor
embedded; this shows that embeddedness is essential
for these estimates. Similarly, the catenoid shows
that the surface being simply connected is essential.
The catenoid is the minimal surface in $\mathbb{R}^3$ given by

$$\{ (\cosh s \cos t, \cosh s \sin t, s) \mid s, t \in \mathbb{R} \} \qquad [49]$$


Stable Minimal Surfaces 


It turns out that stable minimal surfaces have a 
priori estimates. Since minimal graphs are stable, the 
estimates for stable surfaces can be thought of as 
generalizations of the earlier estimates for graphs. 
These estimates have been widely applied and are 
particularly useful when combined with existence 
results for stable surfaces (such as the solution of the 
Plateau problem). The starting point for these 
estimates is that, as we saw in [4], stable minimal 
surfaces satisfy the stability inequality 


$$\int |A|^2 \phi^2 \le \int |\nabla \phi|^2 \qquad [50]$$


We will mention two such estimates. The first is 
R Schoen's curvature estimate for stable surfaces: 


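Theorem 10 There exists a constant C so that if
$\Sigma \subset \mathbb{R}^3$ is an immersed stable minimal surface with
trivial normal bundle and $\mathcal{B}_{r_0} \subset \Sigma \setminus \partial\Sigma$, then

$$\sup_{\mathcal{B}_{r_0 - \sigma}} |A|^2 \le C\, \sigma^{-2} \qquad [51]$$

The second is an estimate for the area and total
curvature of a stable surface, due to Colding-Minicozzi;
for simplicity, we will state only the area
estimate:

Theorem 11 If $\Sigma^2 \subset \mathbb{R}^3$ is an immersed stable
minimal surface with trivial normal bundle and
$\mathcal{B}_{r_0} \subset \Sigma \setminus \partial\Sigma$, then

$$\mathrm{Area}(\mathcal{B}_{r_0}) \le 4\pi r_0^2 / 3 \qquad [52]$$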


As mentioned, we can use [52] to bound the
energy of a cutoff function in the stability inequality
and, thus, bound the total curvature of sub-balls.
Combining this with the curvature estimate of
Theorem 6 gives Theorem 10. Note that the bound
[52] is surprisingly sharp; even when $\Sigma$ is a plane,
the area is $\pi r_0^2$.


Regularity Theory 


In this section, we survey some of the key ideas in 
classical regularity theory, such as the role of 
monotonicity, scaling, $\epsilon$-regularity theorems (such
as Allard’s theorem) and tangent cone analysis (such 
as Almgren’s refinement of Federer’s dimension 
reducing). We refer to the book by Morgan (1995) 
for a more detailed overview and a general 
introduction to geometric measure theory. 

The starting point for all of this is the monotonicity
of volume for a minimal k-dimensional
submanifold $\Sigma$. Namely, Corollary 1 gives that the
density

$$\Theta_{x_0}(s) = \frac{\mathrm{Vol}(B_s(x_0) \cap \Sigma)}{\mathrm{Vol}(B_s \subset \mathbb{R}^k)} \qquad [53]$$

is a monotone nondecreasing function of s. Consequently,
we can define the density $\Theta_{x_0}$ at the point
$x_0$ to be the limit as $s \to 0$ of $\Theta_{x_0}(s)$. It also follows
easily from monotonicity that the density is
semicontinuous as a function of $x_0$.


$\epsilon$-Regularity and the Singular Set

An $\epsilon$-regularity theorem is a theorem giving that a
weak (or generalized) solution is actually smooth at
a point if a scale-invariant energy is small enough
there. The standard example is the Allard regularity
theorem:

Theorem 12 There exists $\delta(k, n) > 0$ such that if
$\Sigma \subset \mathbb{R}^n$ is a k-rectifiable stationary varifold (with
density at least one a.e.), $x_0 \in \Sigma$, and

$$\Theta_{x_0} = \lim_{r \to 0} \frac{\mathrm{Vol}(B_r(x_0) \cap \Sigma)}{\mathrm{Vol}(B_r \subset \mathbb{R}^k)} < 1 + \delta \qquad [54]$$

then $\Sigma$ is smooth in a neighborhood of $x_0$.

Similarly, the small total curvature estimate of
Theorem 6 may be thought of as an $\epsilon$-regularity
theorem; in this case, the scale-invariant energy is
$\int |A|^2$.

As an application of the $\epsilon$-regularity theorem,
Theorem 12, we can define the singular set S of $\Sigma$ by

$$S = \{ x \in \Sigma \mid \Theta_x \ge 1 + \delta \} \qquad [55]$$

It follows immediately from the semicontinuity of
the density that S is closed. In order to bound the
size of the singular set (e.g., the Hausdorff measure),
one combines the $\epsilon$-regularity with simple covering
arguments.


This preliminary analysis of the singular set can 
be refined by doing a so-called tangent cone 
analysis. 


Tangent Cone Analysis 


It is not hard to see that scaling preserves the space
of minimal submanifolds of $\mathbb{R}^n$. Namely, if $\Sigma$ is
minimal, then so is

$$\Sigma_{y,\lambda} = \{ y + \lambda^{-1} (x - y) \mid x \in \Sigma \} \qquad [56]$$

(To see this, simply note that this scaling multiplies
the principal curvatures by $\lambda$.) Suppose now
that we fix the point y and take a sequence $\lambda_j \to 0$.
The monotonicity formula bounds the density of
the rescaled solution, allowing us to extract a
convergent subsequence and limit. This limit,
which is called a “tangent cone” at y, achieves
equality in the monotonicity formula and, hence,
must be homogeneous (i.e., invariant under dilations
about y).

The usefulness of tangent cone analysis in
regularity theory is based on two key facts. For
simplicity, we illustrate these when $\Sigma \subset \mathbb{R}^n$ is an
area-minimizing hypersurface. First, if any tangent
cone at y is a hyperplane $\mathbb{R}^{n-1}$, then $\Sigma$ is smooth in a
neighborhood of y. This follows easily from the
Allard regularity theorem since the density at y of
the tangent cone is the same as the density at y of $\Sigma$.
The second key fact, known as “dimension reducing,”
is due to Almgren and is a refinement of an
argument of Federer. To state this, we first stratify
the singular set S of $\Sigma$ into subsets

$$S_0 \subset S_1 \subset \cdots \subset S_{n-2} \qquad [57]$$

where we define $S_i$ to be the set of points $y \in S$ so
that any linear space contained in any tangent cone
at y has dimension at most i. (Note that $S_{n-1} = \emptyset$ by
Allard's theorem.) The dimension reducing argument
then gives that

$$\dim(S_i) \le i \qquad [58]$$

where dimension means the Hausdorff dimension.
In particular, the solution of the Bernstein problem
then gives codimension-7 regularity of $\Sigma$, that is,
$\dim(S) \le n - 8$.


Part 2. Constructing Minimal Surfaces 


Thus far, we have mainly dealt with regularity and 
a priori estimates but have ignored questions of 
existence. In this part, we survey some of the most 
useful existence results for minimal surfaces. The 
following section gives an overview of the classical
Plateau problem. Next, we recall the classical 
Weierstrass representation, including a few modern 
applications, and the Kapouleas desingularization 
method. Then we deal with producing area-mini- 
mizing surfaces and questions of embeddedness. 
Finally, we recall the min-max construction for 
producing unstable minimal surfaces and, in parti- 
cular, doing so while controlling the topology and 
guaranteeing embeddedness. 


The Plateau Problem 


The following fundamental existence problem for
minimal surfaces is known as the Plateau problem:
given a closed curve $\Gamma$, find a minimal surface with
boundary $\Gamma$. There are various solutions to this
problem depending on the exact definition of a
surface (parametrized disk, integral current, $\mathbb{Z}_2$
current, or rectifiable varifold). We shall consider
the version of the Plateau problem for parametrized
disks; this was solved independently by J Douglas
and T Rado. The generalization to Riemannian
manifolds is due to C B Morrey.

Theorem 13 Let $\Gamma \subset \mathbb{R}^3$ be a piecewise $C^1$ closed
Jordan curve. Then there exists a piecewise $C^1$ map
u from $D \subset \mathbb{R}^2$ to $\mathbb{R}^3$ with $u(\partial D) \subset \Gamma$ such that the
image minimizes area among all disks with boundary
$\Gamma$.

The solution u to the Plateau problem above can
easily be seen to be a branched conformal immersion.
R Osserman proved that u does not have true
interior branch points; subsequently, R Gulliver and
W Alt showed that u cannot have false branch
points either.

Furthermore, the solution u is as smooth as the
boundary curve, even up to the boundary. A very
general version of this boundary regularity was
proved by S Hildebrandt; for the case of surfaces
in $\mathbb{R}^3$, recall the following result of J C C Nitsche:

Theorem 14 If $\Gamma$ is a regular Jordan curve of class
$C^{k,\alpha}$ where $k \ge 1$ and $0 < \alpha < 1$, then a solution u
of the Plateau problem is $C^{k,\alpha}$ on all of $\bar{D}$.


The Weierstrass Representation 


The classical Weierstrass representation (see Osserman
(1986)) takes holomorphic data (a Riemann surface, a
meromorphic function, and a holomorphic 1-form)
and associates a minimal surface in $\mathbb{R}^3$. To be precise,
given a Riemann surface $\Omega$, a meromorphic function g
on $\Omega$, and a holomorphic 1-form $\phi$ on $\Omega$, then we
get a (branched) conformal minimal immersion
$F: \Omega \to \mathbb{R}^3$ by

$$F(z) = \mathrm{Re} \int_{\zeta \in \gamma_{z_0, z}} \left( \tfrac{1}{2} \bigl( g^{-1}(\zeta) - g(\zeta) \bigr),\; \tfrac{i}{2} \bigl( g^{-1}(\zeta) + g(\zeta) \bigr),\; 1 \right) \phi(\zeta) \qquad [59]$$

Here, $z_0 \in \Omega$ is a fixed base point and the integration
is along a path $\gamma_{z_0, z}$ from $z_0$ to z. The choice of
$z_0$ changes F by adding a constant. In general, the
map F may depend on the choice of path (and hence
may not be well defined); this is known as “the
period problem.” However, when g has no zeros or
poles and $\Omega$ is simply connected, then F(z) does not
depend on the choice of path $\gamma_{z_0, z}$.

Two standard constructions of minimal surfaces
from Weierstrass data are

$$g(z) = z, \quad \phi(z) = dz/z, \quad \Omega = \mathbb{C} \setminus \{0\} \quad \text{giving a catenoid} \qquad [60]$$

$$g(z) = e^{iz}, \quad \phi(z) = dz, \quad \Omega = \mathbb{C} \quad \text{giving a helicoid} \qquad [61]$$
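The two examples can be sanity-checked numerically by integrating the Weierstrass data along a straight segment. The sketch below is an illustration under stated assumptions: the integrand is taken to be $\bigl(\tfrac12(g^{-1}-g), \tfrac{i}{2}(g^{-1}+g), 1\bigr)\phi$, the helicoid data is read as $g(z) = e^{iz}$, and the base points, sample points, step count, and function name are arbitrary choices made here. For the catenoid data with $z_0 = 1$, a direct antiderivative computation shows the image is the catenoid $x^2 + y^2 = \cosh^2 z$ translated by $(1, 0, 0)$; for the helicoid data with $z_0 = 0$, the image satisfies $x \cos z + y \sin z = 0$.

```python
import cmath
import math

def weierstrass_point(g, phi_coeff, z0, z, steps=2000):
    """Midpoint-rule approximation of
    F(z) = Re int (1/2 (1/g - g), i/2 (1/g + g), 1) phi
    along the straight segment from z0 to z, with phi = phi_coeff(zeta) d(zeta)."""
    dz = (z - z0) / steps
    total = [0j, 0j, 0j]
    for j in range(steps):
        zeta = z0 + (j + 0.5) * dz
        gv = g(zeta)
        w = phi_coeff(zeta) * dz
        total[0] += 0.5 * (1 / gv - gv) * w
        total[1] += 0.5j * (1 / gv + gv) * w
        total[2] += w
    return tuple(c.real for c in total)

# Catenoid data: g(z) = z, phi = dz / z, base point z0 = 1 (our choice).
cat = weierstrass_point(lambda z: z, lambda z: 1 / z, 1 + 0j, 2 * cmath.exp(0.8j))
# Distance from the translated axis minus cosh(height); zero on the catenoid.
catenoid_defect = math.hypot(cat[0] - 1.0, cat[1]) - math.cosh(cat[2])

# Helicoid data, read as g(z) = exp(i z), phi = dz, base point z0 = 0 (our choice).
hel = weierstrass_point(lambda z: cmath.exp(1j * z), lambda z: 1.0, 0j, 0.7 + 0.4j)
# Zero exactly when the point lies on the helicoid x cos z + y sin z = 0.
helicoid_defect = hel[0] * math.cos(hel[2]) + hel[1] * math.sin(hel[2])
```

Both defects vanish up to quadrature error. The straight segment for the catenoid avoids the puncture at 0, so no period issue arises in this particular check.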


The Weierstrass representation is particularly 
useful for constructing immersed minimal surfaces. 
Typically, it is rather difficult to prove that the 
resulting immersion is an embedding (i.e., is 1-1), 
although there are some interesting cases where this 
can be done. For the first modern example, 
D Hoffman and Meeks proved that the surface 
constructed by Costa was embedded; this was 
the first new complete finite topology properly 
embedded minimal surface discovered since the 
classical catenoid, helicoid, and plane. This led 
to the discovery of many more such surfaces 
(see Rosenberg (1992) for more discussion). 


Area-Minimizing Surfaces 


Perhaps the most natural way to construct minimal 
surfaces is to look for ones which minimize area, for 
example, with fixed boundary, or in a homotopy 
class, etc. This has the advantage that often it is 
possible to show that the resulting surface is 
embedded. We mention a few results along these 
lines. 

The first embeddedness result, due to Meeks and 
Yau, shows that if the boundary curve is embedded 
and lies on the boundary of a smooth mean convex 
set (and it is null-homotopic in this set), then it 
bounds an embedded least area disk. 


Theorem 15 (Meeks III and Yau 1982). Let M? be 
a compact Riemannian 3-manifold whose boundary 
is mean convex and let ^; be a simple closed curve in 


OM which is null-homotopic in M; then ^ is 
bounded by a least area disk and amy such least 
area disk is properly embedded. 


Note that some restriction on the boundary curve 
y is certainly necessary. For instance, if the 
boundary curve was knotted (e.g., the trefoil), then 
it could not be spanned by any embedded disk 
(minimal or otherwise). Prior to the work of Meeks 
and Yau, embeddedness was known for extremal 
boundary curves in $\mathbb{R}^3$ with small total curvature by
the work of R Gulliver and J Spruck. 

If we instead fix a homotopy class of maps, then 
the two fundamental existence results are due to 
Sacks-Uhlenbeck and Schoen-Yau (with embed- 
dedness proved by Meeks-Yau and Freedman- 
Hass-Scott, respectively): 


Theorem 16 Given M, there exist conformal
(stable) minimal immersions $u_1, \ldots, u_m : S^2 \to M$
which generate $\pi_2(M)$ as a $\mathbb{Z}[\pi_1(M)]$ module.
Furthermore,


(i) if $u : S^2 \to M$ and $[u]_{\pi_2} \neq 0$, then $\mathrm{Area}(u) \geq \min_i \mathrm{Area}(u_i)$,

(ii) each $u_i$ is either an embedding or a 2-1 map
onto an embedded two-sided $\mathbb{RP}^2$.


Theorem 17 If $\Sigma^2$ is a closed surface with genus
g > 0 and $i_0 : \Sigma \to M^3$ is an embedding which
induces an injective map on $\pi_1$, then there is a
least area embedding with the same action on $\pi_1$.


The Min-Max Construction 
of Minimal Surfaces 


Variational arguments can also be used to construct 
higher index (i.e., nonminimizing) minimal surfaces 
using the topology of the space of surfaces. There 
are two basic approaches: 


1. Applying Morse theory to the energy functional 
on the space of maps from a fixed surface $\Sigma$ to M.

2. Doing a min-max argument over families of 
(topologically nontrivial) sweep-outs of M. 


The first approach has the advantage that the 
topological type of the minimal surface is easily 
fixed; however, the second approach has been more 
successful at producing embedded minimal surfaces. 
We will highlight a few key results below but refer 
to Colding and De Lellis (2003) for a thorough 
treatment. 

Unfortunately, one cannot directly apply Morse 
theory to the energy functional on the space of maps 
from a fixed surface because of a lack of compact- 
ness (the Palais-Smale condition C does not hold). 

Figure 1 A one-parameter family of curves on a 2-sphere
which induces a map $F : S^2 \to S^2$ of degree 1. First published in
Surveys in Differential Geometry, volume IX, in 2004, published 
by International Press. 


To get around this difficulty, Sacks-Uhlenbeck 
introduce a family of perturbed energy functionals 
which do satisfy condition C and then obtain 
minimal surfaces as limits of critical points for the 
perturbed problems: 


Theorem 18 If $\pi_k(M) \neq 0$ for some k > 1, then
there exists a branched immersed minimal 2-sphere 
in M (for any metric). 


The basic idea of constructing minimal surfaces 
via min-max arguments and sweep-outs goes back 
to Birkhoff, who developed it to construct simple 
closed geodesics on spheres. In particular, when M is 
a topological 2-sphere, we can find a one-parameter 
family of curves starting and ending at point curves 
so that the induced map $F : S^2 \to S^2$ (see Figure 1)
has nonzero degree. The min-max argument pro- 
duces a nontrivial closed geodesic of length less than 
or equal to the longest curve in the initial one- 
parameter family. A curve-shortening argument 
gives that the geodesic obtained in this way is 
simple. 

J Pitts applied a similar argument and geometric
measure theory to get that every closed Riemannian 
3-manifold has an embedded minimal surface (his 
argument was for dimensions up to seven), but he 
did not estimate the genus of the resulting surface. 
Finally, F Smith (under the direction of L Simon) 
proved (see Colding and De Lellis (2003)): 


Theorem 19 Every metric on a topological 
3-sphere M admits an embedded minimal 2-sphere. 


The main new contribution of Smith was to 
control the topological type of the resulting minimal 
surface while keeping it embedded. 


Part 3. Some Applications of Minimal 
Surfaces 


In this part, we discuss very briefly a few applica- 
tions of minimal surfaces. As mentioned in the 
introduction, there are many to choose from and we 
have selected just a few. 

The Positive-Mass Theorem


The (Riemannian version of the) positive-mass 
theorem states that an asymptotically flat 
3-manifold M with non-negative scalar curvature 
must have positive mass. The Riemannian manifold 
M here arises as a maximal spacelike slice in a 
(3 + 1)-dimensional spacetime solution of Einstein’s 
equations. 

The asymptotic flatness of M arises because the 
spacetime models an isolated gravitational system 
and hence is a perturbation of the vacuum solution 
outside a large compact set. To make this precise, 
suppose for simplicity that M has only one end; M 
is then said to be asymptotically flat if there is a 
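compact set $\Omega \subset M$ so that $M \setminus \Omega$ is diffeomorphic
to $\mathbb{R}^3 \setminus B_R(0)$ and the metric on $M \setminus \Omega$ can be
written as


$g_{ij} = \left(1 + \frac{M}{2|x|}\right)^4 \delta_{ij} + p_{ij}$ [62]

where

$|x|^2 |p_{ij}| + |x|^3 |D p_{ij}| + |x|^4 |D^2 p_{ij}| \leq C$ [63]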


The constant M is the so-called mass of M. Observe
that the metric $g_{ij}$ is a perturbation of the metric on
a constant-time slice in the Schwarzschild spacetime
of mass M; that is to say, the Schwarzschild metric
has $p_{ij} = 0$.

A tensor h is said to be $O(|x|^{-p})$ if $|x|^p |h| + |x|^{p+1} |Dh| \leq C$.
For example, an easy calculation
shows that


$g_{ij} = (1 + 2M/|x|)\,\delta_{ij} + O(|x|^{-2})$
$\sqrt{g} \equiv \sqrt{\det g_{ij}} = 1 + 3M|x|^{-1} + O(|x|^{-2})$ [64]


The positive-mass theorem states that the mass M
of such an M must be non-negative:


Theorem 20 (Schoen and Yau 1979). With M as
above, $M \geq 0$.
There is a rigidity theorem as well which states that
the mass vanishes only when M is isometric to $\mathbb{R}^3$:


Theorem 21 (Schoen and Yau 1979). If $|\nabla^3 p_{ij}| = O(|x|^{-5})$
and M = 0 in Theorem 20, then M is
isometric to $\mathbb{R}^3$.


We will give a very brief overview of the proof of 
Theorem 20, showing in the process where minimal 
surfaces appear. 


Proof (Sketch). The argument will be by contra- 
diction, so suppose that the mass is negative. It is 
not hard to prove that the slab between two parallel 
planes is mean convex. That is, we have the
following:


Lemma 3 If M < 0 and M is asymptotically flat,
then there exist $R_0, h > 0$ so that for $r > R_0$ the sets

$C_r = \{ |x|^2 \leq r^2,\ -h \leq x_3 \leq h \}$ [65]

have strictly mean-convex boundary.


Since the compact set $C_r$ is mean convex, we can
solve the Plateau problem to get an area-minimizing
(and hence stable) surface $\Gamma_r \subset C_r$ with boundary


$\partial \Gamma_r = \{ |x|^2 = r^2,\ x_3 = h \}$ [66]


Using the disk $\{ |x|^2 \leq r^2,\ x_3 = h \}$ as a comparison
surface, we get uniform local area bounds for any
such $\Gamma_r$. Combining these local area bounds with the
a priori curvature estimates for minimizing surfaces,
we can take a sequence of r's going to infinity and
find a subsequence of $\Gamma_r$'s that converge to a
complete area-minimizing surface


$\Gamma \subset \{ -h \leq x_3 \leq h \}$ [67]


Since $\Gamma$ is pinched between the planes $\{x_3 = \pm h\}$, the
estimates for minimizing surfaces imply that (outside
a large compact set) $\Gamma$ is a graph over the plane
$\{x_3 = 0\}$ and hence has quadratic area growth and
finite total curvature. Moreover, using the form of
the metric $g_{ij}$, we see that the gradient $|\nabla u|$ of the
graph function decays like $|x|^{-1}$ and


$\int_{\sigma_s} k_g = (2\pi s + O(1))(s^{-1} + O(s^{-2})) = 2\pi + O(s^{-1})$ [68]


where $\sigma_s = \{x_1^2 + x_2^2 = s^2\} \cap \Gamma$ and $k_g$ is the geodesic
curvature of $\sigma_s$ (as a curve in $\Gamma$).

To get the contradiction, one combines stability of
$\Gamma$ with the positive scalar curvature of M to see that
no such $\Gamma$ could have existed. (M was assumed only
to have non-negative scalar curvature. However, a
“rounding off” argument shows that the metric on
M can be perturbed to have positive scalar curvature
outside of a compact set and still have negative
mass.) Namely, substituting the Gauss equation into
the stability inequality (this is the stability inequality
in a general 3-manifold; see Colding and Minicozzi II
(1999)) gives


$\int_\Gamma \left( |A|^2/2 + \mathrm{Scal}_M - K_\Gamma \right) \phi^2 \leq \int_\Gamma |\nabla \phi|^2$ [69]


Since $\Gamma$ has quadratic area growth, we can choose a
sequence of (logarithmic) cutoff functions in [69] to
get


$0 < \int_\Gamma \left( |A|^2/2 + \mathrm{Scal}_M \right) \leq \int_\Gamma K_\Gamma < \infty$ [70]


Since $K_\Gamma$ may not be positive, we also used that $\Gamma$
has finite total curvature. Moreover, we used that
$\mathrm{Scal}_M$ is positive outside a compact set to see that
the first integral in [70] was positive. Finally,
substituting [70] into the Gauss-Bonnet formula
gives that $\int_{\sigma_s} k_g$ is strictly less than $2\pi$ for s large,
contradicting [68].


Black Holes


Another way that minimal surfaces enter into 
relativity is through black holes. Suppose that we 
have a three-dimensional time slice M in a (3 + 1)- 
dimensional spacetime. For simplicity, assume that M 
is totally geodesic and hence has non-negative scalar 
curvature. A closed surface $\Sigma$ in M is said to be
trapped if its mean curvature is everywhere negative 
with respect to its outward normal. Physically, this 
means that the surface emits an outward shell of light 
whose surface area is decreasing everywhere on the 
surface. The existence of a closed trapped surface 
implies the existence of a black hole in the spacetime. 

Given a trapped surface, we can look for the 
outermost trapped surface containing it; this outer- 
most surface is called an apparent horizon. It is not 
hard to see that an apparent horizon must be a 
minimal surface and, moreover, a barrier argument 
shows that it must be stable. Since M has non- 
negative scalar curvature, stability in turn implies 
that it must be diffeomorphic to a sphere. See, for 
instance, Bray (2002) for references to some results 
on black holes, horizons, etc. 


Constant Mean Curvature Surfaces 


At least since the time of Plateau, minimal surfaces 
have been used to model soap films. This is because 
the mean curvature of the surface models the surface 
tension and this is essentially the only force acting 
on a soap film. Soap bubbles, on the other hand, enclose
a volume and thus the pressure gives a second 
counterbalancing force. It follows easily that these 
two forces are in equilibrium when the surface has 
constant mean curvature (cmc). 

For the same reason, cmc surfaces arise in the
isoperimetric problem. Namely, a surface that minimizes
surface area while enclosing a fixed volume must
have cmc. It is not hard to see that such an
isoperimetric surface in $\mathbb{R}^n$ must be a round sphere.
There are two interesting partial converses to this.
First, by a theorem of Hopf, any cmc 2-sphere in $\mathbb{R}^3$
must be round. Second, using the maximum principle
(“the method of moving planes”), Alexandrov showed
that any closed embedded cmc hypersurface in $\mathbb{R}^n$
must be a round sphere. It turned out, however, that
not every closed immersed cmc surface is round. The
first examples were immersed cmc tori constructed by
H Wente. Kapouleas constructed many new examples,
including closed higher-genus cmc surfaces.


Figure 2 The sweep-out, the min-max surface, and the width
W. First published in the Journal of the American Mathematical
Society in 2005, published by the American Mathematical Society.

Many of the techniques developed for studying 
minimal surfaces generalize to general cmc surfaces. 


Finite Extinction for Ricci Flow 


We close this article by indicating how minimal 
surfaces can be used to show that on a homotopy 
3-sphere the Ricci flow becomes extinct in finite 
time (see Colding and Minicozzi II (2005) and 
Perelman (2003) for details). 

Let $M^3$ be a smooth closed orientable 3-manifold
and let g(t) be a one-parameter family of metrics on 
M evolving by the Ricci flow, so 


$\partial_t g = -2\,\mathrm{Ric}_{M_t}$ [71]


In an earlier section, we saw that there is a natural 
way of constructing minimal surfaces on many 
3-manifolds and that comes from the min-max 
argument where the minimum of all maximal slices of
sweep-outs is a minimal surface. The idea is then to 
look at how the area of this min-max surface changes 
under the flow. Geometrically, the area measures a 
kind of width of the 3-manifold and as we will see for 
certain 3-manifolds (those, like the 3-sphere, whose 
prime decomposition contains no aspherical factors), 
the area becomes zero in finite time corresponding to 
the solution becoming extinct in finite time. 

Fix a continuous map $\beta : [0,1] \to C^0 \cap L^2_1(S^2, M)$
where $\beta(0)$ and $\beta(1)$ are constant maps so that $\beta$ is
in the nontrivial homotopy class $[\beta]$ (such $\beta$ exists
when M is a homotopy 3-sphere). We define the
width $W = W(g, [\beta])$ by


$W(g) = \min_{\gamma \in [\beta]} \max_{s \in [0,1]} \mathrm{Energy}(\gamma(s))$ [72]


The next theorem gives an upper bound for the 
derivative of W(g(t)) under the Ricci flow which forces 
the solution g(t) to become extinct in finite time. 


Theorem 22 Let $M^3$ be a homotopy 3-sphere
equipped with a Riemannian metric g = g(0).
Under the Ricci flow, the width W(g(t)) satisfies

$\frac{d}{dt} W(g(t)) \leq -4\pi + \frac{3}{4(t + C)} W(g(t))$ [73]

in the sense of the limsup of forward difference
quotients. Hence, g(t) must become extinct in finite
time.


The $4\pi$ in [73] comes from the Gauss-Bonnet
theorem and the 3/4 comes from the bound on the 
minimum of the scalar curvature that the evolution 
equation implies. Both of these constants matter 
whereas the constant C depends on the initial metric 
and the actual value is not important. 

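To see that [73] implies finite extinction time,
rewrite [73] as


$\frac{d}{dt} \left( W(g(t)) \, (t + C)^{-3/4} \right) \leq -4\pi \, (t + C)^{-3/4}$ [74]


and integrate to get

$(T + C)^{-3/4} W(g(T)) \leq C^{-3/4} W(g(0)) - 16\pi \left[ (T + C)^{1/4} - C^{1/4} \right]$ [75]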


Since W > 0 by definition and the right-hand side of 
[75] would become negative for T sufficiently large, 
we get the claim. 
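
To make the last step concrete, the following sketch evaluates the bound obtained from [75] and solves for the time at which it reaches zero, which is an upper bound for the extinction time. The function names and the sample values of W(g(0)) and C are assumptions chosen for illustration only.

```python
import math


def width_upper_bound(T, W0, C):
    """Upper bound for W(g(T)): the right-hand side of [75] times (T + C)^(3/4)."""
    rhs = C ** -0.75 * W0 - 16.0 * math.pi * ((T + C) ** 0.25 - C ** 0.25)
    return (T + C) ** 0.75 * rhs


def extinction_time_bound(W0, C):
    """Time T at which the bound vanishes, i.e. the right-hand side of [75] is 0."""
    return (C ** 0.25 + W0 * C ** -0.75 / (16.0 * math.pi)) ** 4 - C


# Illustrative values only: initial width 10, constant C = 1.
T_star = extinction_time_bound(10.0, 1.0)
```

The bound is the solution of the equality case of [73], so it can also be checked against the differential inequality by a finite-difference quotient.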

As a corollary of this theorem we get finite 
extinction time for the Ricci flow. 


Corollary 3 Let $M^3$ be a homotopy 3-sphere
equipped with a Riemannian metric g = g(0). Under
the Ricci flow g(t) must become extinct in finite time.

Introduction


When studying a functional f on an infinite-dimensional
function space X, one is often interested
in finding critical points which are not local minima.
A simple yet powerful method to detect those
critical points is the minimax method. The idea
consists in detecting some complexity in the topology
of X, or in the structure of the sublevels of f, to
find a class $\Gamma$ of subsets of X which somehow
reveals such a topological complexity, and to show
that the number


$c := \inf_{\gamma \in \Gamma} \sup_{u \in \gamma} f(u)$


is finite (even if the functional may be unbounded
above and below). If the class $\Gamma$ is positively
invariant under the action of the negative-gradient
flow of f, and if a suitable compactness assumption
known as the Palais-Smale condition holds, c is
proved to be a critical value of f. Quite remarkably,
the minimax method also works when no topological
complexity is present, but the negative-gradient
flow of f exhibits some kind of rigidity.

In this article we shall describe these ideas,
starting from the simplest minimax result, the
“mountain-pass theorem.” We will show how to
apply the minimax method by discussing the
existence question of solutions of a nonlinear elliptic
boundary value problem, of closed geodesics on
compact manifolds, and of closed characteristics on
compact energy hypersurfaces.


The Mountain-Pass Theorem 


Let us start by considering the following familiar
fact. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a smooth coercive function
(i.e., its sublevels have compact closure). If a sublevel
$\{f < a\}$ is not connected, say $\{f < a\} = A \cup B$ with
A, B disjoint nonempty open sets, then f has a critical point x at
level


$f(x) = c := \inf_{\gamma \in \Gamma} \max_{u \in \gamma} f(u)$


where $\Gamma$ is the class of all continuous curves in $\mathbb{R}^n$
with one end point in A and the other in B. More
figuratively: if there are two valleys, then there
must be a mountain pass. Let us examine a possible
proof.

First notice that any curve in the class $\Gamma$ will have
to cross the level $\{f = a\}$, so $c \geq a$. If by contradiction
c is not a critical value of f, by the compactness of the
sublevels there is some $\epsilon > 0$ such that $|\nabla f| \geq \epsilon$ on
$\{c - \epsilon \leq f \leq c + \epsilon\}$. Then the negative-gradient flow
of f, that is, the solution of


$\partial_t \phi(t, u) = -\nabla f(\phi(t, u)), \quad \phi(0, u) = u$


pulls the sublevel $\{f < c + \epsilon\}$ down into the sublevel
$\{f \leq c - \epsilon\}$ in finite time $2/\epsilon$. Indeed, if
$\phi([0, t], u) \subset \{c - \epsilon \leq f \leq c + \epsilon\}$, then the inequalities


$2\epsilon \geq f(u) - f(\phi(t, u)) = -\int_0^t \frac{d}{ds} f(\phi(s, u)) \, ds = \int_0^t |\nabla f(\phi(s, u))|^2 \, ds \geq \epsilon^2 t$


imply that $t \leq 2/\epsilon$. By definition of c, we can find a
continuous curve $\gamma \in \Gamma$ which is contained in
$\{f < c + \epsilon\}$. But then the curve $\gamma' := \phi(2/\epsilon, \gamma)$ still has one
end point in A, the other one in B, and lies in
$\{f \leq c - \epsilon\}$, contradicting the definition of c.
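
The minimax level in this finite-dimensional statement can be approximated on a grid. The sketch below is a discrete analogue, not the argument of the text: it replaces the class of continuous curves by nearest-neighbour lattice paths and finds the path whose largest value of f is smallest (a "bottleneck" variant of Dijkstra's algorithm). The double-well function, the grid and all names are assumptions chosen for illustration.

```python
import heapq


def f(x, y):
    """Coercive double-well function: minima at (-1, 0) and (1, 0), saddle at (0, 0)."""
    return (x * x - 1.0) ** 2 + y * y


def minimax_level(n=41, lo=-2.0, hi=2.0, start=(-1.0, 0.0), goal=(1.0, 0.0)):
    """Smallest achievable value of max f along lattice paths from start to goal."""
    step = (hi - lo) / (n - 1)

    def coord(i):
        return lo + i * step

    def index(v):
        return round((v - lo) / step)

    src = (index(start[0]), index(start[1]))
    dst = (index(goal[0]), index(goal[1]))
    best = {src: f(coord(src[0]), coord(src[1]))}
    heap = [(best[src], src)]
    while heap:
        level, (i, j) = heapq.heappop(heap)
        if (i, j) == dst:
            return level                      # first pop of the goal is optimal
        if level > best[(i, j)]:
            continue                          # stale heap entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < n:
                new_level = max(level, f(coord(a), coord(b)))
                if new_level < best.get((a, b), float("inf")):
                    best[(a, b)] = new_level
                    heapq.heappush(heap, (new_level, (a, b)))
    raise ValueError("goal not reachable on the grid")
```

For this particular f every path joining the two wells must cross the line x = 0, where f is at least 1, so the computed level is the value of f at the saddle point (0, 0), the "mountain pass".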

If we try to generalize this result to functions 
defined on an infinite-dimensional real Hilbert space 
H, we encounter difficulties due to lack of compact- 
ness. Indeed, a continuous function on an infinite- 
dimensional Hilbert space can never have compact 
sublevels (with respect to the norm topology). If we 
look back at the proof, we see that we have used 
coercivity to guarantee that if the level set $\{f = c\}$
contains no critical points, then $\nabla f$ is bounded away
from zero on the strip $\{c - \epsilon \leq f \leq c + \epsilon\}$, for some
small $\epsilon > 0$. A natural idea is then to replace the
coercivity assumption by a condition implying the 
latter fact. 


Definition Let $f : H \to \mathbb{R}$ be a continuously differentiable
function on a real Hilbert space H.
A sequence $(u_h) \subset H$ is said to be a Palais-Smale sequence
if $f(u_h)$ is bounded and $Df(u_h)$ tends to zero. The
function f is said to satisfy the Palais-Smale
condition if every Palais-Smale sequence has a
converging subsequence.


The Palais-Smale condition readily implies the
statement above. Assuming also that f is twice
continuously differentiable, the negative-gradient
flow of f (a well-defined local flow because $\nabla f$ is
continuously differentiable) pulls the sublevel
$\{f < c + \epsilon\}$ down into $\{f \leq c - \epsilon\}$ in finite time. These
observations lead to the following:


Theorem (Mountain pass). Let f be a twice continuously
differentiable function on a real Hilbert
space H, satisfying the Palais-Smale condition.
Assume that a sublevel $\{f < a\}$ is not connected,
and let A, B be two disjoint open sets such that
$A \cup B = \{f < a\}$. Then f has a critical point x at level


$f(x) = c := \inf_{\gamma \in \Gamma} \max_{u \in \gamma} f(u) \geq a$


where $\Gamma$ is the class of all continuous curves in H
with one end point in A and the other one in B.

If we are even more ambitious, and we wish to
consider functions defined on a real Banach space E, 
we also encounter the problem of not having a 
gradient vector field. Indeed, the differential of f at x, 
Df (x), is an element of the dual space E*, but in this 
case we have no inner product on E by which we can 
represent Df(x) as the product by some vector of E. 
This problem can be overcome by the notion of a 
pseudogradient vector field. In fact, it can be proved 
that if f is continuously differentiable on E, then there 
exists a locally Lipschitz vector field V defined on the 
complement of the critical points of f, such that 


$\|V(u)\| \leq \min\{\|Df(u)\|, 1\}$
$Df(u)[V(u)] \geq \frac{1}{2} \min\{\|Df(u)\|, 1\} \, \|Df(u)\|$


In other words, even if there is no direction of 
steepest increase for f, we do have directions along 
which the increase of f is steep enough, and these 
directions can be selected in a locally Lipschitz way. 
Notice that pseudogradients are useful also in the 
case of a continuously differentiable function on a 
Hilbert space: in this case the gradient of f is just 
continuous, so it does not generate a flow. The 
Palais-Smale condition, as stated above, makes 
perfect sense on the Banach space E (with the only 
difference that now $Df(u_h)$ tends to zero in the dual
norm of E*), and the mountain-pass theorem holds
for functions of class $C^1$ on a Banach space.

Actually, the fact that the domain of f has a vector 
structure is not relevant in this statement, and the 
mountain-pass theorem holds also for functions 
defined on connected infinite-dimensional manifolds.
Since the essential feature is to have a
pseudogradient vector field available, the right level of
generality is to consider a Banach manifold M (i.e., a
manifold modeled on a Banach space) endowed with a
complete Finsler structure (i.e., a Banach norm on
each tangent space of M, varying in a suitably regular
way, inducing a complete distance on M).


A Nonlinear Elliptic Boundary-Value 
Problem 


Let us consider a typical application of the mountain- 
pass theorem to a semilinear elliptic boundary-value 
problem. Let $\Omega$ be a smooth bounded domain in $\mathbb{R}^n$,
and for $\lambda \in \mathbb{R}$, p > 2, consider the problem


$-\Delta u = \lambda u + u|u|^{p-2}$ in $\Omega$
$u = 0$ on $\partial\Omega$ [1]


Let $0 < \lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \cdots$ be the eigenvalues of the
Laplace operator $-\Delta$, with domain $H^2 \cap H^1_0(\Omega)$, the
Sobolev space of $L^2$-functions on $\Omega$ with weak first
two derivatives in $L^2$, vanishing on $\partial\Omega$. We claim that,
if n = 2, or if $n \geq 3$ and $2 < p < 2^* := 2n/(n-2)$, then
problem [1] with $\lambda < \lambda_1$ has a nontrivial solution.

By elliptic regularity, the solutions of [1] are
precisely the critical points of the functional


$\mathcal{E}(u) = \frac{1}{2} \int_\Omega \left( |\nabla u(x)|^2 - \lambda u(x)^2 \right) dx - \frac{1}{p} \int_\Omega |u(x)|^p \, dx$


We recall that $H^1_0(\Omega)$ continuously embeds into
$L^p(\Omega)$, for every $p < +\infty$ if n = 2, for every $p \leq 2^*$ if
$n \geq 3$. So the functional $\mathcal{E}$ is well defined, and
actually continuously differentiable, on $H^1_0(\Omega)$, a
Hilbert space with the inner product


$\langle u, v \rangle_{H^1_0(\Omega)} = \int_\Omega \nabla u(x) \cdot \nabla v(x) \, dx$


Since p > 2, near zero the quadratic part of
the functional $\mathcal{E}$ dominates over the part with the
$L^p$-norm. By the Rayleigh characterization of the
first eigenvalue of the Laplacian,


$\lambda_1 = \min_{u \in H^1_0(\Omega),\ u \neq 0} \frac{\int_\Omega |\nabla u(x)|^2 \, dx}{\int_\Omega u(x)^2 \, dx}$


the assumption $\lambda < \lambda_1$ implies that the quadratic
part of $\mathcal{E}$ is positive definite. So we can find a small
$\rho > 0$ such that


$a := \inf_{\|u\|_{H^1_0(\Omega)} = \rho} \mathcal{E}(u) > 0$


On the other hand, the fact that p > 2 implies that


$\lim_{\mu \to \infty} \mathcal{E}(\mu u) = -\infty$

for every $u \neq 0$. Therefore, the sublevel $\{\mathcal{E} < a\}$
is not connected, and if we can prove the
Palais-Smale condition, the mountain-pass theorem
will imply the existence of a critical point u with
$\mathcal{E}(u) \geq a > 0$, i.e., a nontrivial solution of [1].
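
The mountain-pass geometry just described can be seen along a single ray. The sketch below is a one-dimensional model, which is an assumption made for illustration and lies outside the cases n = 2 and n ≥ 3 treated in the text: the domain is the interval (0, π), whose first Dirichlet eigenvalue is 1 with eigenfunction sin x, and the functional is evaluated on multiples t·sin x. The function name and the values of λ and p are illustrative.

```python
import math


def energy_on_ray(t, lam, p, n_quad=2000):
    """Value of the functional at t*u, with u(x) = sin(x) on (0, pi), midpoint rule."""
    h = math.pi / n_quad
    quadratic = 0.0      # integral of |u'|^2 - lam * u^2
    nonlinear = 0.0      # integral of |u|^p
    for k in range(n_quad):
        x = (k + 0.5) * h
        u, du = math.sin(x), math.cos(x)
        quadratic += (du * du - lam * u * u) * h
        nonlinear += abs(u) ** p * h
    return 0.5 * t * t * quadratic - (t ** p / p) * nonlinear
```

With lam = 0.5 (below the first eigenvalue) and p = 4, the value is positive for small t and tends to minus infinity for large t, so the origin and a far point on the ray lie in two different "valleys"; a hand computation gives the maximum along this ray as π/24, attained at t = sqrt(2/3).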

In order to prove the Palais-Smale condition,
notice that the expression for the differential of $\mathcal{E}$,


$D\mathcal{E}(u)[v] = \int_\Omega \nabla u(x) \cdot \nabla v(x) \, dx - \int_\Omega \left( \lambda u(x) + |u(x)|^{p-2} u(x) \right) v(x) \, dx$


and the compactness of the embedding of $H^1_0(\Omega)$
into $L^p(\Omega)$ for $p < 2^*$ imply that the gradient of
$\mathcal{E}$ has the form


$\nabla\mathcal{E}(u) = u + K(u)$ [2]


where $K : H^1_0(\Omega) \to H^1_0(\Omega)$ is a compact map, that is,
it maps bounded sets into precompact ones. It is
readily seen that when $\nabla\mathcal{E}$ has such a form, bounded
Palais-Smale sequences are compact. Thus, it is
enough to show that every Palais-Smale sequence is
bounded. But this follows from the identity


$p\,\mathcal{E}(u) - D\mathcal{E}(u)[u] = \left( \frac{p}{2} - 1 \right) \int_\Omega \left( |\nabla u(x)|^2 - \lambda u(x)^2 \right) dx$


together with the fact that the right-hand side term
is the square of an equivalent norm on $H^1_0(\Omega)$, because p > 2
and $\lambda < \lambda_1$. This concludes the proof.

Actually, using the maximum principle one could 
show that under the same assumptions, problem [1] 
has a solution which is positive in $\Omega$.

When $n \geq 3$ and $p = 2^* = 2n/(n-2)$, the functional
$\mathcal{E}$ still exhibits a mountain-pass geometry, but
the Palais-Smale condition fails. In fact, the embedding
of $H^1_0(\Omega)$ into $L^{2^*}(\Omega)$ is not compact, so the
map K appearing in [2] is not compact, and
bounded Palais-Smale sequences need not have a
converging subsequence. We recall that the noncompactness
of the embedding of $H^1_0(\Omega)$ into $L^{2^*}(\Omega)$
is due to the fact that the quotient


$S(u) := \frac{\int_\Omega |\nabla u(x)|^2 \, dx}{\left( \int_\Omega |u(x)|^{2^*} \, dx \right)^{2/2^*}}$

is invariant under rescaling $u \mapsto u_\mu(x) = u(\mu x)$.

When $\lambda = 0$, the Pohozaev identity (an integral
formula obtained by multiplying the equation by
$x \cdot \nabla u(x)$) can be used to prove that problem [1]
has no nontrivial solutions, when $\Omega$ is a star-shaped
domain other than the whole $\mathbb{R}^n$.

When $\lambda \neq 0$, the presence in the functional of an
$L^2$-norm, which rescales differently, breaks the
symmetry, and the existence of nontrivial solutions
is again possible. Indeed, Brezis and Nirenberg have
shown that problem [1] with $p = 2^*$ has a nontrivial
solution provided that $n \geq 4$ and $0 < \lambda < \lambda_1$, or
n = 3 and $\lambda^* < \lambda < \lambda_1$, for some $\lambda^* \in [0, \lambda_1]$ depending
on the domain $\Omega$.

The proof is based on the fact that there is a
certain threshold s > 0, related to the best Sobolev
constant obtained by taking the infimum of S(u)
over all $u \in H^1_0$ (the domain is irrelevant here),
below which the Palais-Smale condition holds. That
is, every sequence $(u_h)$ such that $\mathcal{E}(u_h)$ converges to
some b less than s, and $D\mathcal{E}(u_h)$ tends to zero, is
compact. The proof of the mountain-pass theorem
shows that the Palais-Smale condition is needed
only at the minimax level c. In order to conclude, it
is then enough to show that c < s. The value of
c can be estimated by using the fact that the
infimum of the quotient S over functions on the
whole $\mathbb{R}^n$ is attained at the family of functions


$u_\epsilon(x) = \frac{[n(n-2)\epsilon^2]^{(n-2)/4}}{(\epsilon^2 + |x|^2)^{(n-2)/2}}$


which are then solutions of [1] with $p = 2^*$, $\lambda = 0$,
and $\Omega = \mathbb{R}^n$.

Another way to break the symmetry is to keep
$\lambda = 0$ but to consider domains with a rich topology.
For instance, Bahri and Coron have shown that if $\Omega$
is a domain with some nonzero singular homology
group $H_k(\Omega; \mathbb{Z}_2)$, $k \geq 1$, then problem [1] with
$p = 2^*$ and $\lambda = 0$ has a positive solution.

Elliptic equations having nonlinearities with the
critical exponent $2^*$ arise naturally in some geometric
problems. Consider a manifold M of dimension
$n \geq 3$, with a metric g having scalar curvature k.
The Yamabe problem calls for finding a metric $g_0$,
conformally equivalent to g, having constant scalar
curvature. If $g_0 = u^{4/(n-2)} g$, where the positive function
u gives the conformal factor, one finds that
u must solve the equation


$-\frac{4(n-1)}{n-2} \Delta_g u = -ku + k_0 u|u|^{2^*-2}$

where $\Delta_g$ is the Laplace-Beltrami operator associated
with the metric g, and the constant $k_0$ is the scalar
curvature of $g_0$. Again, the corresponding functional
satisfies the Palais-Smale condition only below a
certain threshold (actually, the same number s as seen
earlier; this is because the lack of compactness is due to
local concentration phenomena, and the metric
structure of the whole ambient space becomes irrelevant).
The task is then to show that the minimax level is
below that threshold or, equivalently, that a certain
best Sobolev constant for (M, g) is less than the
corresponding constant for $\mathbb{R}^n$ with the flat metric
(the latter constant is again the infimum of S(u)). This
fact was proved by Aubin in the case $n \geq 6$ or (M, g)
not locally conformally flat. Schoen has then treated
the remaining case, by means of the positive-mass
theorem, a deep result in differential geometry.


A General Minimax Principle 


Let us consider again a twice continuously differentiable
function f on a real Hilbert space H. The
vector field


$V(u) := \frac{\nabla f(u)}{\sqrt{1 + \|\nabla f(u)\|^2}}$


has the same nice properties of the gradient vector
field of f, but in addition it is bounded. The
advantage is that the flow of $-V$ is globally defined.
When talking about the negative-gradient flow of
f, we will actually refer to such a flow. It will also be
useful to have at our disposal a negative-gradient flow
truncated below level b. This is the flow of the
vector field $-V_b$, where


$V_b(u) = \psi(f(u)) \, V(u)$


with $\psi$ a smooth function on $\mathbb{R}$ which is identically
zero on $(-\infty, b]$, then increases up to reaching the
value 1, and afterwards remains constantly equal to
1. This truncated negative-gradient flow keeps the
points in the sublevel $\{f \leq b\}$ fixed, and behaves
as the negative-gradient flow above b (except for the
fact that trajectories slow down as the value of
f approaches b).
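
The behaviour of the truncated flow can be illustrated in the plane. The sketch below makes several assumptions that are not in the text: the function f(x, y) = x² + y², the normalization of the bounded field as the gradient divided by sqrt(1 + |gradient|²), a particular smooth cutoff that rises from 0 to 1 over an interval of length 0.5, and explicit Euler time stepping. All names are illustrative.

```python
import math


def f(u):
    x, y = u
    return x * x + y * y


def grad_f(u):
    x, y = u
    return (2.0 * x, 2.0 * y)


def bounded_field(u):
    """V(u) = grad f(u) / sqrt(1 + |grad f(u)|^2), so |V| < 1 (assumed normalization)."""
    gx, gy = grad_f(u)
    norm = math.sqrt(1.0 + gx * gx + gy * gy)
    return (gx / norm, gy / norm)


def cutoff(s, b, width=0.5):
    """Smooth psi: 0 on (-inf, b], increasing on (b, b + width), 1 afterwards."""
    t = (s - b) / width
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    rise, fall = math.exp(-1.0 / t), math.exp(-1.0 / (1.0 - t))
    return rise / (rise + fall)


def truncated_flow(u, b, total_time=20.0, dt=0.01):
    """Explicit Euler integration of u' = -psi(f(u)) V(u)."""
    for _ in range(int(total_time / dt)):
        vx, vy = bounded_field(u)
        psi = cutoff(f(u), b)
        u = (u[0] - dt * psi * vx, u[1] - dt * psi * vy)
    return u
```

A point that starts in the sublevel {f ≤ b} has zero velocity and never moves, while a point that starts above b descends and stalls just above level b, in line with the description of the truncated flow.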

After these preliminaries, let us consider again the
characterization of the critical level c appearing in the
mountain-pass theorem. This critical level was
obtained as the infimum over a certain class $\Gamma$ of
sets $\gamma$ (the curves with end points in different
components of $\{f < a\}$) of the maximum of f over $\gamma$.
But if we look back at the proof, we realize that the
fact that these sets were curves was not essential. The
important feature was that the negative-gradient flow
$\phi(t, \cdot)$ mapped a set of the class $\Gamma$ into a set still
belonging to the class $\Gamma$, for $t \geq 0$. This observation
leads to the following general minimax theorem, due
to Palais:


Theorem (General minimax). Let f be a twice
continuously differentiable function on a real
Hilbert space H, satisfying the Palais-Smale condition.
Let $\Gamma$ be a class of subsets of H which is
positively invariant under the action of the negative-gradient
flow $\phi$ of f (possibly truncated below level
b): that is, if the set $\gamma$ belongs to $\Gamma$, then the set $\phi(t, \gamma)$
belongs to $\Gamma$ for all $t \geq 0$. Then, if the number

$c := \inf_{\gamma \in \Gamma} \sup_{u \in \gamma} f(u)$

is finite (and larger than b), then c is a critical
value of f.


The proof goes along the same lines as the proof of
the mountain-pass theorem: if c is not a critical value
of f, the (possibly truncated) negative-gradient flow
$\phi(t_0, \cdot)$ pulls a sublevel $\{f < c + \epsilon\}$ down into the
sublevel $\{f \leq c - \epsilon\}$ (with $c - \epsilon > b$), for some large
$t_0$, by the Palais-Smale condition. Then we achieve a
contradiction choosing a set $\gamma \in \Gamma$ on which f does
not exceed $c + \epsilon$, and noticing that $\phi(t_0, \gamma)$ is a set
which still belongs to the class $\Gamma$, by positive
invariance, and on which f does not exceed $c - \epsilon$.

As we shall see in the last section, the possibility
of working with a truncated negative-gradient flow
(assuming in this case that c > b) makes the application
of this theorem easier. Again, an analogous
result holds for continuously differentiable functions
on Banach spaces, or more generally on Banach
manifolds with a complete Finsler structure.

Trivial classes $\Gamma$ are the class of all points in H,
and the class consisting of the single set H, yielding
the infimum and the supremum of f, respectively.
More interesting classes are constructed by fixing a 
topological space X and considering the images of 
all continuous maps h:X — H belonging to a 
certain relative homotopy class. 


Closed Geodesics on Compact Manifolds 


A typical application of the general minimax
theorem is Birkhoff's proof of the existence of a
closed geodesic on the sphere $S^2$, endowed with an
arbitrary metric g. Closed geodesics are precisely the
critical points of the energy functional


$S(x) = \frac{1}{2} \int_0^1 g(\dot{x}(t), \dot{x}(t)) \, dt$


on the Hilbert manifold $H^1(\mathbb{T}, S^2)$ consisting of all
one-periodic loops on $S^2$ of Sobolev regularity $H^1$
(here $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ denotes the circle parametrized by
[0, 1]). This functional satisfies the Palais-Smale
condition and it is bounded below, but its minima
are just the trivial constant loops, on which S = 0.
Let us use angle coordinates $(\theta, \varphi)$ on $S^2$, with
$-\pi/2 \leq \theta \leq \pi/2$ and $0 \leq \varphi < 2\pi$ ($\theta$ is the latitude, $\varphi$ the longitude).
A (suitably regular) map $h : S^2 \to S^2$ induces a
curve in $H^1(\mathbb{T}, S^2)$ parametrized by $\theta$: the value of
this curve at $\theta \in [-\pi/2, \pi/2]$ is the loop
$t \mapsto h(\theta, 2\pi t)$. It is a curve that joins two constant
loops. Let $\Gamma$ be the set of curves in $H^1(\mathbb{T}, S^2)$ which
are obtained by maps $h : S^2 \to S^2$ of topological
degree 1. This class is clearly positively invariant
under the action of the negative-gradient flow of
S (as of every homotopy fixing the constant loops).
If we can show that the minimax level


$c := \inf_{\gamma \in \Gamma} \sup_{x \in \gamma} S(x)$


is positive, we will get a positive critical value of S by
the general minimax theorem, hence a nontrivial
closed geodesic. By considering the fact that loops
with small energy also have a small diameter, it
is easy to construct a homotopy on $\{S < a\}$, for
some small a > 0, which shrinks every loop to a
point. If $h : S^2 \to S^2$ determines a curve $\gamma$ with
$\max_{x \in \gamma} S(x) < a$, composition with this homotopy
yields a homotopy of h to a map whose image is
a curve in $S^2$. A further homotopy then shows that
the map h is homotopic to a constant, which
is impossible if h has degree 1. This shows that
$c \geq a > 0$, concluding the proof.
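
The simplest element of this class is the family of latitude circles, obtained from the identity map of the sphere. The sketch below computes a discretized energy of these loops, under assumptions that are not in the text: the metric is the round one on the unit sphere in R³, and the energy is normalized as one half of the integral of the squared speed over [0, 1]. Function names and sample sizes are illustrative.

```python
import math


def latitude_loop(theta, n_points=400):
    """Sample the loop t -> (theta, 2*pi*t) on the unit sphere in R^3."""
    pts = []
    for k in range(n_points):
        phi = 2.0 * math.pi * k / n_points
        pts.append((math.cos(theta) * math.cos(phi),
                    math.cos(theta) * math.sin(phi),
                    math.sin(theta)))
    return pts


def energy(loop):
    """Discrete (1/2) * integral over [0, 1] of |x'(t)|^2 dt for a closed polygon."""
    n = len(loop)
    total = 0.0
    for k in range(n):
        p, q = loop[k], loop[(k + 1) % n]
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return 0.5 * n * total          # sum of |dx|^2 / dt with dt = 1/n, times 1/2


def sweepout_max(n_levels=181):
    """Largest energy among the latitude loops and the latitude where it occurs."""
    thetas = [-math.pi / 2 + math.pi * i / (n_levels - 1) for i in range(n_levels)]
    return max((energy(latitude_loop(th)), th) for th in thetas)
```

The family starts and ends at constant loops (the poles) with zero energy, and its maximum is attained at the equator, which is a closed geodesic of the round metric. The minimax level c is at most this maximum, since it is an infimum over all families in the class.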

Actually, Ljusternik and Fet have proved that 
every compact manifold M has a nontrivial closed 
geodesic. Indeed, if $M$ has nontrivial fundamental
group, it is enough to minimize $S$ on some nontrivial
homotopy class of loops. Otherwise, the fact that
$M$ is a compact manifold implies that some homotopy
group $\pi_{k+1}(M)$, $1 \le k < \dim M$, does not vanish.
A construction similar to the one described
above then allows one to associate with every
noncontractible map $h : S^{k+1} \to M$ a map
$u : (B^k, \partial B^k) \to (H^1(\mathbb{T}, M), \{S = 0\})$ which is not
homotopically trivial (here $B^k$ denotes the closed unit ball
in $\mathbb{R}^k$, and the notation means that $u$ maps the boundary
of the ball $B^k$ into the set of constant loops). Taking
a minimax over the set of images of the maps
$u$ associated with every noncontractible map
$h : S^{k+1} \to M$ yields the desired critical point of
$S$ with positive energy.

It is conjectured that every compact manifold has 
infinitely many closed geodesics. Morse theory
allows one to prove this fact for the vast majority of
manifolds, but not for the spheres. Bangert and
Franks have established the existence of infinitely
many closed geodesics on $S^2$ by proving that every
area-preserving homeomorphism of the open disk with
two fixed points must have infinitely many periodic 
points. Proving the existence of infinitely many 
closed geodesics on higher-dimensional spheres is a 
challenging open problem. 


A Rigidity Property of a Certain 
Class of Maps 


It is important that the class $\Gamma$ in the general
minimax theorem is only required to be invariant 
under the action of the negative-gradient flow, and 
not, say, under the action of any continuous 
homotopy on which the function f is nonincreasing. 
Indeed, too many undesirable things can be done on 
an infinite-dimensional Hilbert space by arbitrary 
continuous maps, whereas the maps arising from 
our negative-gradient flow might show some rigid- 
ity, forcing them to behave as maps on finite- 
dimensional spaces. 

Let us clarify this point by considering the follow- 
ing example, due to Benci and Rabinowitz. It may 
sound a bit artificial at this moment (simpler 
examples could be built), but we will find it useful 
in the next section. Assume that our Hilbert space $H$ is
endowed with an orthogonal splitting $H = H^- \oplus H^+$,
fix a unit vector $u^+$ in $H^+$, and consider the sets

$$S = \{ u \in H^+ \mid \|u\| = \rho \}$$
$$Q = \{ u + \lambda u^+ \mid u \in H^-, \ \|u\| \le \sigma, \ 0 \le \lambda \le \tau \}$$
$$\partial Q = \{ u + \lambda u^+ \in Q \mid \lambda \in \{0, \tau\} \text{ or } \|u\| = \sigma \}$$

for some positive numbers $\rho, \sigma, \tau$ such that $\tau > \rho$.
The latter inequality implies that the intersection
$Q \cap S$ is not empty (see Figure 1).

Figure 1 The sets $S$, $Q$, $\partial Q$.

If the linear subspace $H^-$ is finite dimensional, a
simple argument involving the topological degree
shows the following fact: the image of any continuous
map $h : Q \to H$ which is the identity on $\partial Q$ has
nonempty intersection with $S$.

When $H^-$ is infinite dimensional, this fact is
not true anymore. Indeed, it is not difficult to see
that the set $Q$ is homeomorphic to an infinite-dimensional
closed ball $B$, by a homeomorphism $\psi$
mapping $\partial Q$ onto the infinite-dimensional sphere
$\partial B$. If $B$ is the closed unit ball of an infinite-dimensional
Hilbert space, for instance, the space $\ell^2$ of all
square-summable sequences $(x_i)$ endowed with the
norm $|x|_2 = \left( \sum_{i=1}^{\infty} |x_i|^2 \right)^{1/2}$, the continuous map

$$g(x_1, x_2, x_3, \ldots) = \left( \sqrt{1 - |x|_2^2}, \ x_1, x_2, \ldots \right)$$

maps $B$ into $\partial B$ and is a shift operator on $\partial B$.
In particular, it is a continuous map on $B$ without
fixed points, and it can be used to define a map
$h : B \to \partial B$ which is the identity on $\partial B$, by setting

$$h(x) = \alpha(x) x + (1 - \alpha(x)) g(x), \quad \text{with } \alpha(x) \ge 1 \text{ such that } |h(x)|_2 = 1$$

Conjugation by the homeomorphism $\psi$ produces
a continuous map from $Q$ to $\partial Q$, which is the
identity on $\partial Q$, providing us with the desired
counterexample.
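
The retraction of the ball onto its sphere described above can be tried out on finitely supported sequences, which form a dense subspace of the sequence space and are mapped to finitely supported sequences by the shift. The following is only a sketch under that assumption: the function names `g`, `retract` and `norm` are illustrative, a sequence is stored as a Python list of its first coordinates, and the coefficient is obtained by solving the quadratic equation that forces the image to have norm 1.

```python
import math

def g(x):
    """Shift map g(x) = (sqrt(1 - |x|^2), x_1, x_2, ...) for |x| <= 1."""
    r2 = sum(t * t for t in x)
    # max(...) guards against tiny negative values caused by rounding
    return [math.sqrt(max(0.0, 1.0 - r2))] + list(x)

def retract(x):
    """Return (h(x), alpha) with h(x) = alpha*x + (1 - alpha)*g(x), |h(x)| = 1."""
    gx = g(x)
    xp = list(x) + [0.0]                  # pad x to the length of g(x)
    d = [a - b for a, b in zip(xp, gx)]   # x - g(x); nonzero, g has no fixed point
    gd = sum(a * b for a, b in zip(gx, d))
    dd = sum(a * a for a in d)
    alpha = -2.0 * gd / dd                # nonzero root of |g(x) + alpha*d|^2 = 1
    return [a + alpha * b for a, b in zip(gx, d)], alpha

def norm(x):
    return math.sqrt(sum(t * t for t in x))

if __name__ == "__main__":
    hx, alpha = retract([0.3, 0.4])       # an interior point of the ball
    print(norm(hx), alpha)
    print(retract([1.0, 0.0]))            # a boundary point is left fixed
```

Since $|g(x)| = 1$ and $|x| \le 1$, the root satisfies $\alpha(x) \ge 1$, with equality exactly on the sphere. No such construction is possible on a finite-dimensional ball, where every continuous self-map has a fixed point.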

In other terms, when $H^-$ is infinite dimensional,
the sets $\partial Q$ and $S$ can be unlinked by means of a
continuous map. The situation changes if we restrict
the class of maps $h : Q \to H$ to those of the form

$$h(u) = u + K(u) \qquad [3]$$

where $K$ is a continuous compact map. In this case,
indeed, the argument for a finite-dimensional $H^-$
can be applied, by replacing the topological degree
by the Leray-Schauder degree (which is invariant
precisely with respect to homotopies of the form
above), and one proves that $\partial Q$ and $S$ cannot be
unlinked by means of continuous maps of this form.


Closed Characteristics on Compact 
Energy Hypersurfaces 


Consider $\mathbb{R}^{2n}$ with coordinates $(p_1, \ldots, p_n, q_1, \ldots, q_n)$,
endowed with the standard symplectic form

$$\omega := dp \wedge dq = \sum_{j=1}^{n} dp_j \wedge dq_j$$

Let $\Sigma$ be a compact connected hypersurface in $\mathbb{R}^{2n}$.
The restriction of $\omega$ to the tangent space $T_x \Sigma$ has a
one-dimensional kernel, which varies smoothly with $x$.
In other words, there is a smooth line bundle

$$\mathcal{L}_\Sigma := \{ (x, u) \in T\Sigma \mid \omega(u, v) = 0 \ \forall v \in T_x \Sigma \}$$

over $\Sigma$. We wish to discuss the classical problem
of finding a closed characteristic for $\mathcal{L}_\Sigma$, that is,
a closed curve everywhere tangent to $\mathcal{L}_\Sigma$.

This geometric problem has a dynamical interpretation.
Indeed, let $H$ be a smooth real function on
$\mathbb{R}^{2n}$ such that $\Sigma$ is the inverse image of the regular
value 1. The function $H$, the Hamiltonian,
generates a vector field $X_H$ on $\mathbb{R}^{2n}$ by the formula

$$\omega(X_H(x), u) = -DH(x) u, \quad \forall u \in \mathbb{R}^{2n}$$

or, equivalently,

$$X_H(x) = J \nabla H(x), \quad \text{with } J = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}$$

The Hamiltonian vector field $X_H$ is tangent to $\Sigma$ and
belongs to $\mathcal{L}_\Sigma$. Therefore, the hypersurface $\Sigma$ is
invariant for the flow of $X_H$, and the flow orbits are
precisely the characteristics. So finding a closed
characteristic on $\Sigma$ is equivalent to finding a
periodic orbit of $X_H$ with energy $H = 1$.
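
The two properties of the Hamiltonian vector field used above (it is defined by contracting the symplectic form against minus the differential of the Hamiltonian, and it is tangent to the level sets) can be checked numerically in the simplest case of one degree of freedom. This is a minimal sketch under stated assumptions: the Hamiltonian $p^2/2 + q^4/4$ is an arbitrary illustrative choice, the function names are hypothetical, and the sign convention is the one in which the vector field is $J$ applied to the gradient with $J = [[0, -1], [1, 0]]$ in $(p, q)$ coordinates.

```python
def grad_H(p, q):
    """Gradient of the illustrative Hamiltonian H(p, q) = p**2/2 + q**4/4."""
    return (p, q ** 3)

def X_H(p, q):
    """Hamiltonian vector field J grad H, with J = [[0, -1], [1, 0]]."""
    hp, hq = grad_H(p, q)
    return (-hq, hp)

def omega(u, v):
    """Standard symplectic form dp ^ dq evaluated on two vectors of R^2."""
    return u[0] * v[1] - u[1] * v[0]

def defect(p, q, u):
    """omega(X_H(x), u) + DH(x)u, which should vanish for every u."""
    hp, hq = grad_H(p, q)
    return omega(X_H(p, q), u) + hp * u[0] + hq * u[1]

def tangency(p, q):
    """DH(x) X_H(x), which should vanish: X_H is tangent to level sets of H."""
    hp, hq = grad_H(p, q)
    xp, xq = X_H(p, q)
    return hp * xp + hq * xq

if __name__ == "__main__":
    print(defect(0.7, -1.3, (0.2, 0.5)), tangency(0.7, -1.3))
```

The second identity is what makes every regular level set invariant under the flow, so that flow lines on the hypersurface are exactly its characteristics.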

Up to changing the Hamiltonian, we may assume
that all the values in an interval $]1 - \delta_0, 1 + \delta_0[$ are
regular for $H$, and that the corresponding level sets
$\Sigma_\eta := \{H = \eta\}$ are all connected (hence diffeomorphic
to $\Sigma$). We would like to sketch Hofer and
Zehnder's proof of the fact that there is a dense set
of values $\eta \in \, ]1 - \delta_0, 1 + \delta_0[$ for which $\Sigma_\eta$ admits a
closed characteristic.

This proof is based on the fact that the one-periodic
orbits of $X_H$ are critical points of the action
functional

$$A_H(x) = \int_{\mathbb{T}} x^*(p \, dq - H \, dt) = \frac{1}{2} \int_0^1 \dot{x}(t) \cdot J x(t) \, dt - \int_0^1 H(x(t)) \, dt$$

on the space of loops $x : \mathbb{T} \to \mathbb{R}^{2n}$.

Clearly, it is enough to show that for every $\delta > 0$
there is a closed characteristic on some $\Sigma_\eta$ with
$|\eta - 1| < \delta$. We can take advantage of the fact that
we are free to change the Hamiltonian, as long as
it has the level sets $\Sigma_\eta$, $|\eta - 1| < \delta$. Denoting by $B$
the bounded component of the complement of
$\{1 - \delta < H < 1 + \delta\}$, we may assume that $B$ contains
the origin. We can modify $H$ in such a way
that $H$ vanishes identically on $B$, then it grows,
parametrizing all the hypersurfaces $\Sigma_\eta$, $|\eta - 1| < \delta$,
in a strictly increasing way, then it remains
constant in a large ball, and finally it smoothly
switches to the quadratic form $(3/2)\pi|x|^2$. By
choosing $H$ in this way, one can ensure that all
the constant orbits and all the one-periodic orbits
which do not lie on $\Sigma_\eta$ for some $|\eta - 1| < \delta$ have
non-positive action. So it is enough to prove that
the functional $A_H$ has a positive critical value.

Using the Fourier series decomposition

$$x(t) = \sum_{k \in \mathbb{Z}} e^{2\pi k J t} \hat{x}_k, \quad \hat{x}_k \in \mathbb{R}^{2n}$$

one sees that the quadratic part of the action
functional has the form

$$\frac{1}{2} \int_0^1 \dot{x}(t) \cdot J x(t) \, dt = \pi \sum_{k \in \mathbb{Z}} k \, |\hat{x}_k|^2 \qquad [4]$$

so it is positive on an infinite-dimensional linear
space, negative on an infinite-dimensional linear
space, and null on the $2n$-dimensional space spanned
by the constant loops. The specific form of [4]
suggests choosing as domain of the action functional
the Sobolev space $H^{1/2}(\mathbb{T}, \mathbb{R}^{2n})$, the space of
square-integrable one-periodic curves $x$ in $\mathbb{R}^{2n}$ with

$$\|x\|_{1/2}^2 := |\hat{x}_0|^2 + 2\pi \sum_{k \in \mathbb{Z}} |k| \, |\hat{x}_k|^2 < +\infty$$

This is indeed a Hilbert norm on $H^{1/2}(\mathbb{T}, \mathbb{R}^{2n})$. The
functional $A_H$ is smooth on this space, and its
gradient takes the form

$$\nabla A_H(x) = Lx + K(x) \qquad [5]$$


where L is the self-adjoint Fredholm operator 
representing the quadratic form [4] with respect to
the $H^{1/2}$-Hilbert product, and $K$ is a compact map.
A gradient of the form [5] again implies that 
bounded Palais-Smale sequences are compact. The 
Palais-Smale condition then follows from the fact 
that the Hamiltonian H is quadratic outside a large 
ball, and has no one-periodic orbits there (the large 
orbits are all periodic, but their period is 2/3). 
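
The Fourier expression [4] for the quadratic part of the action can be checked numerically on a trigonometric polynomial with values in the plane (one degree of freedom). The sketch below rests on stated assumptions: $J = [[0, -1], [1, 0]]$, so that the exponential of $\theta J$ is the rotation by the angle $\theta$; the quadratic part is taken to be one half of the integral of $\dot{x} \cdot Jx$; and the helper names and sample coefficients are purely illustrative. A uniform Riemann sum is used because it integrates trigonometric polynomials of low degree exactly up to rounding.

```python
import math

def rot(theta, v):
    """Apply exp(theta*J) to v in R^2, where J = [[0, -1], [1, 0]]."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def J(v):
    return (-v[1], v[0])

def quadratic_action(coeffs, samples=256):
    """Riemann sum for (1/2) * integral over [0, 1] of x'(t) . J x(t) dt,
    where x(t) = sum_k exp(2*pi*k*J*t) coeffs[k]."""
    total = 0.0
    for i in range(samples):
        t = i / samples
        x = [0.0, 0.0]
        dx = [0.0, 0.0]
        for k, v in coeffs.items():
            w = rot(2.0 * math.pi * k * t, v)
            jw = J(w)                      # derivative of the k-th mode is 2*pi*k*J*w
            x[0] += w[0]
            x[1] += w[1]
            dx[0] += 2.0 * math.pi * k * jw[0]
            dx[1] += 2.0 * math.pi * k * jw[1]
        jx = J(x)
        total += 0.5 * (dx[0] * jx[0] + dx[1] * jx[1])
    return total / samples

def fourier_formula(coeffs):
    """Right-hand side of [4]: pi * sum_k k * |x_k|^2."""
    return math.pi * sum(k * (v[0] ** 2 + v[1] ** 2) for k, v in coeffs.items())

if __name__ == "__main__":
    coeffs = {-2: (1.0, 0.5), 0: (3.0, -1.0), 1: (0.2, 0.7), 3: (-1.0, 1.0)}
    print(quadratic_action(coeffs), fourier_formula(coeffs))
```

Modes with positive index contribute positively, modes with negative index negatively, and the constant mode not at all, which is the sign pattern that motivates the choice of the space with half a derivative.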

Consider the splitting $H^{1/2}(\mathbb{T}, \mathbb{R}^{2n}) = H^- \oplus H^+$,
with

$$H^- = \{ x \mid \hat{x}_k = 0 \text{ for } k > 0 \}$$
$$H^+ = \{ x \mid \hat{x}_k = 0 \text{ for } k \le 0 \}$$

Let $S$, $Q$, and $\partial Q$ be the sets defined in the previous
section, with

$$u^+(t) = \frac{1}{\sqrt{2\pi}} e^{2\pi J t} u_0, \quad u_0 \in \mathbb{R}^{2n}, \ |u_0| = 1$$

and constants $\rho, \sigma, \tau$ to be determined. Since the
quadratic form [4] is positive on $H^+$ and the
Hamiltonian $H$ vanishes near the origin, we can
find a small $\rho > 0$ such that

$$\inf_{x \in S} A_H(x) > 0$$

The fact that the quadratic form [4] is seminegative
on $H^-$ and the behavior of $H(x)$ for large $|x|$ imply
that if $\sigma$ and $\tau$ are suitably large (in particular
$\tau > \rho$), then

$$\sup_{x \in \partial Q} A_H(x) \le 0$$


Let $\Gamma$ be the set of all images of maps

$$h : Q \to H^{1/2}(\mathbb{T}, \mathbb{R}^{2n})$$

which are the identity on $\partial Q$ and are of the form

$$h(x) = e^{\theta(x) L} x + K(x) \qquad [6]$$

with $\theta$ a continuous real-valued function, and $K$ a
continuous compact map. This class of maps is more
general than the one considered in the previous
section, but the fact that $e^{\theta L}$ commutes with the
projections onto $H^-$ and $H^+$ ensures that $\partial Q$ and
$S$ cannot be unlinked even inside this class. Therefore,
any $\gamma \in \Gamma$ has nonempty intersection with $S$, so

$$c := \inf_{\gamma \in \Gamma} \sup_{x \in \gamma} A_H(x) \ge \inf_{x \in S} A_H(x) > 0$$

We would like to apply the general minimax
theorem, and conclude that $c$ is the desired positive
critical value.

The number $c$ being clearly finite, it is enough to
show that $\Gamma$ is positively invariant under the action
of the negative-gradient flow $\phi$ of $A_H$, truncated
below level 0. Let $\gamma = h(Q) \in \Gamma$ and $t \ge 0$. Then
$\phi(t, \gamma)$ is the image of $Q$ by the map $\phi(t, h(\cdot))$. This
map is the identity on $\partial Q$ because $\partial Q$ lies in
$\{A_H \le 0\}$ and $\phi$ is truncated below level 0. It is of
the form [6] because by [5] the truncated negative-gradient
flow of $A_H$ has the form

$$\phi(t, x) = e^{-\theta(t, x) L} x + K(t, x)$$

for some continuous function $0 \le \theta(t, x) \le t$ and for
some continuous compact map $K$. This concludes
the proof.

This result was refined by Struwe, who proved the
existence of a closed characteristic on $\Sigma_\eta$ for almost
every $\eta$, in the sense of the Lebesgue measure. We
could try to use the abundance of closed characteristics
on energy levels near $\Sigma$ to get the existence of one on
$\Sigma$ by taking a limit. But this process produces a closed
characteristic on $\Sigma$ only if we can bound the periods of
the approximating closed orbits, otherwise a more
general invariant set results. Actually, Ginzburg, Herman,
and Gürel have produced examples of compact
hypersurfaces without any closed characteristic.

As conjectured by Weinstein and proved by
Viterbo, closed characteristics always exist on
contact-type compact hypersurfaces (i.e., hypersurfaces
$\Sigma$ on which the restriction of $\omega$ is the
differential of a 1-form $\lambda$ such that
$\lambda \wedge d\lambda \wedge \cdots \wedge d\lambda$ is a volume form). In this case, one should even
expect a multiplicity result. For hypersurfaces which
bound a strictly convex set in $\mathbb{R}^{2n}$, for instance, the
existence of $n$ closed characteristics is conjectured.
The best result so far is due to Long, who could
prove the existence of $[n/2] + 1$ of them. Hofer,
Wysocki, and Zehnder have proved that, when $n = 2$,
there are either two or infinitely many closed
characteristics (for a generic contact-type hypersurface
diffeomorphic to $S^3$), by using the already
mentioned theorem by Franks on periodic points of
area-preserving homeomorphisms of the disk. Proving
an analogous result for $n \ge 3$ is an intriguing
open problem.


See also: Contact Manifolds; Floer Homology;
Hamilton-Jacobi Equations and Dynamical Systems:
Variational Aspects; Image Processing: Mathematics;
Inequalities in Sobolev Spaces; Leray-Schauder Theory
and Mapping Degree; Ljusternik-Schnirelman Theory;
Saddle Point Problems.


Introduction


Mirror symmetry was discovered in the late 1980s
by physicists studying superconformal field theories
(SCFTs). One way to produce SCFTs is from closed
string theory; in the Riemannian (rather than
Lorentzian) theory the string's world sheet gives a
map of a Riemannian 2-manifold into the target
with an action which is conformally invariant, so
the 2-manifold can be thought of as a Riemann
surface with a complex structure. Making sense of
the infinities in the quantum theory (supersymmetry
and anomaly cancellation) forces the target to be
10-dimensional (Minkowski space times a
6-manifold $X$) and $X$ to be (to first order) Ricci
flat and so to have holonomy in $SU(3)$. That is, $X$ is a
Calabi-Yau 3-fold $(X, \Omega, \omega)$. So SCFTs come from
$\sigma$-models (mapping Riemann surfaces into Calabi-Yau
3-folds) but, it turns out, in two different ways: the
A-model and the B-model. Deformations of the SCFT
and either $\sigma$-model are isomorphic, so over an open set
the two coincide. Thus, it was natural to conjecture
that almost all of the relevant SCFTs came from
geometry, that is, from an A or B $\sigma$-model. In particular,
the A-model of a Calabi-Yau $X$ should, therefore,
give the same SCFT as the B-model on another
Calabi-Yau $\check{X}$. It turns out then that the A-model
on $\check{X}$ should also be isomorphic to the B-model on
$X$; thus, mirror symmetry should give an involution
on Calabi-Yau 3-folds. (The full picture is
slightly more complicated: it involves large
complex structure limits, multiple mirrors and
flops.) By studying the SCFTs, Greene and Plesser
predicted the mirror of the simplest Calabi-Yau
3-fold, the quintic in $\mathbb{P}^4$, and mirror symmetry
was born.

Topological observables, that is, certain path 
integrals over the space of all maps, can be 
calculated by the semiclassical approximation as 
integrals over the space of classical minima — (anti) 
holomorphic curves in the Calabi-Yau (these mini- 
mize volume in a fixed homology class). From the 
zero homology class we get the constant maps — 
points in X — and so integrals over X. In some cases, 
by Poincaré duality, these can be thought of as 
intersections of cycles; we think of the string world 
sheet lying at a point of intersection. When the 
world sheet has a nontrivial homology class, it 
allows more general “intersections” where the cycles 
need not intersect but are connected by a 
holomorphic curve, giving a perturbation of the 
usual intersection product on cohomology called 
quantum cohomology. Namely, there is a contribution
$(a.\beta)(b.\beta)(c.\beta) \, e^{-\int_\beta \omega}$ to the quantum triple
product $a.b.c$ of three 4-cycles $a, b, c \in H^{1,1} \cong H^2 \cong H_4$
from each holomorphic curve $\beta$ (of genus 0, in
the 0-loop approximation to the physics) in $X$ of
area $\int_\beta \omega$ (where $\omega$ is the Kähler form). The
A-model correlation functions can be determined 
from these data; the B-model computation involves 
no such quantum correction and can be computed 
purely in terms of integrals over cycles (“periods”) 
and their derivatives (discussed in the next section). 
So it is in some sense easier and, in a historic tour- 
de-force, was calculated by Candelas et al. (1991) 
for the Greene—Plesser mirror of the quintic. 
Comparing with the A-model computation on 
the quintic gave remarkable predictions about the 
number of holomorphic rational curves on the 
quintic. These were way beyond mathematical 
capabilities at the time, and sparked enormous 
mathematical interest. The predictions (and more) 
have now been proved to be true by Givental and 
Lian-Liu-Yau, while mirror symmetry has begun to 
be understood geometrically. But, in some sense, 
the mathematical reason for the relationship 
between the Yukawa couplings and the quantum 
cohomology of the mirror is still a little mysterious; 
it is the hardest part of mirror symmetry to see in the 


geometry, yet for the physics it was the easiest and the 
first prediction. 

We survey, nonchronologically, some of the 
geometry of mirror symmetry as it is now under- 
stood, mainly in dimension n=3. For the many 
topics omitted, the reader should consult the Further 
Reading section. 


The Geometric Setup 


A Calabi-Yau 3-fold $(X, \Omega, \omega)$ is a Kähler manifold
$(X, \omega)$ with a holomorphic trivialization $\Omega$ of its
canonical bundle

$$K_X = \Lambda^3 T^*X$$

(i.e., a nowhere-vanishing holomorphic volume form,
locally $dz_1 \wedge dz_2 \wedge dz_3$), and $b_1(X) = 0$. It follows that
the Hodge numbers $h^{0,2}, h^{0,1}$ vanish, and so
$H^2(X, \mathbb{C}) = H^{1,1}$ and $H^3(X, \mathbb{R}) \cong H^{2,1} \oplus H^{3,0}$. By
Yau's theorem the Kähler metric can be changed
within its $H^2(X, \mathbb{R})$ cohomology class to a unique
Ricci-flat Kähler metric; equivalently, $\Omega$ is parallel, so
the induced metric on $K_X$ is flat. Roughly speaking,
mirror symmetry swaps the symplectic or Kähler
structure $\omega$ on $X$ with the complex structure (encoded
in $\Omega$, up to scaling by $\mathbb{C}^*$) on the (conjectural) mirror
$\check{X}$. Kähler deformations are unobstructed, forming an
open set $\mathcal{K}_X$ in $H^2(X, \mathbb{R})$. Its closure $\overline{\mathcal{K}}_X$ is sometimes
extended by adding the Kähler cones of all birational
models of $X$ to give Kawamata's movable cone. This is
because the work of Aspinwall, Greene, Morrison, and
Witten suggested that all birational models of $X$ are
indistinguishable in string theory and so are all mirrors
of $\check{X}$, corresponding to a different choice of (1, 1)-form
$\omega$ which is a Kähler form on one model only. $\mathcal{K}_X$ is also
complexified by including in the A-model data any
"B-field" $B \in H^2(X, \mathbb{R}/\mathbb{Z})$, and divided by holomorphic
automorphisms of $X$, to give a moduli space
of complex dimension $h^{1,1}(X)$. Deformations of
complex structure are also unobstructed by the
nontrivial Bogomolov-Tian-Todorov theorem; thus,
they form a smooth space with tangent space

$$H^1(TX) \cong H^1(\Lambda^2 T^*X) \cong H^{2,1}(X)$$


(Given a deformation of complex structure, the
above isomorphism takes the $H^{2,1}$-component of the
derivative of the (3,0)-form $\Omega$.) So, for the moduli
spaces to match up, we get the first and simplest
prediction of mirror symmetry:

$$h^{1,1}(X) = h^{2,1}(\check{X}) \quad \text{and} \quad h^{2,1}(X) = h^{1,1}(\check{X}) \qquad [1]$$

This is where mirror symmetry gets its name, the
above relation making the Hodge diamonds of $X$
and $\check{X}$ mirror images of each other.
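
The reflection of Hodge diamonds in prediction [1] can be made concrete with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, a diamond is stored as a 4 by 4 table indexed by $(p, q)$, and the only geometric inputs are the vanishing of $h^{0,1}$ and $h^{0,2}$ stated above together with the standard Hodge symmetries. The sample values $h^{1,1} = 1$, $h^{2,1} = 101$ are the well-known Hodge numbers of the quintic 3-fold.

```python
def hodge_diamond(h11, h21):
    """Hodge numbers h[p][q] of a Calabi-Yau 3-fold with b_1 = 0."""
    h = [[0] * 4 for _ in range(4)]
    h[0][0] = h[3][3] = 1      # H^0 and H^6
    h[3][0] = h[0][3] = 1      # the holomorphic volume form and its conjugate
    h[1][1] = h[2][2] = h11    # Kahler classes and their Poincare duals
    h[2][1] = h[1][2] = h21    # complex structure deformations and conjugates
    return h

def mirror(h):
    """Reflect the diamond: h^{p,q} of the mirror is h^{3-p,q} of the original."""
    return [[h[3 - p][q] for q in range(4)] for p in range(4)]

def euler_characteristic(h):
    return sum((-1) ** (p + q) * h[p][q] for p in range(4) for q in range(4))

if __name__ == "__main__":
    quintic = hodge_diamond(1, 101)
    print(mirror(quintic) == hodge_diamond(101, 1))
    print(euler_characteristic(quintic), euler_characteristic(mirror(quintic)))
```

The reflection exchanges $h^{1,1}$ with $h^{2,1}$ and therefore changes the sign of the Euler characteristic, which equals $2(h^{1,1} - h^{2,1})$.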


As the complexified Kähler cone is a tube
domain, it has natural partial complex compactifications
(due to Looijenga, and suggested in the
context of mirror symmetry by Morrison (1993)).
The simplest case is where we ignore the movable
cone and automorphisms and assume that there is
an integral basis $e_1, \ldots, e_n$ of both $\mathcal{K}_X$ and
$H^2(X, \mathbb{Z})/\text{torsion}$. The complexified Kähler moduli
space is then

$$\mathcal{K}_X^{\mathbb{C}} := H^2(X, \mathbb{R})/H^2(X, \mathbb{Z}) + i\mathcal{K}_X = \{ B + i\omega \}$$

with natural coordinates $x_j$, $y_j > 0$ pulled back from
the first and second factors, respectively, induced by
the $e_j$. $x_j$ is multivalued with integer periods, so

$$z_j = \exp(2\pi i (x_j + i y_j)) \qquad [2]$$

is a well-defined holomorphic coordinate, giving an
isomorphism to the product of punctured unit
disks in $\mathbb{C}$:

$$\mathcal{K}_X^{\mathbb{C}} \cong (\Delta^*)^n = \{ (z_j) : 0 < |z_j| < 1 \} \subset (\mathbb{C}^*)^n$$

The compactification $\Delta^n$ comes from adding in the
origins in the disks, which we reach by going to
infinity (in various directions) in $\mathcal{K}_X$. We call the
point $(0, \ldots, 0) \in \Delta^n$ the large Kähler limit point
(LKLP) in this case. Moving along the ray generated
by $\sum k_i e_i \in \mathcal{K}_X$, $k_i > 0$, complexifies in the holomorphic
structure [2] to give the analytic curve

$$z_1^{1/k_1} = z_2^{1/k_2} = \cdots = z_n^{1/k_n} \qquad [3]$$

in $\mathcal{K}_X^{\mathbb{C}}$. For $k_i \in \mathbb{Q}$ $\forall i$, this extends to a complete
curve in the compactification. Without loss of
generality, we can assume that $k_i$ are integers with
no common factor; then the link of the curve winds
around the LKLP $(0, \ldots, 0) \in \Delta^n$ with winding
number

$$(k_1, \ldots, k_n) \in \pi_1\big( H^2(X, \mathbb{R})/H^2(X, \mathbb{Z}) + i\mathcal{K}_X \big) = H^2(X, \mathbb{Z}) = \mathbb{Z}.e_1 \oplus \cdots \oplus \mathbb{Z}.e_n$$

This is because multiplying the ray $\mathbb{R}.\sum k_i e_i \subset \mathcal{K}_X$
by $i$ gives the direction $\mathbb{R}.\sum k_i e_i$ in the space
$H^2(X, \mathbb{R})/H^2(X, \mathbb{Z})$ of B-fields, with the given
winding number. For $k_i$ not rational we get an
analytic mess; the direction in the space of B-fields
does not close up to give a circle.
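
The behaviour of the exponential coordinates [2] is easy to probe numerically. The following sketch uses hypothetical helper names and arbitrary sample values; it checks that shifting the B-field component by an integer does not change the coordinate, that a positive Kähler component gives a point of the punctured unit disk, and that on the complexified ray with integer weights every coordinate is the corresponding power of a single parameter, which is the content of the analytic curve obtained from a rational ray.

```python
import cmath
import math

def z_coord(x, y):
    """Coordinate z = exp(2*pi*i*(x + i*y)) for B-field part x and Kahler part y."""
    return cmath.exp(2j * math.pi * complex(x, y))

def ray_point(ks, w):
    """Point of the complexified ray: x_j + i*y_j = k_j * w, w in the upper half-plane."""
    return [z_coord(k * w.real, k * w.imag) for k in ks]

if __name__ == "__main__":
    print(abs(z_coord(0.3, 0.5) - z_coord(1.3, 0.5)))   # periodicity in x
    print(abs(z_coord(0.3, 0.5)))                        # modulus exp(-2*pi*y) < 1
    ks = [2, 3]
    zeta = z_coord(0.1, 0.4)
    print(ray_point(ks, complex(0.1, 0.4)), [zeta ** k for k in ks])
    print([abs(z) for z in ray_point(ks, complex(0.1, 5.0))])  # near the origin
```

As the real part of the parameter runs once around the unit interval, the coordinate with weight $k_j$ winds $k_j$ times around the origin, which is the winding number recorded above.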

There is no obvious mirror to these rays since we
consider $\Omega$ only up to scale. So, mirror symmetry
predicts an isomorphism between $\mathcal{K}_X^{\mathbb{C}}$ and the
moduli space $\mathcal{M}_{\check{X}}$ of complex structures on $\check{X}$, and
a distinguished limit in $\mathcal{M}_{\check{X}}$, the large complex
structure limit point (LCLP), the mirror of the LKLP
$(0, \ldots, 0) \in \Delta^n$ above. Morrison has given a rigorous
definition of LCLPs and the canonical coordinates
on $\mathcal{M}_{\check{X}}$ dual to the $z_j$ on $\mathcal{K}_X^{\mathbb{C}}$; see the section
Monodromy around the LCLP. The holomorphic
curves in $(\Delta^*)^n$ described above, corresponding to
rational rays of Kähler forms, give degenerations of
(the complex structure on) $\check{X}$ to the LCLP whose
monodromy is discussed in this article (see "Lagrangian
Torus Fibrations").

LCLPs play a vital role in mirror symmetry; in 
fact, mirror symmetry is really a statement about 
LCLPs and families of Calabi-Yau manifolds near 
LCLPs. Most predictions only really hold near or at
the LCLP, and the complex structure moduli space
only looks like $(\Delta^*)^n$ near the LCLP. For instance,
manifolds can have many LCLPs and accordingly
many mirrors. This also explains one obvious
paradox: that rigid Calabi-Yau manifolds, those
with no complex structure deformations, $h^{2,1} = 0$,
and so no LCLP, can have no mirror, since a Kähler
(or symplectic) manifold has $h^{1,1} \ne 0$.

The first predicted refinement of [1] is, as
discussed in the introduction, that the variation of
Hodge structure (VHS) on $\check{X}$ should be describable
in terms of Gromov-Witten invariants of $X$. Here
VHS is governed by how the ray $\mathbb{C}.\Omega_t = H^{3,0}(\check{X}_t)$
sits inside $H^3(\check{X}_t, \mathbb{C})$ as the complex structure on $\check{X}_t$
varies, parametrized by $t \in \mathcal{M}_{\check{X}}$. By Poincaré
duality, it is sufficient to know how $\Omega_t$ pairs with
$H_3(\check{X})$, that is, to compute the period integrals

$$\int_{A_i} \Omega_t$$

where $A_i$ form a basis of $H_3(\check{X}, \mathbb{Z})$. (In fact we can
choose the $A_i$ to be a symplectic basis, $A_i . A_j = \delta_{i+k, j}$,
and then knowledge of only the periods of the first $k$
$A_i$ suffices, locally in moduli space.) These periods
determine $\Omega_t$ and so the Yukawa coupling

$$S^3 H^1(T\check{X}) \to H^3(\Lambda^3 T\check{X}) \xrightarrow{\ \Omega_t^2\ } H^3(K_{\check{X}}) \cong \mathbb{C}$$

On $X$, we get the cubic form on $H^2(X)$ described
earlier in terms of numbers of rational curves in $X$.
These numbers are in fact independent of the
almost-complex structure on $X$ (as long as it is
compatible with the symplectic form $\omega$), and, therefore,
give the symplectic invariants of Gromov
and Witten. The cubic form depends on $\omega = \omega_t$,
as it moves in $\mathcal{K}_X$ (or in $\mathcal{K}_X^{\mathbb{C}}$, replacing $\omega_t$ by
$-i(B + i\omega_t)$). Under the predicted local isomorphism
$\mathcal{K}_X^{\mathbb{C}} \cong \mathcal{M}_{\check{X}}$ near the LKLP and LCLP, the equality of
these cubic forms gives the predictions of numbers of
rational curves in $X$ mentioned in the introduction.
This has been carried out, and the predictions
checked rigorously, in quite some generality, for
instance for mirror pairs produced by Batyrev's toric
methods.

There is, of course, a flat connection, the Gauss-Manin
connection, on the bundle over $\mathcal{M}_{\check{X}}$ with
fiber $H^3(\check{X}_t, \mathbb{C})$ over $t \in \mathcal{M}_{\check{X}}$, given by the local
system $H^3(\check{X}_t, \mathbb{Z}) \subset H^3(\check{X}_t, \mathbb{C})$. As mirror to this,
Dubrovin has shown how to put a flat connection 
on the bundles with fibers H^(X;) and H*'"(X,) using 
Gromov-Witten invariants.


Homological Mirror Symmetry 


Building on the work of Witten, Kontsevich (1995) 
proposed a remarkable conjecture that purported to 
explain mirror symmetry, all the more surprising 
because it appeared to have little to do with what 
was thought to be mirror symmetry at the time. The 
conjecture is now reasonably well understood, while 
the link to Gromov-Witten invariants and Yukawa
couplings is more mysterious, although it is known 
how both data should be encoded in the conjecture. 

Kontsevich proposed that mirror symmetry should
be explained by a (noncanonical) equivalence of
triangulated categories between the derived Fukaya
category $D^F(X)$ of $(X, \omega)$ and the bounded derived
category of coherent sheaves $D^b(\check{X})$ on its mirror $\check{X}$.
This second category consists of chain complexes of
holomorphic bundles, with quasi-isomorphisms
(maps of chain complexes which induce isomorphisms
on cohomology) formally inverted, that
is, decreed to be isomorphisms. For zero B-field
the first category should be constructed from
Lagrangian submanifolds $L \subset X$ carrying flat unitary
connections $A$. That is, $L$ is middle- (three-)
dimensional, and

$$\omega|_L = 0, \quad F_A = 0$$

For $B \ne 0$, this needs modifying to $F_A + 2\pi i B.\mathrm{id} = 0$
(so, in particular, we require that $L$ satisfies
$[B|_L] = 0 \in H^2(L, \mathbb{R}/\mathbb{Z})$). There are also various
technical conditions such as the choice of a relative
spin structure, the Maslov class of $L$ must vanish
(i.e. the map $(\Omega|_L / \mathrm{vol}_L) : L \to \mathbb{C}^*$ has winding
number zero) and we pick a grading on $L$
(a choice of logarithm of this map). Morphisms are
defined by Floer cohomology $HF^*$ of Lagrangian
submanifolds; roughly speaking, this assigns a vector
space to each intersection point (the homomorphisms
between the fibers of the two unitary bundles
carried by the Lagrangians at this point), made into
a chain complex by a certain counting of holomorphic
disks between intersection points. In-depth
work by Fukaya-Oh-Ohta-Ono shows that this
gives the structure of an $A_\infty$-category which can
then be "derived" into a triangulated category in a
formal way by taking "twisted cochains." The
construction is still very technical and difficult to
calculate with, but the key points are that we get a
category depending only on the symplectic structure,
that certain "unobstructed" Lagrangian submanifolds
give objects of this category, and that
Hamiltonian isotopic unobstructed Lagrangian
submanifolds give isomorphic objects.

Since the introduction of D-branes there is a 
physical interpretation of this conjecture in terms of 
open string theory; the objects of the two categories 
are boundary conditions for open strings, and 
morphisms correspond to strings beginning on one 
object and ending on the other. So, for instance, 
intersections of Lagrangians give morphisms corre- 
sponding to constant strings at the intersection 
point, while the Floer differential gives instanton 
tunneling corrections. 

One paradox this formulation immediately sheds 
light on concerns automorphisms on both sides of 
mirror symmetry. While symplectomorphisms of 
$(X, \omega)$ are abundant, there are few holomorphic
automorphisms of a Calabi-Yau $\check{X}$. The former
induce autoequivalences of $D^F(X)$; Kontsevich's
suggestion is that as a mirror to this there should
be an autoequivalence of $D^b(\check{X})$; this need not be
induced by an automorphism of $\check{X}$. Motivated by
this, groups of autoequivalences of derived categories
of sheaves of Calabi-Yau manifolds have
now been found that were predicted by mirror
symmetry; a few are mentioned below. Thus,
homological mirror symmetry suggests that an
SCFT is equivalent to a triangulated category,
and the ambiguities in geometrizing an SCFT
(finding a Calabi-Yau of which it is a $\sigma$-model)
are seen in the category: not all automorphisms
come from an automorphism of a Calabi-Yau
(e.g., Calabi-Yau manifolds $\check{X}$ with equivalent
derived categories give multiple mirrors to $X$),
and not all appropriate categories need even come
from a Calabi-Yau. Supporting this suggestion,
Bondal-Orlov and Bridgeland have shown that
indeed birational Calabi-Yau manifolds $\check{X}$ have
equivalent derived categories.

Finally, Kontsevich explained how deformation 
theory of the categories should involve derived 
morphisms on the product from the diagonal 
(thought of as a Lagrangian in the A-model, its 
structure sheaf as a coherent sheaf in the B-model) 
to itself, giving quantum cohomology in the 
A-model and Hodge structure in the B-model. For 
instance, the holomorphic disks used to compute the 
Floer cohomology of the diagonal on the product 
X x X give holomorphic rational curves on X. So, 
one should be able to see some parts of “classical” 
mirror symmetry. 


Mirror Symmetry: A Geometric Survey 443 


Below, as we describe more of the geometry of 
mirror symmetry that has emerged since Kontse- 
vich's conjecture, we will mention at each stage how 
his conjecture fits in with it. 


The Strominger-Yau-Zaslow Conjecture 


To recover more geometry from Kontsevich's conjecture,
there are some obvious objects of $D^b(\check{X})$
that reflect the geometry of $\check{X}$: the structure sheaves
$\mathcal{O}_p$ of points $p \in \check{X}$. Calculating their self-Homs,
$\mathrm{Ext}^*(\mathcal{O}_p, \mathcal{O}_p) \cong \Lambda^* T_p \check{X} \cong \Lambda^* \mathbb{C}^3 \cong H^*(T^3, \mathbb{C})$, shows
that if they are mirror to Lagrangians $L$ in $X$ (with
flat connections $A$ on them) then we must have

$$HF^*((L, A), (L, A)) \cong H^*(T^3, \mathbb{C})$$

as graded vector spaces. Since the left-hand side is,
modulo instanton corrections, $H^*(L, \mathbb{C})^{\oplus r^2}$, where $r$ is
the rank of the bundle carried by $L$, this suggests
that the mirror should be $L = T^3$ with a flat $U(1)$
connection $A$ over it. There are reasons why the
Floer cohomology of such an object should not be
quantum corrected, and so be isomorphic to
$\mathrm{Ext}^*(\mathcal{O}_p, \mathcal{O}_p)$.

For any Lagrangian $L$, the symplectic form gives
an isomorphism between $T^*L$ and its normal bundle
$N_L$; thus, Lagrangian tori have trivial normal
bundles, and locally one can fiber $X$ by them.
Thus, one might hope that $X$ is fibered by
Lagrangian tori, and the mirror $\check{X}$ is (at least over
the locus of smooth tori) the dual fibration. This is
because the set of flat $U(1)$ connections on a torus is
naturally the dual torus.

This is the kind of philosophy that led to 
the Strominger-Yau-Zaslow (SYZ) conjecture
(Strominger et al. 1996), although Strominger et al. 
were working with physical D-branes, and not 
Kontsevich's conjecture. Therefore, their D-branes 
are not the “topological D-branes" of Kontsevich, 
but those minimizing some action. That is, instead 
of holomorphic bundles in the B-model, we deal 
with bundles with a compatible connection 
satisfying an elliptic partial differential equation 
(PDE) (e.g., the Hermitian-Yang-Mills equations 
(HYM), or some perturbation thereof); instead of 
Lagrangian submanifolds up to Hamiltonian isotopy 
in the A-model, we consider special Lagrangians 
(sLags) (see eqn [5]). The SYZ conjecture is that a 
Calabi-Yau X should admit a sLag torus fibration, 
and that the mirror $\check{X}$ should admit a fibration
which is dual, in some sense. 

A sLag is a Lagrangian submanifold of a Calabi-Yau
manifold $X$ satisfying the further equation that
the unit norm complex function (phase)

$$\frac{\Omega|_L}{\mathrm{vol}_L} = e^{i\theta} = \text{constant} \qquad [5]$$

(So, sLags have Maslov class zero, in particular.)
This equation uses the complex structure on $X$ as
well as the symplectic structure, and the resulting
Ricci-flat metric of Yau, to define a metric on $L$ and
so its Riemannian volume form $\mathrm{vol}_L$. SLags are
calibrated by $\mathrm{Re}(e^{-i\theta}\Omega)$ and so minimize volume in
their homology class. This is similar to the HYM
equations on the mirror $\check{X}$, which are defined on
holomorphic bundles on the complex manifold $\check{X}$
via a Kähler form $\omega$, and minimize the Yang-Mills
action. The Donaldson-Uhlenbeck-Yau theorem
states that for holomorphic bundles that are
polystable (defined using $[\omega]$, this is true for the
generic bundle), there is a unique compatible
HYM connection. Thus, modulo stability, HYM 
connections are in one-to-one correspondence with 
holomorphic bundles. A similar correspondence is 
conjectured, and proved in some special cases, by 
Thomas and Yau, for (special) Lagrangians: that 
modulo issues of stability (which can be formulated 
precisely), sLags are in one-to-one correspondence 
with Lagrangian submanifolds up to Hamiltonian 
isotopy. That is, there should be a unique sLag in 
the Hamiltonian isotopy class of a Lagrangian if and 
only if it is stable. Currently, only the uniqueness 
part of this conjecture has been worked out, but, in 
principle at least, we do not lose much by consider- 
ing only Lagrangian torus fibrations. 

The SYZ conjecture is thought to hold only near
the LCLPs and LKLPs of X and $\check{X}$; away from these,
the sLag fibers may start to cross. According to Joyce,
the discriminant locus of the fibration on X is
expected to be a codimension-one ribbon graph in a
base $S^3$ near the limit points, while the discriminant
locus of the dual fibration on $\check{X}$ may be different; that
is, the smooth parts of the fibration and its dual are
compactified in different ways. In the limit of moving
to the limit points, however, both discriminant loci
shrink onto the same codimension-two graph. In this
limit, the fibers shrink to zero size, so that X (with its
Ricci-flat metric) tends, in the Gromov-Hausdorff
sense, to its base $S^3$ (with a singular metric). This
formal picture has been made precise in two 
dimensions, for K3-surfaces, by Gross and Wilson. 
The limiting picture suggests that if we are only 
interested in topological or Lagrangian torus fibra- 
tions then we might hope for codimension-two 
discriminant loci, and such fibrations might make 
sense well away from limit points. Gross and Ruan
carry this out in examples such as the quintic and its
mirror, and make sense of dualizing the fibration by
dualizing monodromy around the discriminant locus
and specifying a canonical compactification over the
discriminant locus. This gives the correct topology for
toric varieties and their mirrors, and flips the Hodge
numbers [1], for instance. Approaching the LCLP in
a different way (in the example of eqn [3] this
corresponds to altering the rational numbers $k_i$) can
give a different graph and different fibration on X;
the dual fibration can then be a topologically
different manifold, giving a different birational
model of the mirror $\check{X}$.

We focus only on Lagrangian fibrations, as they 
are better behaved and understood. We can expect 
them to be $C^\infty$ fibrations with codimension-two
discriminant loci, for instance. Below we see how 
to put a complex structure on the smooth part 
of the fibration, but extending this over the 
compactification is much harder and will involve 
"instanton corrections" coming from holomorphic 
disks. Fukaya (2005) has beautiful conjectures about 
this that will explain a great deal more of mirror 
symmetry, but they will not be discussed here. 


Lagrangian Torus Fibrations 


If $(X^{2n}, \omega) \xrightarrow{\pi} B^n$ is a smooth Lagrangian fibration
with compact fibers, then the fibration is naturally
an affine bundle of torus groups (i.e., a bundle of
groups once we pick a Lagrangian 0-section, an
identity in each fiber), and the base B inherits a
natural integral affine structure: it looks like a
vector space V with an integral structure
$V = \Lambda \otimes_{\mathbb{Z}} \mathbb{R}$ up to translation by elements of V. This is the
classical theory of action-angle variables. $T_b^*B$ acts
on the fiber $X_b = \pi^{-1}(b)$: by pullback and contraction
with the symplectic form, $\sigma \in T_b^*B$ gives a
vector field tangent to $X_b$, and the time-one flow
along this vector field gives the action. By compactness and
smoothness of $X_b$ the kernel is a full-rank lattice
$\Lambda_b \subset T_b^*B$, giving the isomorphism


$$X_b \cong T_b^*B/\Lambda_b$$


We define the integral affine structure on B by
specifying the integral affine functions f (up to
translation) to be those whose time-one flow along
df is the identity (i.e., on the universal cover the
time-one flow is to a section of the bundle of lattices $\Lambda$).

The situation that concerns us is where B is a
3-manifold $\bar{B}$ (usually $S^3$) minus a graph; then the
monodromy around the graph preserves the integral
affine structure:


$$\pi_1(B) \to \mathbb{R}^3 \rtimes GL(3, \mathbb{Z})$$   [6]


A great deal of mirror symmetry can be seen from
just this knowledge of the smooth locus of the
fibration; in particular, Gross (1998) has shown
how mild assumptions about the compactification
(with singular fibers over $\bar{B} \setminus B$) are enough to
determine much of the topology of X. The dual
fibration $\check{\pi}$ should have the monodromy dual to [6],
and he shows how this implies the switching of the
Hodge numbers [1] by the Leray spectral sequence;
the rough idea being the obvious isomorphism


$$R^i\pi_*\mathbb{R} \cong \Lambda^i TB \cong \Lambda^{3-i} T^*B \cong R^{3-i}\check{\pi}_*\mathbb{R}$$


induced by a trivialization of $\Lambda^3 TB$. That is, morally
speaking, the flipping of Betti numbers arises by
representing cycles by those with linear intersection
with the fibers, and replacing this linear space by its
annihilator in the dual torus. This also agrees with
the equivalence taking Lagrangians to coherent
sheaves described in the next section.

The dual fibration $\check{\pi}$ has a natural complex
structure; here the affine structure is essential, as in
general a tangent bundle TB only has a natural
almost complex structure along its 0-section. Since,
up to translation, locally $B \cong V$ is a vector space,
$TB \cong V \times V \cong V \otimes_{\mathbb{R}} \mathbb{C}$ has a natural complex
structure which descends to


$$\check{\pi}\colon \check{X} = TB/\Lambda^* \to B$$   [7]


Gross suggests that the B-field on X should lie in the
piece


$$H^1(R^1\pi_*\mathbb{R}/\mathbb{Z}) = H^1(TB/\Lambda^*)$$


of the Leray spectral sequence converging to
$H^2(X, \mathbb{R}/\mathbb{Z})$. That is, it is represented by a Čech
cocycle e on overlaps of an open cover of B with
values in the dual bundle of groups $TB/\Lambda^*$. Using
this to twist [7] and re-glue it via transition
functions translated by e, we get a new complex
manifold (e is locally constant, so translation by e is
holomorphic) which we consider as mirror to X
with complexified form $B + i\omega$. In this way, Gross
manages to match up complexified symplectic
deformations of X with complex structures on $\check{X}$.


The 2-Torus 


Mirror symmetry is nontrivial even for the simplest
Calabi-Yau, the 2-torus. We can write this as an
SYZ fibration $T^2 \xrightarrow{\pi} B = S^1$, and write B as $\mathbb{R}/a\mathbb{Z}$
with its standard integral affine structure induced by
$\mathbb{Z} \subset \mathbb{R}$. This trivializes $T^*B \cong B \times \mathbb{R}$ and the lattice $\Lambda$
in it as $B \times \mathbb{Z} \subset B \times \mathbb{R}$. So as a symplectic manifold,


$$T^2 = \frac{[0,a] \times [0,1]}{(0,p) \sim (a,p),\ (q,0) \sim (q,1)}$$   [8]


with symplectic coordinates (q, p) in which the
symplectic form is $\omega = dp \wedge dq$ (so $\int_{T^2} \omega = a$). Again,
the B-field, $b \in H^1(R^1\pi_*\mathbb{R}/\mathbb{Z}) = H^2(T^2, \mathbb{R}/\mathbb{Z})$, is in
$H^1$ of the locally constant sections of the dual
fibration.

In our trivialization $B \cong \mathbb{R}/a\mathbb{Z}$, $\Lambda^* \subset TB$ is also
standard: $B \times \mathbb{Z} \subset B \times \mathbb{R}$, so the mirror has the
same description as in [8] in which the complex
structure is standard: $J\partial_p = \partial_q$. That is, p + iq gives a
local holomorphic coordinate.

For nonzero B-field $b \neq 0$, twisting the dual
fibration by b gives


$$\check{T}^2 = \frac{[0,a] \times [0,1]}{(0,p) \sim (a, b+p),\ (q,0) \sim (q,1)}$$   [9]


again with holomorphic structure given by p + iq and
SYZ fibration $\check{\pi}$ being projection onto q. So, as a
complex manifold the mirror is $\mathbb{C}$ divided by the lattice


$$\Lambda = \langle 1,\ b + ia \rangle$$


Changing b to b + 1 does not alter this lattice,
so the construction is well defined for
$b \in \mathbb{R}/\mathbb{Z} = H^1(R^1\pi_*\mathbb{R}/\mathbb{Z})$, and we have the standard description
of an elliptic curve via its period point $\tau = b + ia$ in
the upper half plane (as a > 0). Mirror symmetry
has indeed swapped the complexified symplectic
parameter $b + ia = \int_{T^2}(B + i\omega)$ for the complex
structure modulus $\tau = b + ia$. $SL(2, \mathbb{Z})$ acts on both
sides (in the standard way on $\tau$, and as
symplectomorphisms modulo those isotopic to the identity on
the A-side) permuting the choices of SYZ fibration.
We note that in this case the fibrations are special
Lagrangian in the flat metric, with no singular
fibers.
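
The claim that shifting the B-field by 1 leaves the mirror lattice
unchanged can be checked directly. The following is a minimal
sketch, not part of the original text: complex numbers are stored
as pairs of exact fractions, and the helper name `in_lattice` and
the sample values of a and b are illustrative assumptions.

```python
from fractions import Fraction


def in_lattice(z, b, a):
    """Return True if z = (x, y), read as x + i*y, lies in Z*1 + Z*(b + i*a)."""
    x, y = z
    n = y / a          # coefficient of the generator b + i*a
    m = x - n * b      # coefficient of the generator 1
    return n.denominator == 1 and m.denominator == 1


# Hypothetical sample parameters: a > 0 is the area, b the B-field.
a, b = Fraction(3, 2), Fraction(2, 7)

old_generators = [(Fraction(1), Fraction(0)), (b, a)]
new_generators = [(Fraction(1), Fraction(0)), (b + 1, a)]

# Two lattices agree if each contains the generators of the other.
same_lattice = (
    all(in_lattice(g, b + 1, a) for g in old_generators)
    and all(in_lattice(g, b, a) for g in new_generators)
)
```

Here `same_lattice` compares the lattice for b with the lattice for
b + 1; a shift by a non-integer such as 1/2 produces a generator
that `in_lattice` rejects, so only the class of b in R/Z matters.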

Polishchuk and Zaslow have worked out in detail
how Kontsevich's conjecture works in this case.
The general picture for any torus fibration is an
extension of the fiberwise duality that led to SYZ.
Namely, Lagrangian multisections L of the
fibration, of degree r over the base, give r points
on each fiber, and so r flat U(1) connections on the
dual fiber. The resulting $U(1)^r$ connections can be
glued together and twisted by the flat connection on
L, to give a rank-r vector bundle with connection on
the mirror. Arinkin and Polishchuk show that
in general the Lagrangian condition implies the
integrability condition $F^{0,2} = 0$ of the resulting
connection, giving a holomorphic structure on the
bundle. Leung-Yau-Zaslow show that the special
Lagrangian condition gives a perturbation of the
HYM equations on the connection. Branching of
sections has been dealt with by Fukaya, and requires
instanton corrections from holomorphic disks.
Other Lagrangians with linear intersection with the
fibers can be dealt with similarly. $T^2$ is simpler
because all Lagrangians with vanishing Maslov class
can be isotoped into straight lines (i.e., sLags in the
flat metric) with no branching. The upshot is that
the slope of the sLag over the base corresponds to
the slope $(\int_{T^2} c_1/\mathrm{rank}) \in [-\infty, \infty]$ of the mirror
sheaf.


The Large Complex Structure Limit 


The LKLP for $T^2$ is clearly $\lim a \to \infty$. On the
mirror then, the LCLP is at $\tau = b + ia \to b + i\infty$,
the nodal torus compactifying the moduli of elliptic
curves. Metrically, however, in the (Ricci-) flat 
metric, things look different; if we rescale to have 
fixed diameter, the torus collapses to the base of its 
SYZ fibration, and all of its fibers contract. This is 
an important general feature of the difference 
between complex and metric descriptions of 
LCLPs; see the description of the quintic in the 
next section. 

We note that, as in the compactifications
discussed in an earlier section, the monodromy
around this LCLP is given by rotating the B-field:
$b \mapsto b + 1$. This gives back the same elliptic curve,
but after a monodromy diffeomorphism T, which,
from [9], is seen to be


$$T\colon q \mapsto q,\quad p \mapsto p + q/a$$


On $H_1(T^2) = \mathbb{Z}[\text{fiber}] \oplus \mathbb{Z}[\text{section}]$ this acts as


$$T_* = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$   [10]


This is called a Dehn twist. Picking the 0-section
$O = \{p = 0\}$ in the mirror [9] when b = 0, this is
taken to the section


$$T(O) = \{p = q/a\}$$


and T is in fact the translation by this section T(O)
on $T^2$, using the group structure on the fibers (now
we have chosen a 0-section). Again, Gross (1998)
has shown that this is a general feature of LCLPs.

If we pick a Kähler structure on this family of
complex tori, T turns out to be a symplectomorphism.
Importantly, its mirror is not a holomorphic
automorphism, but an equivalence of the derived
category of coherent sheaves. As above, the section
T(O) corresponds to a slope-one line bundle L
on the mirror, and the monodromy action
corresponds to


$$\otimes L\colon D^b \to D^b$$   [11]


on the derived category. Again, this is a more
general feature of these LCLPs, with L such that
$c_1(L)$ equals the symplectic form which generated
the ray along which the original LKLP was reached.
In general, the SYZ fiber is the invariant cycle under
$T_*$ [10], and, on the mirror, structure sheaves of
points are invariant under $\otimes L$. On the cohomology
of $T^2$, cupping with $\mathrm{ch}(L) = e^{c_1(L)} = 1 + c_1(L)$ has the
same action [10] on $H^{\mathrm{ev}} = \mathbb{Z}\langle c_1(L)\rangle \oplus \mathbb{Z}\langle 1\rangle$.

Notice we have used the choices of fibration and
0-section to produce the equivalence of triangulated
categories and to equate the monodromy actions.
Kontsevich's conjectural equivalence is not canonical,
but is fixed by a choice of fibration and 0-section. In
turn, a fibration should be fixed by a choice of LCLP
or LKLP from the resulting collapse (in the Ricci-flat
metric) onto a half-dimensional $S^n$ base. The choice of
0-section is then rather arbitrary (as monodromy
about the LCLP changes it) but determines the
equivalence of categories. Different choices of section
give different equivalences, differing, for instance, by
the monodromy transformation $\otimes L$ [11].

Another point of view is that a Lagrangian
fibration and 0-section determine a group structure
on the fibers and so on the Fukaya category
(translating Lagrangian multisections by multiplication
on each fiber). This corresponds to a choice of
tensor product on the derived category of the
mirror; the identity for this product is then the
structure sheaf $\mathcal{O}_{\check{X}}$ mirror to the 0-section, and an
ample line bundle is given by the action of the
monodromy transformation $L = T(\mathcal{O}_{\check{X}})$; T then
acts as $\otimes T(\mathcal{O}_{\check{X}})$ [11]. Since $\check{X}$ is determined by the
graded ring


$$\bigoplus_{j \gg 0} H^0(L^j) = \bigoplus_{j \gg 0} \mathrm{Hom}^*(\mathcal{O}_{\check{X}}, T^j(\mathcal{O}_{\check{X}}))$$


one might also try to construct $\check{X}$ purely from the
0-section O and LCLP monodromy on X, as


$$\check{X} = \mathrm{Proj} \bigoplus_{j \gg 0} HF^*(O, T^j(O))$$


A problem is to show that $\bigoplus_{j \gg 0} HF^*(O, T^j(O))$ is
finitely generated; a related problem is to show that, for
j > 0, the above Floer homologies vanish except
for $* = 0$.

We now turn to the quintic 3-folds, where we will 
see how to identify the (homology classes of the) 
0-section and fiber in general using Hodge theory.


The Quintic 3-Fold 


The simplest Calabi-Yau 3-fold is given by the zeros
Q of a homogeneous quintic polynomial on $\mathbb{P}^4$, that
is, an anticanonical divisor of $\mathbb{P}^4$. By adjunction, this
has trivial canonical bundle, and so is Calabi-Yau.
By the Lefschetz hyperplane theorem, it has $h^{1,1} = 1$,
so computing its Euler number to be e = -200, we
find that $h^{2,1} = 101$ gives its number of complex
deformations. Alternatively, this can be seen by
showing that all such deformations are themselves
quintics, then dividing the 126-dimensional space of
quintic polynomials by the 25-dimensional
$GL(5, \mathbb{C})$. Thus, its mirror has one complex structure
deformation and 101 Kähler classes.
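
The two counts above can be reproduced with a few lines of
arithmetic. This is a sketch under stated assumptions rather than
anything taken from the original text: it uses the standard
adjunction formula $c(TQ) = (1+H)^5/(1+5H)$ for a quintic in
$\mathbb{P}^4$ and the standard relation $e = 2(h^{1,1} - h^{2,1})$
for a Calabi-Yau 3-fold; the helper `mul_trunc` is an illustrative
name.

```python
from math import comb


def mul_trunc(p, q, deg=3):
    """Multiply polynomials in H (coefficient lists), dropping terms above H^deg."""
    out = [0] * (deg + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= deg:
                out[i + j] += pi * qj
    return out


# Total Chern class c(TQ) = (1 + H)^5 / (1 + 5H), truncated at H^3.
numerator = [comb(5, k) for k in range(4)]   # 1 + 5H + 10H^2 + 10H^3
inverse = [(-5) ** k for k in range(4)]      # 1 - 5H + 25H^2 - 125H^3
chern = mul_trunc(numerator, inverse)        # [c_0, c_1, c_2, c_3]

degree = 5                                   # integral of H^3 over the quintic
euler = chern[3] * degree                    # Euler number = integral of c_3

h11 = 1
h21 = h11 - euler // 2                       # from e = 2 * (h11 - h21)

# Second count: quintic polynomials in 5 variables modulo GL(5, C).
moduli_count = comb(9, 4) - 5 * 5
```

The vanishing of `chern[1]` reflects the trivial canonical bundle,
and both routes give the same number of complex deformations.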

Greene and Plesser prescribed the following
mirror. Take the special one-dimensional family of
Fermat quintics


$$Q_\lambda = \Big\{ \sum_{i=0}^4 x_i^5 - \lambda \prod_{i=0}^4 x_i = 0 \Big\} \subset \mathbb{P}^4$$   [12]


with the action of
$\{(\alpha_0, \dots, \alpha_4) \in (\mathbb{Z}/5)^5 : \prod_i \alpha_i = 1\} \cong (\mathbb{Z}/5)^4$
given by rescaling the $x_i$ by
fifth roots of unity. Dividing by the diagonal $\mathbb{Z}/5$
projective stabilizer, we get a free $(\mathbb{Z}/5)^3$ action; the
mirror of the quintic is any crepant (K = 0)
resolution of the quotient:


$$\check{Q}_\lambda \to \frac{Q_\lambda}{(\mathbb{Z}/5)^3}$$


Different resolutions give different Kähler
cones whose union is the moveable cone; its
complexification is locally isomorphic to the complex
structure moduli space of Q. $h^{1,1}(\check{Q}_\lambda) = 101$ for any
crepant resolution, and $h^{2,1}(\check{Q}_\lambda) = 1$ corresponds
locally to the one complex structure deformation
$\lambda$ [12]. In fact, for $\alpha^5 = 1$, multiplying $x_0$ by $\alpha$ shows
that $Q_\lambda \cong Q_{\alpha\lambda}$ and $\lambda^5$ parametrizes the complex
structure moduli.

The LCLP is at $\lambda = \infty$, that is, it is the quotient of
the union of hyperplanes


$$Q_\infty = \Big\{ \prod_{i=0}^4 x_i = 0 \Big\} = \{x_0 = 0\} \cup \dots \cup \{x_4 = 0\}$$   [13]


This is a union of toric varieties, each with a $T^3$ action
inherited from the toric $T^4$ action on $\mathbb{P}^4$. Much more
generally, Batyrev’s construction considers the 
anticanonical divisors (and even more generally, 
complete intersections) in toric varieties fibered over 
the boundary of the moment polytope, and takes as 
mirror the anticanonical divisor of the toric variety 
associated to the dual polytope. However, most of the 
geometry is visible in this quintic example. 

Equation [13] is the analog of the nodal torus of
the last section, and we emphasize again that
metrically it looks nothing like this; the Ricci-flat
metric collapses the $T^3$ toric fibers to the base $S^3$ (with
a singular metric). General LCLPs look rather similar,
with such “as bad as possible” normal crossing
singularities. Smoothing a local model (in $x_0 = 1$)
$\prod_{i=1}^4 x_i = 0$, we can see the tori in $\{\prod_{i=1}^4 x_i = \epsilon\}$:


$$T^3 = \Big\{ |x_1| = \delta_1,\ |x_2| = \delta_2,\ |x_3| = \delta_3,\ x_4 = \frac{\epsilon}{x_1 x_2 x_3} \Big\}$$   [14]


These are even Lagrangian in the standard symplectic
form on the local model, and fiber the smoothing
over the base $\{(\delta_1, \delta_2, \delta_3)\}$. It turns out that,
metrically, these tori (which vanish into the normal
crossings singularity at the LCLP) actually form a
large part of the smooth Calabi-Yau. This
sheds light on the apparent paradox between the SYZ
conjecture and the Batyrev construction, that is, why
a vertex of the original moment polytope
(corresponding to the deepest type of singularity
$(0,0,0,0) \in \{\prod_{i=1}^4 x_i = 0\}$) can be replaced by the
dual three-dimensional face in the dual polytope.
This was first suggested by Leung and Vafa.

Gross and Siebert (2003) exploit this to extend SYZ 
and Batyrev's construction to nontoric LCLP Calabi- 
Yau manifolds; it is only the local toric nature of the 
normal crossing singularities of the LCLP that they 
use. It seems possible that their construction will give 
the mirrors of all Calabi-Yau manifolds with LCLPs. 
Much of mirror symmetry should soon be reduced to 
graphs (the discriminant locus of a Lagrangian torus 
fibration) in spheres, and further graphs over which D- 
branes (such as holomorphic curves) fiber, as in recent 
conjectures of Kontsevich and Soibelman and Fukaya 
(2005). It may soon be possible to write down a 
triangulated category in terms of such data. The full 
geometric story (involving Joyce's description of sLag 
fibrations, for instance) is still some way off, however; 
we cannot even write down an explicit Ricci-flat 
metric on a compact Calabi-Yau. 


Monodromy around the LCLP 


As well as the SYZ torus fiber [14] we can also see a
Lagrangian 0-section on the quintic and its mirror as a
component of the real locus of [12] for $\lambda > 5$.
Remarkably, like the torus [14], this cycle was already 
described and used by Candelas et al. (1991), long 
before the relevance of torus fibrations was suspected. 

Gross and Ruan have been able to describe the 
quintic and its mirror (at least topologically or 
symplectically) very explicitly as a simple torus 
fibration over this $S^3$ with a natural integral affine
structure and codimension-two graph discriminant 
locus (see, e.g., Gross et al. (2003)). 

Under monodromy about $\lambda = \infty$, the 0-section is
moved to another section T(O), and T is given by
translation by T(O) using the group structure on the
fibers. This is the analog of the Dehn twist [10], and
one can choose a basis of $H_3(Q)$ (with first element
the invariant cycle, the $T^3$-fiber, second element
a cycle fibered over a curve in $S^3$, third fibered over
a surface, and last the 0-section itself) such that


$$T_* = \begin{pmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 1 \end{pmatrix}$$   [15]


Like the Dehn twist [10], it turns out that $T_*$ is
maximally unipotent; that is, we have in n dimensions,


$$(T_* - 1)^{n+1} = 0 \quad \text{but} \quad (T_* - 1)^n \neq 0$$


Again, this is a general feature of LCLPs as formulated
by Morrison (1993) as part of the definition.
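
Maximal unipotency, meaning $(T_* - 1)^{n+1} = 0$ but
$(T_* - 1)^n \neq 0$ in complex dimension n, is easy to test on
explicit matrices. The sketch below is illustrative only: the
4 x 4 matrix `quintic_like` fills the unspecified entries of [15]
with a hypothetical choice (ones on the superdiagonal, zeros
elsewhere), and the function names are assumptions, not notation
from the text.

```python
def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(m):
    return all(x == 0 for row in m for x in row)


def nilpotency_index(t):
    """Smallest k with (t - 1)^k = 0, or None if none is found for k <= len(t)."""
    n = len(t)
    u = [[t[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    power = u
    for k in range(1, n + 1):
        if is_zero(power):
            return k
        power = matmul(power, u)
    return None


# The Dehn twist [10] on H_1 of the 2-torus (complex dimension n = 1).
dehn_twist = [[1, 1], [0, 1]]

# Hypothetical instance of [15] for a 3-fold (n = 3).
quintic_like = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]

# A unipotent matrix that is NOT maximally unipotent, for contrast.
not_maximal = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

A monodromy matrix is maximally unipotent in this sense exactly
when `nilpotency_index` returns n + 1, which for the middle
cohomology bases above is the size of the matrix.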

This should be compared with the Lefschetz
operator $L = \cup\,\omega$ on the cohomology of the mirror,
which also satisfies $L^n \neq 0$, $L^{n+1} = 0$ (or, more
relevantly, exp(L), which satisfies $(e^L - 1)^n \neq 0$,
$(e^L - 1)^{n+1} = 0$). Their similarity was noticed by the
Griffiths school working on VHS in the late 1960s!
Now we know that for Calabi-Yau manifolds at an
LCLP dual to an LKLP along a ray $\omega = c_1(L)$ on the
mirror, they should be considered mirror operators
(up to some factors of the Todd class of the
underlying Calabi-Yau, to do with the relationship
between the Chern character $e^\omega$ of the line bundle L
(see [11]) and the Riemann-Roch formula).

Both, by linear algebra of the nilpotent operator
$N = \log T_* = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k}(T_* - 1)^k$, induce a natural
filtration $W_\bullet\colon 0 \le W_0 \le \dots \le W_{2n} = H$ on the
cohomology on which they operate (which is $H = H^n$ for
$N = \log T_*$ and $H = H^{\mathrm{ev}}$ for $N = L = \cup\,\omega$):


$$0 \le \mathrm{im}(N^n) \le \mathrm{im}(N^{n-1}) \cap \ker(N) \le \cdots \le \ker(N^{n-1}) + \mathrm{im}(N) \le \ker(N^n) \le H$$

For a discussion of the construction of this monodromy
weight filtration, the reader is referred to the
further reading section. It plays a key role in studying
degenerations of varieties and Hodge structures, in this
case as we approach the LCLP. It is a beautiful result of
Gross that this filtration coincides with the Leray
filtration on $H^n$ induced by the fibration. That is,
under Poincaré duality, the weight filtration on cycles
is by the minimal dimension (over all homologous
cycles) of the image in the base over which the cycle is
fibered. So, the first graded piece is spanned by the
invariant cycle, the $T^3$ fiber, supported over a point,
and the last by the 0-section; cf. [15]. (Similarly on the
mirror, the filtration for the Lefschetz operator $\cup\, e^\omega$
has first piece spanned by the cohomology class of a
point, which is invariant under the monodromy action
$\otimes L$ of [11], etc.)

Letting $\gamma_0$ be the class of a fiber and $\gamma_1$ span
$W_2/W_0$ (which is one-dimensional) over the integers,
then $T_*\gamma_1 = \gamma_1 + \gamma_0$. It follows that


$$q = \exp\left( 2\pi i\, \frac{\int_{\gamma_1}\Omega}{\int_{\gamma_0}\Omega} \right)$$


is invariant under monodromy. This is the
higher-dimensional analog of the coordinate $\exp(2\pi i\tau)$ on
the moduli space of elliptic curves, where $\tau$ is the
period point. It is this coordinate q that is mirror to
the coordinate


$$\exp\left( 2\pi i \int_{\text{line}} (B + i\omega) \right)$$


on the Kähler moduli space on the mirror quintic,
which allows one to compute the correspondence
between VHS and Gromov-Witten invariants
mentioned in the introduction.
More generally, following Morrison (1993), one
can make a rigorous definition of an LCLP using
features noted above extended to the case of $h^{2,1} > 0$
(see, e.g., Cox and Katz (1999)). Roughly, the
upshot is that the complex structure moduli space
$\mathcal{M}$ (of dimension $s = h^{2,1}(X)$) should
be compactified with s divisors $(D_i)_{i=1}^s$ (parametrizing
singular varieties) forming a normal crossings
divisor meeting at the LCLP, with monodromies $T_i$
about them. There should be a unique (up to
multiples) integral cycle $\gamma_0$ (our torus fiber) invariant
under all $T_i$, and cycles $(\gamma_i)_{i=1}^s$ such that


$$\tau_i = \frac{\int_{\gamma_i}\Omega}{\int_{\gamma_0}\Omega}$$


is logarithmic at $D_i$; that is, $\tau_i \sim \frac{1}{2\pi i}\log(z_i)$,
where $z_i$ is a local parameter for $D_i = \{z_i = 0\}$.

So, $z_i = \exp(2\pi i\tau_i)$ form local coordinates for
moduli space, mirror to the polydisk coordinates [2]
on KS. The direction of approach to the LKLP in that 
section corresponds to the holomorphic curve z;' — ze 
[3] we take through the LCLP (2; — 0 Vi), and the 
monodromy *` N;T; varies accordingly, but the 
corresponding weight filtration W, remains constant 
if k; 4 OVi, by a theorem of Cattani and Kaplan. 

Morrison then requires that the $(\gamma_i)_{i=1}^s$ should
form an integral basis for $W_2 = W_3$ (with $\gamma_0$ a basis
of $W_0 = W_1$). Finally, part definition and part
conjecture, we should be able to make a choice
such that they satisfy the condition
$\log T_i(\gamma_j) = \delta_{ij}\gamma_0$.


Of course, as has been emphasized, Morrison's 
definition of an LCLP is really where the mathematics 
and geometry of mirror symmetry begin, and should 
have been the starting point of this article. But that 
would have required appreciable knowledge of 
abstract VHS that are best understood, in this context, 
through the new geometry of Lagrangian torus 
fibrations that mirror symmetry has inspired. 


See also: AdS/CFT Correspondence; Calibrated 
Geometry and Special Lagrangian Submanifolds; 
Derived Categories; Fourier-Mukai Transform in String 
Theory; Geometric Analysis and General Relativity; 
Geometric Flows and the Penrose Inequality; Geometric 
Measure Theory; Geometric Phases; Number Theory in 
Physics; Riemann Surfaces; Several Complex Variables: 
Compact Manifolds; Topological Gravity, Two- 
Dimensional; Topological Sigma Models; WDVV 
Equations and Frobenius Manifolds. 
The concept of a moduli space has been used by
mathematicians for nearly 150 years, although it was
not until the 1960s that Mumford (1965) gave precise
definitions of moduli spaces and methods for
constructing them. The use of the word “moduli” in this
context goes back to Riemann in a paper of 1857, in
which he observed that an isomorphism class of
compact Riemann surfaces of genus g “hängt
von 3g − 3 stetig veränderlichen Grössen ab, welche
die Moduln dieser Klasse genannt werden sollen”
(“depends on 3g − 3 continuously varying quantities,
which shall be called the moduli of this class”).
The idea of moduli as parameters in some sense 
measuring or describing the variation of geometric 
objects has been of fundamental importance in 
geometry ever since. 

Moduli spaces arise naturally in classification 
problems in geometry, particularly in algebraic 
geometry (Mumford 1965, Newstead 1978, Popp 
1977, Seshadri 1975, Sundararaman 1980, Viehweg
1995). Algebraic geometry is, roughly speaking, the 
study of solutions of systems of polynomial equa- 
tions in many variables; the solutions to such a 
system form an algebraic variety. A simple example 
of an algebraic variety is a hypersurface, consisting 
of the solutions to a single polynomial equation in 
some number of variables. We can try to classify 
hypersurfaces by their degree and their dimension; 
these are “discrete invariants” for the classification 
problem, but of course they do not determine 
hypersurfaces completely, even if we regard two 
hypersurfaces as equivalent when one is obtained 
from the other after making a change of coordinates. 
It is typical of classification problems in algebraic 
geometry (and other areas of geometry) that there 
are not enough discrete invariants to classify objects 
sufficiently finely, and this is where the concept of a 
moduli space arises. 

In complex algebraic geometry, discrete invariants 
often come from topology. For example, a nonsingular
complex curve (i.e., a complex algebraic
variety which is a connected complex manifold of 
dimension 1, in other words a Riemann surface) 
which is projective (i.e., points have been added at 
infinity to make it compact) is topologically just a 
sphere with a number of handles attached to it; the 
number of handles is called the genus of the curve
and is a discrete invariant. Nonsingular complex 
projective curves (or equivalently compact Riemann 
surfaces) are not classified completely by their genus 
g; they are determined by g when regarded simply as 
topological surfaces, but the genus does not deter- 
mine their complex structure when g > 0. 

A classification problem such as this one (the 
classification of nonsingular complex projective 
curves up to isomorphism, or, equivalently, compact 
Riemann surfaces up to biholomorphism), can be 
resolved into two basic steps. 


Step 1 is to find as many discrete invariants as possible 
(in the case of nonsingular complex projective 
curves the only discrete invariant is the genus). 

Step 2 is to fix the values of all the discrete invariants 
and try to construct a “moduli space”; that is, a
complex manifold (or an algebraic variety) whose 
points correspond in a natural way to the 
equivalence classes of the objects to be classified. 


What is meant by “natural” here can be made
precise (as we shall see shortly) given suitable notions 
of families of objects parametrized by base spaces and 
of equivalence of families. A “fine moduli space” is 
then a base space for a universal family of the objects 
to be classified (any family is equivalent to the 
pullback of the universal family along a unique map 
into the moduli space). If no universal family exists 
there may still be a “coarse moduli space” satisfying 
slightly weaker conditions, which are nonetheless 
strong enough to ensure that if a moduli space exists it 
will be unique up to canonical isomorphism. 

It is often the case that not even a coarse moduli 
space will exist. Typically, particularly “bad” objects 
must be left out of the classification in order for a 
moduli space to exist. For example, a coarse moduli 
space of nonsingular complex projective curves exists 
(although to have a fine moduli space we must give the 
curves some extra structure, such as a level structure), 
but if we want to include singular curves (which is 
often important so that we can understand how 
nonsingular curves can degenerate to singular ones) 
we must leave out the so-called “unstable curves” to 
get a moduli space. However all nonsingular curves 
are stable, so the moduli space of stable curves of genus 
g is then a compactification of the moduli space of 
nonsingular projective curves of genus g. 

Moduli spaces are often constructed and studied as
orbit spaces for group actions (using Mumford’s
geometric invariant theory or more recently ideas due
to Kollár (1997) and Keel and Mori (1997)); geometric
invariant theoretic quotients can also often be described
naturally as symplectic reductions, and it is in this guise
that many moduli spaces in physics appear. Another
technique involves period maps, Torelli theorems and
variations of Hodge structures, initiated by Griffiths
(1984) and others. In the special case of moduli spaces
of compact Riemann surfaces, Teichmüller theory can
also be used (see, e.g., Lehto (1987)).


Remark 1 Recall that a compact Riemann surface 
(i.e., a compact complex manifold of complex dimen- 
sion 1) can be thought of as a nonsingular complex 
projective curve, in the sense that every compact 
Riemann surface can be embedded in some 
complex projective space 


$$\mathbb{P}_n = (\mathbb{C}^{n+1} - \{0\})/(\text{multiplication by nonzero complex scalars})$$


as the solution space of a set of homogeneous 
polynomial equations. Moreover, two nonsingular 
complex projective curves are biholomorphic if and 
only if they are algebraically isomorphic. So, there is 
a natural identification between the moduli space of 
compact Riemann surfaces of genus g up to 
biholomorphism and the moduli space of nonsingu- 
lar complex projective curves up to isomorphism. 


There are other situations where an “algebraic” 
moduli space can be naturally identified with the 
corresponding “complex analytic” moduli space, but
this is not always the case. For example, if we 
consider K3 surfaces (compact complex manifolds 
of complex dimension 2 with first Betti number and 
first Chern class both zero), we find that the moduli 
space of all K3 surfaces has complex dimension 20, 
whereas the moduli spaces of algebraic K3 surfaces 
(which have one more discrete invariant, the degree, 
to be fixed) are 19-dimensional. 

This problem of algebraic moduli spaces versus
nonalgebraic ones is one reason why the question of
classifying n-folds (i.e., compact complex manifolds
or, in the algebraic category, nonsingular projective
varieties, of dimension n) becomes much harder
when n > 1 than in the case n = 1 (which is the case of
compact Riemann surfaces or nonsingular projective
curves). Another difficulty is that families of n-folds
can be “blown up” along families of subvarieties to
produce ever more complicated families.


Remark 2 Recall that we blow up a complex
manifold X along a closed complex submanifold Y
by removing the submanifold Y from X and glueing
in the projective normal bundle of Y in its place. We
get a complex manifold $\tilde{X}$ with a holomorphic
surjection $\pi\colon \tilde{X} \to X$ such that $\pi$ is an isomorphism
over X − Y and if $y \in Y$ then $\pi^{-1}(y)$ is the complex
projective space associated to the normal space
$T_yX/T_yY$ to Y in X at y. If $X = \mathbb{C}^{n+1}$ and Y = {0}
and we identify $\mathbb{P}_n$ with the set of one-dimensional
linear subspaces of $\mathbb{C}^{n+1}$, then


$$\tilde{X} = \{(v, w) \in \mathbb{C}^{n+1} \times \mathbb{P}_n : v \in w\}$$

with $\pi(v, w) = v$.


Again this problem does not arise when n = 1,
because blowing up a 1-fold makes no difference unless
the 1-fold has singularities (in which case blowing up
may help to “resolve” the singularities; for example,
when we blow up the origin {0} in $\mathbb{C}^2$, then the singular
curve C in $\mathbb{C}^2$ defined by $y^2 = x^3 + x^2$ is transformed
into a nonsingular curve $\tilde{C}$ with the origin in C replaced
by two points, corresponding to the two complex
“tangent directions” in C at 0).
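
The nodal cubic example can be made concrete in one coordinate
chart of the blow-up. The sketch below is an illustration under
stated assumptions, not part of the original text: it works in the
chart y = t*x (with t the slope coordinate on the exceptional
line), and the function names `f` and `g` and the sample grid are
hypothetical choices.

```python
from fractions import Fraction


def f(x, y):
    """Nodal cubic: the singular curve C is the zero set of f."""
    return y**2 - x**3 - x**2


def g(x, t):
    """Candidate equation of the blown-up curve in the chart y = t*x."""
    return t**2 - x - 1


# Check f(x, t*x) = x^2 * g(x, t) on a grid of exact rational points.
# A 9 x 9 grid is more than enough to pin down polynomials of this degree.
samples = [(Fraction(p, 3), Fraction(q, 5))
           for p in range(-4, 5) for q in range(-4, 5)]
factorizes = all(f(x, t * x) == x**2 * g(x, t) for x, t in samples)

# Points of the new curve lying over the origin, i.e., on x = 0.
exceptional_points = [t for t in range(-3, 4) if g(0, t) == 0]
```

The factor $x^2$ is the exceptional line counted twice, and the
remaining curve g = 0 is nonsingular because its partial derivative
in x is the constant -1. It meets x = 0 at the two slopes t = ±1,
which are the two tangent directions y = ±x of C at the origin.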

Thus, the classification of n-folds when n > 1
requires a preliminary step before there is any hope 
of carrying out the two steps described above. 


Step 0 (the “minimal model programme” of Mori 
(1987) and others): Instead of all the objects to be 
classified, consider only specially “good” objects,
such that every object is obtained from one of these 
specially good objects by a sequence of blow-ups 
(or similar carefully prescribed operations). 


How to carry out Mori's minimal model program 
is well understood for algebraic surfaces and 3-folds, 
but in higher dimensions is incomplete as yet (Kollár 
and Mori 1998). We shall ignore both step 0 and 
step 1 from now on, and concentrate on step 2, the 
construction of moduli spaces. 


Ingredients of a Moduli Problem 


Formally, before posing a moduli problem, we need
to fix the category in which we are working; that is, 
we need to specify what we mean by “space” and 
“map” in the description below. If, for example, we 
are working in complex analytic geometry then we 
might take “space” to mean a complex manifold (or 
more generally we might allow singularities) and 
take “map” to mean a complex analytic map, 
whereas in algebraic geometry “space” might mean 
an algebraic variety, or a scheme, or even a stack, 
with “map” interpreted as a morphism of algebraic
varieties (or schemes, or stacks). 

Once this is fixed, the ingredients of a moduli 
problem are: 


1. a set A of objects to be classified, 

2. an equivalence relation ~ on A, 

3. the concept of a family of objects in A with base
space S (or parametrized by S), and sometimes

4. the concept of equivalence of families. 


These ingredients must satisfy: 


1. a family parametrized by a single point {p} is just
an object in A (and equivalence of objects is
equivalence of families over {p}) and

2. given a family X parametrized by a space S and a
map $\phi : S' \to S$, there is a family $\phi^*X$ parametrized
by $S'$ (the “pullback of X along $\phi$”), with
pullback being functorial and preserving
equivalence.


In particular, for any family X parametrized by S 
and any $s \in S$, there is an object $X_s$ given by pulling
back X along the inclusion of {s} in S. We think of
$X_s$ as the object in the family X whose parameter is
the point s in the base space S. 


Example 1 A family of compact Riemann surfaces 
parametrized by a complex manifold S is a surjective 
holomorphic map 


$$\pi : T \to S$$


from a complex manifold T of (complex) dimension
dim(T) = dim(S) + 1 to S, such that $\pi$ is
proper (i.e., the inverse image $\pi^{-1}(C)$ of any
compact subset C of S under $\pi$ is compact) and
has maximal rank (i.e., its derivative is everywhere
surjective). Then $\pi^{-1}(s)$ is a compact Riemann
surface for each $s \in S$, and is the object in the
family with parameter s.

The family defined by $\pi$ is an algebraic family if
$\pi$ is a morphism of nonsingular complex projective
varieties.


Example 2 A family of nonsingular complex 
projective varieties parametrized by a nonsingular 
complex variety S is a proper surjective morphism


$$\pi : T \to S$$


with T nonsingular and $\pi$ having maximal rank. We
can also allow T and S to be singular, but then we
require an extra technical condition (that $\pi$ must be
flat with reduced fibers).


In the above example, equivalence of families
$\pi_1 : T_1 \to S_1$ and $\pi_2 : T_2 \to S_2$ is given by
isomorphisms $f : T_1 \to T_2$ and $g : S_1 \to S_2$ such that
$g \circ \pi_1 = \pi_2 \circ f$. Equivalence of families in the
first example is similar.


Definition 1 A “deformation” of a nonsingular 
projective variety or compact complex manifold M 
is given by a family $\pi : T \to S$ together with an
isomorphism


$$\pi^{-1}(s_0) \cong M$$


for some $s_0 \in S$.
Strictly speaking, the deformation is the germ at
$s_0$ of such a $\pi$; that is, the restriction of $\pi$ over any
open neighborhood of $s_0$ in S determines the same
deformation of M as $\pi$ does.

A study of deformations leads to information 
about the local structure of moduli spaces. Let 
$\pi : X \to S$ be a deformation of a compact complex
manifold $M \cong \pi^{-1}(s_0)$ where $s_0 \in S$. We can cover M
(thought of as a subset of X) with open subsets $W_i$
of X such that there exist isomorphisms


$$h_i : W_i \to U_i \times V_i$$


where $V_i = \pi(W_i)$ is open in S and $U_i = M \cap W_i$ is
open in $M \cong \pi^{-1}(s_0)$ and the projection of $h_i$ onto $V_i$
is just $\pi : W_i \to V_i$. For each $i \neq j$, we then get a
holomorphic vector field $\theta_{ij}$ on $U_i \cap U_j$ by
differentiating $h_i \circ h_j^{-1}$ in the direction of any tangent
vector $v \in T_{s_0}S$. These holomorphic vector fields
define a 1-cocycle in the tangent sheaf $\Theta$ of M. This
gives us the “Kodaira-Spencer map”


$$\rho_\pi : T_{s_0}S \to H^1(M, \Theta)$$


Theorem 1 (Kuranishi). If M is a compact complex
manifold, then it has a deformation $\pi : X \to S$
with $\pi^{-1}(s_0) \cong M$ such that


(i) the Kodaira-Spencer map $\rho_\pi : T_{s_0}S \to H^1(M, \Theta)$
is an isomorphism,

(ii) $\pi$ has the local universal property for deformations
(i.e., any deformation of M is locally the
pullback of $\pi$ along a map f into S),

(iii) if $H^0(M, \Theta) = 0$, then the map f in (ii) is unique,
and

(iv) if $H^2(M, \Theta) = 0$, then S is nonsingular at $s_0$ and
so $\dim S = \dim H^1(M, \Theta)$.


This deformation $\pi$ is called the “Kuranishi
deformation” of M (its germ at $s_0$ is unique up to
isomorphism), and S is called the “Kuranishi space”
of M.


Example 3 A family of holomorphic (or algebraic)
vector bundles over a compact Riemann surface (or
nonsingular complex projective curve) $\Sigma$ is a vector
bundle over $\Sigma \times S$ where S is the base space (see e.g.,
Verdier and Le Potier (1985)). A deformation of a
vector bundle $E_0$ over $\Sigma$ is then given by a vector
bundle E over a product $\Sigma \times S$ together with an
isomorphism


$$E|_{\Sigma \times \{s_0\}} \cong E_0$$


for some $s_0 \in S$ (strictly speaking it is the germ at $s_0$
of such a family of vector bundles).


Fine and Coarse Moduli Spaces


For definiteness, except when it is specified other- 
wise, let us consider moduli problems in algebraic 
geometry with “space” meaning algebraic variety 
(over some fixed field k which is usually C) and 
“map” meaning morphism of algebraic varieties.


Definition 2 A “fine moduli space” for a given
(algebro-geometric) moduli problem is an algebraic
variety M with a family U parametrized by M
having the following (universal) property: for every
family X parametrized by a base space S, there exists
a unique map $\phi : S \to M$ such that


$$X \sim \phi^* U$$


U is then called a “universal family” for the given
moduli problem.


Many moduli problems have no fine moduli 
space, but nonetheless there may be a moduli space 
satisfying slightly weaker conditions, called a coarse 
moduli space. If a fine moduli space does exist, it 
will automatically satisfy the conditions to be a 
coarse moduli space. Both fine and coarse moduli 
spaces, when they exist, are unique up to canonical 
isomorphism. 


Definition 3 A “coarse moduli space” for a given
moduli problem is an algebraic variety M with a
bijection


$$\alpha : A/\!\sim \; \to M$$


(where as before A is the set of objects to be
classified up to the equivalence relation $\sim$) from the
set $A/\!\sim$ of equivalence classes in A to M such that:


(i) For every family X with base space S, the
composition of the given bijection $\alpha : A/\!\sim \; \to M$
with the function


$$\nu_X : S \to A/\!\sim$$


which sends $s \in S$ to the equivalence class $[X_s]$
of the object $X_s$ with parameter s in the family
X, is a morphism.

(ii) When N is any other variety with $\beta : A/\!\sim \; \to N$
such that for each family X parametrized by a
base space S the composition $\beta \circ \nu_X : S \to N$ is a
morphism, then


$$\beta \circ \alpha^{-1} : M \to N$$


is a morphism.


Remark 3 For some moduli problems, a family X
with base space S which is connected and of
dimension strictly greater than zero may exist such
that for some $s_0 \in S$ we have


(i) $X_s \sim X_t$ for all $s, t \in S - \{s_0\}$ and
(ii) $X_s \not\sim X_{s_0}$ for all $s \in S - \{s_0\}$.


This is the “jump phenomenon,” and when it
occurs we cannot construct a moduli space including
the equivalence class of the object $X_{s_0}$. Typically, to
construct a moduli space, some objects (often called
“unstable”) must be left out because of the jump
phenomenon and we only get a moduli space of 
“stable” objects. This happens, for example, in the 
construction of moduli spaces of complex projective 
curves, if we want to include singular curves, or 
moduli spaces of vector bundles. 


Example 4 The Jacobian $J(\Sigma)$ of a compact Riemann
surface $\Sigma$ is a fine moduli space for holomorphic
line bundles (i.e., vector bundles of rank 1)
of fixed degree over $\Sigma$ up to isomorphism. As a
complex manifold


$$J(\Sigma) \cong \mathbb{C}^g/\Lambda$$


where g is the genus of $\Sigma$ and $\Lambda$ is a lattice of maximal
rank in $\mathbb{C}^g$ (in other words $J(\Sigma)$ is a complex torus).
Since $J(\Sigma)$ is also a complex projective variety, it is an
“abelian variety.”

More precisely, $J(\Sigma)$ is the quotient of the
complex vector space $H^0(\Sigma, K_\Sigma)$ of dimension g by
the lattice $H^1(\Sigma, \mathbb{Z}) \cong \mathbb{Z}^{2g}$. Here $K_\Sigma$ is the complex
cotangent bundle of $\Sigma$ and $H^0(\Sigma, K_\Sigma)$ is the space of
its holomorphic sections, that is, the space of
holomorphic differentials on $\Sigma$. If we choose a
basis $\omega_1, \ldots, \omega_g$ of holomorphic differentials and a
standard basis $\gamma_1, \ldots, \gamma_{2g}$ for $H_1(\Sigma, \mathbb{Z})$ such that


$$\gamma_i \cdot \gamma_{i+g} = 1 = -\gamma_{i+g} \cdot \gamma_i$$


when $1 \le i \le g$ and all other intersection pairings
$\gamma_i \cdot \gamma_j$ are zero, then we can associate to $\Sigma$ the $g \times 2g$
“period matrix” $P(\Sigma)$ given by integrating the
holomorphic differentials $\omega_i$ around the 1-cycles $\gamma_j$.
The Jacobian $J(\Sigma)$ can then be identified with the
quotient of $\mathbb{C}^g$ by the lattice spanned by the columns
of this period matrix.

We can in fact always choose the basis $\omega_1, \ldots, \omega_g$
of holomorphic differentials so that the period
matrix $P(\Sigma)$ is of the form


$$(I_g \;\; Z)$$


where $I_g$ is the $g \times g$ identity matrix. This period
matrix is called a “normalized period matrix.” The
Riemann bilinear relations tell us that Z is symmetric
and its imaginary part is positive definite.


Example 5 The moduli space $A_g$ of all abelian
varieties of dimension g was one of the first moduli
spaces to be constructed. We have


$$A_g \cong H_g/Sp(2g; \mathbb{Z})$$


where $H_g$ is Siegel's upper half space, which consists
of the symmetric $g \times g$ complex matrices with
positive-definite imaginary part.
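
The defining conditions of Siegel's upper half space (symmetric, with positive-definite imaginary part) are easy to test numerically. The following is a minimal sketch in plain Python; the function names `is_positive_definite` and `in_siegel_upper_half_space` and the tolerance `tol` are illustrative choices, not part of the text, and positive definiteness is checked by attempting a Cholesky factorization.

```python
def is_positive_definite(A, tol=1e-12):
    """Check a real symmetric matrix (list of lists) by attempting Cholesky."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # a non-positive pivot means not positive definite
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True


def in_siegel_upper_half_space(Z, tol=1e-12):
    """True if Z is a symmetric complex g x g matrix with Im(Z) positive definite."""
    n = len(Z)
    if any(len(row) != n for row in Z):
        return False
    for i in range(n):
        for j in range(n):
            if abs(complex(Z[i][j]) - complex(Z[j][i])) > tol:
                return False          # Z is not symmetric
    imag = [[complex(z).imag for z in row] for row in Z]
    return is_positive_definite(imag, tol)
```

For g = 1 this reduces to the familiar condition that a single complex number lies in the upper half plane.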


Example 6 One way to construct and study the
moduli space $M_g$ of compact Riemann surfaces of
genus g is via the “Torelli map”


$$\tau : M_g \to A_g$$

given by

$$\Sigma \mapsto J(\Sigma)$$


Torelli's theorem tells us that $\tau$ is injective (cf.
Griffiths (1984)). Describing the image of $M_g$ in $A_g$
is known as the Schottky problem.


We can calculate the dimension of the moduli
space $M_g$ using Kuranishi theory as in the previous
section: we get


$$\dim M_g = \dim H^1(\Sigma, \Theta) = 3g - 3$$


for any compact Riemann surface $\Sigma$ of genus $g \ge 2$.
In fact, if M is any compact complex manifold and
there exists a fine moduli space of complex manifolds
diffeomorphic to M, then the moduli space is
locally isomorphic near [M] to the Kuranishi space
near $s_0$. More often, there is only a coarse moduli
space (as in the case of $M_g$), and then the moduli
space is locally isomorphic near [M] to the quotient
of the Kuranishi space by the action of the group of
automorphisms of M.
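
The count 3g - 3 can be recovered from Serre duality and the Riemann-Roch theorem. The following is a sketch of that standard computation for a compact Riemann surface $\Sigma$ of genus $g \ge 2$, writing $\Theta$ for its tangent sheaf and $K_\Sigma$ for its canonical (cotangent) bundle; the argument itself is not spelled out in the text above.

```latex
\begin{align*}
H^1(\Sigma,\Theta) &\cong H^0(\Sigma, K_\Sigma^{\otimes 2})^{*}
  && \text{(Serre duality, since } \Theta \cong K_\Sigma^{-1}\text{)},\\
\deg K_\Sigma^{\otimes 2} &= 2(2g-2) = 4g-4 > 2g-2
  && \text{(because } g \ge 2\text{)},\\
\dim H^0(\Sigma, K_\Sigma^{\otimes 2}) &= (4g-4) - g + 1 = 3g-3
  && \text{(Riemann--Roch, with } H^1(\Sigma, K_\Sigma^{\otimes 2}) = 0\text{)}.
\end{align*}
```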

For the Teichmüller approach to $M_g$ (cf. Lehto
(1987)), we consider the space of all pairs consisting
of a compact Riemann surface $\Sigma$ of genus g and a basis
$\gamma_1, \ldots, \gamma_{2g}$ for $H_1(\Sigma, \mathbb{Z})$ as above such that


$$\gamma_i \cdot \gamma_{i+g} = 1 = -\gamma_{i+g} \cdot \gamma_i$$


if $1 \le i \le g$ and all other intersection pairings $\gamma_i \cdot \gamma_j$
are zero. If $g \ge 2$, this space (called Teichmüller
space) is naturally homeomorphic to an open ball in
$\mathbb{C}^{3g-3}$ (by a theorem of Bers). The mapping class
group $\Gamma_g$ (which consists of the diffeomorphisms of
the surface modulo isotopy) acts discretely on
Teichmüller space, and the quotient can be identified
with the moduli space $M_g$. This gives us a
description of $M_g$ as a complex analytic space, but
not as an algebraic variety.

To construct the moduli space $M_g$ as an algebraic
variety, we can use the fact that every compact
Riemann surface of genus g can be embedded
canonically as a curve of degree 6(g - 1) in a
projective space of dimension 5g - 6. The use of
the word “canonical” here is a rather poor pun; it
refers both to the canonical line bundle (or
cotangent bundle) of the Riemann surface, although
here “tricanonical” would be more accurate, and
also to the fact that no choices are involved, except
that a choice of basis is needed to identify the
projective space with the standard one $\mathbb{P}_{5g-6}$. This
enables us to identify $M_g$ with the quotient of an
algebraic variety by the group PGL(n + 1; C), where
n = 5g - 6. However, here we do not have a discrete
group action, and to construct the quotient we must
use Mumford's geometric invariant theory (see below), which
was developed in the 1960s in order to provide
algebraic constructions of this moduli space and
others.
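
The numbers 6(g - 1) and 5g - 6 come from applying Riemann-Roch to the third power of the canonical bundle. The sketch below reproduces that arithmetic; the function name `pluricanonical_data` and its parameter `m` (the power of the canonical bundle) are illustrative, and the function only computes the degree and the dimension of the ambient projective space, without checking that the map is an embedding.

```python
def pluricanonical_data(g, m=3):
    """Return (degree, projective dimension) for the m-th power of the
    canonical bundle on a curve of genus g >= 2, with m >= 2.

    deg(K^m) = m(2g - 2) exceeds 2g - 2, so H^1 vanishes and Riemann-Roch
    gives h^0 = deg - g + 1 sections; the target space has dimension h^0 - 1.
    """
    if g < 2:
        raise ValueError("this shortcut assumes genus g >= 2")
    if m < 2:
        raise ValueError("this shortcut assumes m >= 2")
    degree = m * (2 * g - 2)
    sections = degree - g + 1
    return degree, sections - 1
```

With m = 3 this returns (6g - 6, 5g - 6), matching the tricanonical degree and ambient dimension quoted above.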

In fact, geometric invariant theory also provides a
beautiful compactification of $M_g$ known as the
Deligne-Mumford (1969) compactification $\overline{M}_g$.
This compactification is itself a moduli space: it is
the moduli space of (Deligne-Mumford) stable
curves, which are complex projective curves with
only nodal singularities and at most finitely many
automorphisms. $\overline{M}_g$ is singular but in a relatively
mild way; it is the quotient of a nonsingular variety
by a finite group action.

The moduli space $M_{g,n}$ of nonsingular complex
projective curves of genus g with n marked points
has a similar compactification $\overline{M}_{g,n}$ which is the
moduli space of complex projective curves with n
marked nonsingular points and with only nodal
singularities and finitely many automorphisms.
Finiteness of the automorphism group of such a
curve X is equivalent to the requirement that any
irreducible component of genus 0 (respectively 1)
has at least 3 (respectively 1) special points, where
“special” means either marked or singular in X.
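
The combinatorial stability criterion just stated can be checked component by component. The sketch below assumes a hypothetical encoding in which each irreducible component is given as a pair `(genus, special_points)`, where `special_points` counts the marked points and nodes lying on that component (a node joining a component to itself is assumed to be counted twice).

```python
def is_stable_pointed_curve(components):
    """components: iterable of (genus, special_points) pairs, one per
    irreducible component of a nodal curve with marked points."""
    for genus, special_points in components:
        if genus == 0 and special_points < 3:
            return False   # a rational component needs at least 3 special points
        if genus == 1 and special_points < 1:
            return False   # an elliptic component needs at least 1 special point
    return True            # components of genus >= 2 impose no condition
```

For instance, a genus-2 curve with a rational tail attached at one node and carrying one marked point has components `[(2, 1), (0, 2)]`, which fails the test because the rational component has only two special points.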

The construction of $M_g$ using the period matrices
of curves and the Torelli theorem leads to a different
compactification $\widetilde{M}_g$ of $M_g$ known as the Satake
(or Satake-Baily-Borel) compactification. Like the
Deligne-Mumford compactification $\overline{M}_g$, $\widetilde{M}_g$ is a
complex projective variety, but the boundary of $M_g$ in
$\widetilde{M}_g$ has (complex) codimension 2 for $g \ge 3$ whereas
the boundary $\Delta$ of $M_g$ in $\overline{M}_g$ has codimension 1.
Each of the irreducible components $\Delta_0, \ldots, \Delta_{[g/2]}$ of
$\Delta$ is the closure of a locus of curves with exactly one
node (irreducible curves with one node in the case of
$\Delta_0$, and in the case of any other $\Delta_i$ the union of two
nonsingular curves of genus i and g - i meeting at a
single point). The divisors $\Delta_i$ meet transversely in
$\overline{M}_g$, and their intersections define a natural
decomposition of $\Delta$ into connected strata which
parametrize stable curves of a fixed topological type.

For a recent guide to many different aspects of the
moduli spaces $M_g$, see Harris and Morrison (1998).


Example 7 Given any nonsingular complex projective
variety X, we can study the moduli spaces of
maps from curves to X considered by Kontsevich.
Intersection theory on these moduli spaces leads to
Gromov-Witten theory and the quantum cohomology
of X, with many applications, for example, to
enumerative geometry (cf. Cox and Katz (1999),
Fulton and Pandharipande (1997), Dijkgraaf et al.
(1995)).

More precisely, if 2g - 2 + n > 0 then for any
$\beta \in H_2(X; \mathbb{Z})$ there is a moduli space $M_{g,n}(X, \beta)$ of
n-pointed nonsingular complex projective curves $\Sigma$
of genus g equipped with maps $f : \Sigma \to X$ satisfying
$f_*[\Sigma] = \beta$. This moduli space has a compactification
$\overline{M}_{g,n}(X, \beta)$ which classifies “stable maps” of type $\beta$
from n-pointed curves of genus g into X (Fulton
and Pandharipande 1997). Here, a map $f : \Sigma \to X$
from an n-pointed complex projective curve $\Sigma$
satisfying $f_*[\Sigma] = \beta$ is called stable if $\Sigma$ has only
nodal singularities and $f : \Sigma \to X$ has only finitely
many automorphisms, or equivalently every irreducible
component of $\Sigma$ of genus 0 (respectively
genus 1) which is mapped to a single point in X by
f contains at least three (respectively one) special
points. The forgetful map from $M_{g,n}(X, \beta)$ to $M_{g,n}$
which sends $[\Sigma, p_1, \ldots, p_n, f : \Sigma \to X]$ to $[\Sigma, p_1, \ldots, p_n]$
extends to a forgetful map $\pi : \overline{M}_{g,n}(X, \beta) \to \overline{M}_{g,n}$
which collapses components of $\Sigma$ with genus 0 and
at most two special points.

Of course, when X is itself a single point,
$M_{g,n}(X, \beta)$ and $\overline{M}_{g,n}(X, \beta)$ are simply the moduli
spaces $M_{g,n}$ and $\overline{M}_{g,n}$. In general $\overline{M}_{g,n}(X, \beta)$ has
more serious singularities than $\overline{M}_{g,n}$ and may indeed
have many different irreducible components with
different dimensions. In spite of this, $\overline{M}_{g,n}(X, \beta)$ has
a “virtual fundamental class” $[\overline{M}_{g,n}(X, \beta)]^{vir}$ lying in
the expected dimension


$$3g - 3 + n + (1 - g)\dim X + \int_\beta c_1(TX)$$

of $\overline{M}_{g,n}(X, \beta)$. Gromov-Witten invariants (originally
developed mainly in the case g = 0 when
$\overline{M}_{g,n}(X, \beta)$ is more tractable, but now also studied
when g > 0) are obtained by evaluating cohomology
classes on $\overline{M}_{g,n}(X, \beta)$ against this virtual fundamental
class.
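
The expected-dimension formula is simple arithmetic once the pairing of $c_1(TX)$ with $\beta$ is known. The sketch below assumes that pairing is supplied as an integer argument `c1_beta`; the function name and argument names are illustrative only.

```python
def expected_dimension(g, n, dim_X, c1_beta):
    """Expected (complex) dimension of the space of stable maps of genus g
    with n marked points to a target of dimension dim_X, where c1_beta is
    the integral of c_1(TX) over the class beta."""
    return 3 * g - 3 + n + (1 - g) * dim_X + c1_beta
```

As a sanity check under these assumptions: for the projective plane, $c_1(TX)$ pairs with the class of a degree-d curve to give 3d, so genus-0 maps of degree 1 with no marked points have expected dimension 2, the dimension of the family of lines in the plane. When X is a point the formula reduces to 3g - 3 + n.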


Moduli Spaces as Orbit Spaces 


Example 8 As a simple example, let us consider the
moduli space of “hyperelliptic” curves of genus g.
By a hyperelliptic curve of genus g, we mean a
nonsingular complex projective curve C with a
double cover $f : C \to \mathbb{P}_1$ branched over 2g + 2 points
in the complex projective line $\mathbb{P}_1$.

Let S be the set of unordered sequences of 2g + 2
distinct points in $\mathbb{P}_1$, which we can identify with an
open subset of the complex projective space $\mathbb{P}_{2g+2}$ by
associating to an unordered sequence $a_1, \ldots, a_{2g+2}$ of
points in $\mathbb{P}_1$ the coefficients of the polynomial whose
roots are $a_1, \ldots, a_{2g+2}$. Then, it is not hard to
construct a family X of hyperelliptic curves of genus
g with base space S such that the curve parametrized
by $a_1, \ldots, a_{2g+2}$ is a double cover of $\mathbb{P}_1$ branched
over $a_1, \ldots, a_{2g+2}$. This family is not quite a universal
family, but it does have the following two properties.


(i) The hyperelliptic curves $X_s$ and $X_t$ parametrized
by elements s and t of the base space S are
isomorphic if and only if s and t lie in the same
orbit of the natural action of G = SL(2; C) on S.

(ii) (Local universal property) Any family of hyper- 
elliptic curves of genus g is locally equivalent to 
the pullback of X along a morphism to S. 


These properties (i) and (ii) imply that a (coarse)
moduli space M exists if and only if there is an
“orbit space” for the action of G on S (Newstead
1978). Here, by an orbit space we mean a
G-invariant morphism $\phi : S \to M$ such that every
other G-invariant morphism $\psi : S \to N$ factors
uniquely through $\phi$, and moreover $\phi^{-1}(m)$ is a single
G-orbit for each $m \in M$. (We can think of an orbit
space as the set of G-orbits endowed in a natural
way with the structure of an algebraic variety.)

This sort of situation arises quite often in moduli 
problems, and the construction of a moduli space is 
then reduced to the construction of an orbit space. 
Unfortunately, such orbit spaces do not in general 
exist. The main problem (which is closely related to 
the jump phenomenon discussed above) is that there 
may be orbits contained in the closures of other 
orbits, which means that the natural topology on the 
set of all orbits is not Hausdorff, so this set cannot 
be endowed naturally with the structure of a variety. 
This is the situation the geometric invariant theory 
of Mumford (1965) attempts to deal with, telling us 
how to throw out certain “unstable” orbits in order 
to be able to construct an orbit space. For more 
general constructions of orbit spaces which can be 
used for moduli problems where geometric invariant 
theory may not be of use, see Keel and Mori (1997) 
and Kollár (1997).


Example 9 Let G = SL(2; C) act on $(\mathbb{P}_1)^4$ via
Möbius transformations on the Riemann sphere


$$\mathbb{P}_1 = \mathbb{C} \cup \{\infty\}$$


Then,


$$\{(x_1, x_2, x_3, x_4) \in (\mathbb{P}_1)^4 : x_1 = x_2 = x_3 = x_4\}$$


is a single orbit which is contained in the closure of
every other orbit. On the other hand, the open subset


$$\{(x_1, x_2, x_3, x_4) \in (\mathbb{P}_1)^4 : x_1, x_2, x_3, x_4 \text{ distinct}\}$$


of $(\mathbb{P}_1)^4$ has an orbit space which can be identified
with


$$\mathbb{P}_1 - \{0, 1, \infty\}$$

via the cross ratio.
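
The invariance of the cross ratio under Möbius transformations, which is what makes it a well-defined function on orbits, can be checked numerically. The sketch below uses one common normalization of the cross ratio (conventions differ by a permutation of the four points) and works only with finite points, so the point at infinity is deliberately left out; the function names are illustrative.

```python
def mobius(a, b, c, d, z):
    """Apply the Mobius transformation z -> (a z + b) / (c z + d)."""
    if a * d - b * c == 0:
        raise ValueError("the matrix [[a, b], [c, d]] must be invertible")
    return (a * z + b) / (c * z + d)


def cross_ratio(x1, x2, x3, x4):
    """Cross ratio of four distinct finite points, in the normalization
    (x1 - x3)(x2 - x4) / ((x1 - x4)(x2 - x3))."""
    return ((x1 - x3) * (x2 - x4)) / ((x1 - x4) * (x2 - x3))


# Usage: the cross ratio is unchanged when all four points are moved
# by the same Mobius transformation.
points = [0j, 1 + 0j, 2 + 1j, 5 - 2j]
moved = [mobius(1, 2, 3, 4, z) for z in points]
difference = abs(cross_ratio(*points) - cross_ratio(*moved))
```

Each difference of images picks up the same factor (ad - bc) divided by the product of the two denominators, and these factors cancel in the ratio, which is why `difference` vanishes up to rounding.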


In order to describe Mumford's geometric invariant
theory, let X be a complex projective variety
(i.e., a subset of a complex projective space defined
by the vanishing of homogeneous polynomial
equations), and let G be a complex reductive group
acting on X. We also require a “linearization” of the
action; that is, an ample line bundle L on X and a
lift of the action of G to L. We lose very little
generality in assuming that for some projective
embedding $X \subseteq \mathbb{P}_n$ the action of G on X extends
to an action on $\mathbb{P}_n$ given by a representation


$$\rho : G \to GL(n+1)$$


and taking for L the hyperplane line bundle on $\mathbb{P}_n$.
Algebraic geometry associates to $X \subseteq \mathbb{P}_n$ its
homogeneous coordinate ring


$$A(X) = \bigoplus_{k \ge 0} H^0(X, L^{\otimes k}) = \mathbb{C}[x_0, \ldots, x_n]/\mathcal{I}_X$$


which is the quotient of the polynomial ring
$\mathbb{C}[x_0, \ldots, x_n]$ in n + 1 variables by the ideal $\mathcal{I}_X$
generated by the homogeneous polynomials vanishing
on X. Since the action of G on X is given by a
representation $\rho : G \to GL(n+1)$, we get an induced
action of G on $\mathbb{C}[x_0, \ldots, x_n]$ and on A(X), and we
can therefore consider the subring $A(X)^G$ of A(X)
consisting of the elements of A(X) left invariant by
G. This subring $A(X)^G$ is a graded complex algebra,
and because G is reductive it is finitely generated
(Mumford 1965). To any finitely generated graded
complex algebra we can associate a complex
projective variety, and so we can define X//G to
be the variety associated to the ring of invariants
$A(X)^G$. The inclusion of $A(X)^G$ in A(X) defines a
“rational” map $\phi$ from X to X//G, but because
there may be points of $X \subseteq \mathbb{P}_n$ where every
G-invariant polynomial vanishes, this map will not
in general be well defined everywhere on X (i.e., it
will not be a morphism).

We define the set $X^{ss}$ of “semistable” points in X
to be the set of those $x \in X$ for which there exists
some $f \in A(X)^G$ not vanishing at x. Then, the
rational map $\phi$ restricts to a surjective G-invariant
morphism from the open subset $X^{ss}$ of X to the
quotient variety X//G. However, $\phi : X^{ss} \to X/\!/G$ is
still not in general an orbit space: when x and y are
semistable points of X, we have $\phi(x) = \phi(y)$ if and
only if the closures $\overline{O_G(x)}$ and $\overline{O_G(y)}$ of the G-orbits
of x and y meet in $X^{ss}$. Topologically, X//G is the
quotient of $X^{ss}$ by the equivalence relation for which
x and y in $X^{ss}$ are equivalent if and only if $\overline{O_G(x)}$
and $\overline{O_G(y)}$ meet in $X^{ss}$.

We define a “stable” point of X to be a point x of
$X^{ss}$ with a neighborhood in $X^{ss}$ such that every
G-orbit meeting this neighborhood is closed in $X^{ss}$,
and is of maximal dimension equal to the dimension
of G. If U is any G-invariant open subset of the set
$X^s$ of stable points of X, then $\phi(U)$ is an open subset
of X//G and the restriction $\phi|_U : U \to \phi(U)$ of $\phi$ to U
is an orbit space for the action of G on U in the sense
described above, so that it makes sense to write U/G
for $\phi(U)$. In particular, there is an orbit space $X^s/G$
for the action of G on $X^s$, and X//G can be thought
of as a compactification of this orbit space.


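$$\begin{array}{ccccc}
X^{s} & \subseteq & X^{ss} & \subseteq & X \\
\downarrow & & \downarrow & & \\
X^{s}/G & \subseteq & X^{ss}/\!\sim & = & X/\!/G
\end{array}$$

Here the inclusions $X^s \subseteq X^{ss}$, $X^{ss} \subseteq X$ and
$X^s/G \subseteq X/\!/G$ are all open.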


Example 10 Let us return to hyperelliptic curves
of genus g. We have seen that the construction of a
moduli space reduces to the construction of an
orbit space for the action of G = SL(2; C) on an
open subset S of $\mathbb{P}_{2g+2}$. If we identify $\mathbb{P}_{2g+2}$ with the
space of unordered sequences of 2g + 2 points in
$\mathbb{P}_1$, then S is the subset consisting of unordered
sequences of distinct points. When the action of G
on $\mathbb{P}_{2g+2}$ is linearized in the obvious way, then an
unordered sequence of 2g + 2 points in $\mathbb{P}_1$ is
semistable if and only if at most g + 1 of the points
coincide anywhere on $\mathbb{P}_1$ and is stable if and only
if at most g of the points coincide anywhere on $\mathbb{P}_1$
(cf. Kirwan (1985), chapter 16). Thus, S is an open
subset of the set $\mathbb{P}_{2g+2}^{s}$ of stable points, so an orbit
space S/G exists with compactification the projective
variety $\mathbb{P}_{2g+2}/\!/G$.
This orbit space is then the moduli space of
hyperelliptic curves of genus g.
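
The stability criterion for point configurations depends only on the largest multiplicity with which a point occurs. The sketch below assumes the 2g + 2 points are supplied as exactly comparable labels (for example integers or fractions, with any agreed symbol for the point at infinity); the function name is illustrative.

```python
from collections import Counter


def classify_configuration(points, g):
    """Classify an unordered sequence of 2g + 2 points on the projective
    line as 'stable', 'semistable' or 'unstable' by its largest multiplicity."""
    if len(points) != 2 * g + 2:
        raise ValueError("expected exactly 2g + 2 points")
    max_multiplicity = max(Counter(points).values())
    if max_multiplicity <= g:
        return "stable"        # at most g points coincide
    if max_multiplicity <= g + 1:
        return "semistable"    # exactly g + 1 points coincide somewhere
    return "unstable"
```

In particular, for g at least 1 any configuration of distinct points has largest multiplicity 1 and is therefore stable, which is the containment used in the example.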


Other moduli spaces (such as moduli spaces of 
curves and of vector bundles; see e.g., Donaldson 
(1984), Gieseker (1983), Mumford (1965, 1977), 
and Newstead (1978)) can be constructed as orbit 
spaces via geometric invariant theory in a similar 
way. 


Symplectic Reduction and Moduli Spaces
of Vector Bundles 


Geometric invariant theoretic quotients are closely 
related to the process of reduction in symplectic 
geometry, and thus many moduli spaces can be 
described as symplectic reductions. 

Suppose that a compact, connected Lie group K
with Lie algebra $\mathfrak{k}$ acts smoothly on a symplectic
manifold X and preserves the symplectic form $\omega$. Let
us denote the vector field on X defined by the
infinitesimal action of $a \in \mathfrak{k}$ by


$$x \mapsto a_x$$


By a moment map for the action of K on X we mean
a smooth map


$$\mu : X \to \mathfrak{k}^*$$

which satisfies


$$d\mu(x)(\xi) \cdot a = \omega_x(\xi, a_x)$$


for all $x \in X$, $\xi \in T_xX$ and $a \in \mathfrak{k}$. In other words, if
$\mu_a : X \to \mathbb{R}$ denotes the component of $\mu$ along $a \in \mathfrak{k}$
defined for all $x \in X$ by the pairing


$$\mu_a(x) = \mu(x) \cdot a$$


between $\mu(x) \in \mathfrak{k}^*$ and $a \in \mathfrak{k}$, then $\mu_a$ is a Hamiltonian
function for the vector field on X induced by a. We
shall assume that all our moment maps are equivariant
moment maps; that is, $\mu : X \to \mathfrak{k}^*$ is K-equivariant
with respect to the given action of K on X and the
co-adjoint action of K on $\mathfrak{k}^*$.

It follows directly from the definition of a
moment map $\mu : X \to \mathfrak{k}^*$ that if the stabilizer $K_\zeta$ of
any $\zeta \in \mathfrak{k}^*$ acts freely on $\mu^{-1}(\zeta)$, then $\mu^{-1}(\zeta)$ is a
submanifold of X and the symplectic form $\omega$ induces
a symplectic structure on the quotient $\mu^{-1}(\zeta)/K_\zeta$.
With this symplectic structure, the quotient
$\mu^{-1}(\zeta)/K_\zeta$ is called the Marsden-Weinstein reduction,
or symplectic quotient, at $\zeta$ of the action of K
on X. We can also consider the quotient $\mu^{-1}(\zeta)/K_\zeta$
when the action of $K_\zeta$ on $\mu^{-1}(\zeta)$ is not free, but in
this case it is likely to have singularities.


Example 11 Consider the cotangent bundle $T^*Y$ of
any n-dimensional manifold Y with its canonical
symplectic form $\omega$ which is given by the standard
symplectic form


$$\omega = \sum_{j=1}^{n} dp_j \wedge dq_j \qquad [1]$$


with respect to any local coordinates $(q_1, \ldots, q_n)$ on
Y and the induced coordinates $(p_1, \ldots, p_n)$ on its
cotangent spaces. If Y is the configuration space of a
classical mechanical system, then $T^*Y$ is the phase
space of the system and the coordinates
$p = (p_1, \ldots, p_n) \in T_q^*Y$ are traditionally called the
momenta of the system.

If Y is acted on by a Lie group K, the induced
action on $T^*Y$ preserves $\omega$ and there is a moment
map $\mu : T^*Y \to \mathfrak{k}^*$ whose components $\mu_a$ along
$a \in \mathfrak{k}$ are given by pairing the moment coordinates p
with the vector fields on Y induced by the
infinitesimal action of K; that is,


$$\mu_a(p, q) = p \cdot a_q$$


for all $q \in Y$ and $p \in T_q^*Y$. When K = SO(3) acts by
rotations on $Y = \mathbb{R}^3$, then $\mu$ is the angular momentum,
or moment of momentum, about the origin.
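
The identification of this moment map with angular momentum can be checked directly. The sketch below assumes the usual identification of the Lie algebra of SO(3) with three-dimensional space, under which an element a generates the vector field sending q to the cross product of a with q; the helper names are illustrative.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def moment_component(a, p, q):
    """Component of the moment map along a: pair the momentum p with the
    infinitesimal rotation a x q of the position q."""
    return dot(p, cross(a, q))


def angular_momentum(p, q):
    """Classical angular momentum q x p about the origin."""
    return cross(q, p)
```

By the cyclic symmetry of the scalar triple product, `moment_component(a, p, q)` equals `dot(a, angular_momentum(p, q))` for every a, so under this identification the moment map is the vector q x p.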


The connection with geometric invariant theory
arises as follows. Let X be a nonsingular complex
projective variety embedded in complex projective
space $\mathbb{P}_n$, and let G be a complex Lie group acting
on X via a complex linear representation
$\rho : G \to GL(n+1; \mathbb{C})$. A necessary and sufficient condition
for G to be reductive is that it is the complexification
of a maximal compact subgroup K (e.g.,
$G = GL(m; \mathbb{C})$ is the complexification of the unitary
group U(m)). By an appropriate choice of coordinates
on $\mathbb{P}_n$, we may assume that $\rho$ maps K into the
unitary group U(n + 1). Then, the action of K
preserves the Fubini-Study form $\omega$ on $\mathbb{P}_n$, which
restricts to a symplectic form on X. There is a
moment map $\mu : X \to \mathfrak{k}^*$ defined (up to multiplication
by a constant scalar factor depending on
differences in convention on the normalization of
the Fubini-Study form) by


$$\mu(x) \cdot a = \frac{\bar{\hat{x}}^{T} \rho_*(a) \hat{x}}{2\pi i \|\hat{x}\|^2}$$


for all $a \in \mathfrak{k}$, where $\hat{x} \in \mathbb{C}^{n+1} - \{0\}$ is a representative
vector for $x \in \mathbb{P}_n$ and the representation
$\rho : K \to U(n+1)$ induces $\rho_* : \mathfrak{k} \to \mathfrak{u}(n+1)$ and dually
$\rho^* : \mathfrak{u}(n+1)^* \to \mathfrak{k}^*$.
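
Two features of this formula can be verified numerically: the value does not depend on the choice of representative vector, and it is real whenever the Lie algebra element acts by a skew-Hermitian matrix. The sketch below takes the matrix of the infinitesimal action as an input `a` (a list of rows), so the representation is assumed to be the identity on u(n + 1); the function name is illustrative.

```python
import math


def moment_pairing(x_hat, a):
    """Evaluate conj(x)^T a x / (2 pi i |x|^2) for a representative vector
    x_hat (list of complex numbers) and a skew-Hermitian matrix a."""
    n = len(x_hat)
    a_x = [sum(a[i][j] * x_hat[j] for j in range(n)) for i in range(n)]
    hermitian_form = sum(x_hat[i].conjugate() * a_x[i] for i in range(n))
    norm_squared = sum(abs(z) ** 2 for z in x_hat)
    return hermitian_form / (2j * math.pi * norm_squared)


# Usage: a diagonal skew-Hermitian matrix acting on the projective line.
a = [[1j, 0], [0, -1j]]
x_hat = [1 + 2j, 3 - 1j]
value = moment_pairing(x_hat, a)
rescaled = moment_pairing([(2 + 3j) * z for z in x_hat], a)
```

Here `value` and `rescaled` agree up to rounding, reflecting that the formula is well defined on projective space, and the imaginary part of `value` vanishes because a skew-Hermitian form takes purely imaginary values.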

In this situation, we have two possible quotient
constructions, giving us the geometric invariant
theory quotient X//G if we want to work in
algebraic geometry and the symplectic reduction
$\mu^{-1}(0)/K$ if we want to work in symplectic geometry.
In fact, these give us the same quotient space, at
least up to homeomorphism (and diffeomorphism
away from the singularities). More precisely, any
$x \in X$ is semistable if and only if the closure of its
G-orbit meets $\mu^{-1}(0)$, and the inclusion of $\mu^{-1}(0)$
into $X^{ss}$ induces a homeomorphism


$$\mu^{-1}(0)/K \to X/\!/G$$


There are other quotient constructions closely related
to symplectic reduction and geometric invariant
theory, which are useful when working with Kähler
or hyper-Kähler manifolds.

In physics, moduli spaces are often described as 
symplectic reductions of infinite-dimensional sym- 
plectic manifolds by infinite-dimensional groups 
(although the moduli spaces themselves are usually 
finite-dimensional). One example is given by moduli 
spaces of holomorphic vector bundles, which 
can also be described using Yang-Mills theory 
(cf. Atiyah and Bott (1982)). 

The Yang-Mills equations arose in physics as 
generalizations of Maxwell's equations. They have 
become important in differential and algebraic 
geometry formulated over arbitrary compact oriented 
Riemannian manifolds, and in particular over compact
Riemann surfaces and higher dimensional Kähler
manifolds. The fundamental theorem of Donaldson,
Uhlenbeck, and Yau that a holomorphic bundle over
a compact Kähler manifold admits an irreducible
Hermitian Yang-Mills connection if and only if it is 
stable can be thought of as an infinite-dimensional 
illustration of the link between symplectic reduction 
and geometric invariant theory. 

Let M be a compact oriented Riemannian manifold
and let E be a fixed complex vector bundle over
M with a Hermitian metric. Recall that a connection
A on E (or equivalently on its frame bundle) can be
defined by a covariant derivative
$d_A : \Omega^p_M(E) \to \Omega^{p+1}_M(E)$, where $\Omega^p_M(E)$ denotes the space of
$C^\infty$-sections of $\Lambda^p T^*M \otimes E$ (i.e., the space of
p-forms on M with values in E). This covariant
derivative satisfies the extended Leibniz rule


$$d_A(\alpha \wedge \beta) = (d\alpha) \wedge \beta + (-1)^p \alpha \wedge d_A\beta$$


for $\alpha \in \Omega^p_M$, $\beta \in \Omega^q_M(E)$, and therefore is determined
by its restriction $d_A : \Omega^0_M(E) \to \Omega^1_M(E)$. The
Leibniz rule implies that the difference of two
connections is given by an $E \otimes E^*$-valued 1-form
on M, and hence that the space of all connections on
E is an infinite-dimensional affine space $\mathcal{A}$ based on
the vector space $\Omega^1_M(E \otimes E^*)$. Similarly, the space of
all unitary connections on E (i.e., connections
compatible with the Hermitian metric on E) is an
infinite-dimensional affine space based on the space
of 1-forms with values in the bundle $\mathfrak{g}_E$ of
skew-adjoint endomorphisms of E. The Leibniz rule also
implies that the composition
$d_A \circ d_A : \Omega^0_M(E) \to \Omega^2_M(E)$ commutes with multiplication by smooth
functions, and thus we have


$$d_A \circ d_A(s) = F_A s$$


for all $C^\infty$ sections s of E, where $F_A \in \Omega^2_M(\mathfrak{g}_E)$ is
defined to be the curvature of the unitary connection
A. The Yang-Mills functional on the space $\mathcal{A}$ of all
unitary connections on E is defined as the $L^2$-norm
square of the curvature, given by the integral over M
of the product of the function $\|F_A\|^2$ and the volume
form on M defined by the Riemannian metric and the
orientation. The Yang-Mills equations are the
Euler-Lagrange equations for this functional, given by


$$d_A * F_A = 0$$


where $d_A$ has been extended in a natural way to
$\Omega^p_M(\mathfrak{g}_E)$. The gauge group $\mathcal{G}$, that is, the group of
unitary automorphisms of E, preserves the
Yang-Mills functional and the Yang-Mills equations.

If M is a complex manifold, we can identify the
space $\mathcal{A}^{(1,1)}$ of unitary connections on E with
curvature of type (1,1) with the space of holomorphic
structures on E, by associating to a holomorphic
structure $\mathcal{E}$ the unitary connection whose (0,1)-
component is given by the $\bar{\partial}$-operator defined by $\mathcal{E}$.
This space $\mathcal{A}^{(1,1)}$ is an infinite-dimensional complex
subvariety of the infinite-dimensional complex affine
space $\mathcal{A}$, acted on by the complexified gauge group
$\mathcal{G}_c$ (the group of complex $C^\infty$ automorphisms of E),
and two holomorphic structures are isomorphic if
and only if they lie in the same $\mathcal{G}_c$-orbit.

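When $(M, \omega)$ is a compact Kähler manifold, there
is a $\mathcal{G}$-invariant Kähler form $\Omega$ on $\mathcal{A}$ defined by


$$\Omega(\alpha, \beta) = \frac{1}{8\pi^2} \int_M \mathrm{tr}(\alpha \wedge \beta) \wedge \omega^{n-1}$$


where n is the complex dimension of M. The Lie
algebra of $\mathcal{G}$ is the space $\Omega^0_M(\mathfrak{g}_E)$ of sections of $\mathfrak{g}_E$,
and there is a moment map $\mu : \mathcal{A} \to (\Omega^0_M(\mathfrak{g}_E))^*$ for
the action of $\mathcal{G}$ on $\mathcal{A}$ given by the composition of


$$A \mapsto \frac{1}{8\pi^2} F_A \wedge \omega^{n-1} \in \Omega^{2n}_M(\mathfrak{g}_E)$$


with integration over M. On $\mathcal{A}^{(1,1)}$ the norm square
of this moment map agrees up to a constant factor
with the Yang-Mills functional, which is minimized
by the Hermitian Yang-Mills connections.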

As in the finite-dimensional situation, for a suitable
definition of stability, the moduli space of stable
holomorphic bundles of topological type $E$ over $M$
(which plays the role of the geometric invariant
theory quotient) can be identified with the moduli
space of (irreducible) Hermitian Yang-Mills connections
on $E$ (which plays the role of the symplectic
reduction). This was proved in general for vector
bundles over compact Kähler manifolds by Uhlenbeck
and Yau, with a different proof for nonsingular
complex projective varieties given by Donaldson.

Over a compact Riemann surface $M$ the situation is
relatively simple, as all connections on $E$ have
curvature of type (1,1) and so the infinite-dimensional
complex affine space $\mathcal{A}$ can be identified with the
space $\mathcal{C}$ of holomorphic structures on $E$. A moment
map for the action of the gauge group on $\mathcal{A}$ is given by
assigning to a connection $A \in \mathcal{A}$ its curvature
$F_A \in \Omega^2_M(\mathfrak{g}_E)$, and, after a suitable central constant has been
added, the Hermitian Yang-Mills connections are
exactly the zeros of the moment map.

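A holomorphic bundle $\mathcal{E}$ over a Riemann surface
$M$ is stable (respectively semistable) if $\mu(\mathcal{F}) < \mu(\mathcal{E})$
(respectively $\mu(\mathcal{F}) \le \mu(\mathcal{E})$) for every proper
subbundle $\mathcal{F}$ of $\mathcal{E}$, where

\[ \mu(\mathcal{F}) = \deg(\mathcal{F})/\mathrm{rank}(\mathcal{F}) \]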


When the theory of stability of holomorphic vector
bundles was first introduced, Narasimhan and
Seshadri proved that a holomorphic vector bundle
over $M$ is stable if and only if it arises from an
irreducible representation of a certain central extension
of the fundamental group $\pi_1(M)$. Atiyah and
Bott (1982) translated this in terms of connections to
show that a holomorphic vector bundle over $M$ is
stable if and only if it admits a unitary connection
with constant central curvature. They deduced from
this the existence of a homeomorphism between the
moduli space $\mathcal{M}(n,d)$ of stable bundles of rank $n$ and
degree $d$ over $M$ and the moduli space of irreducible
connections with constant central curvature on a
fixed $C^\infty$ bundle $E$ of rank $n$ and degree $d$ over $M$.


See also: BF Theories; Calibrated Geometry and Special 
Lagrangian Submanifolds; Cohomology Theories; Floer 
Homology; Gauge Theoretic Invariants of 4-Manifolds; 
Gauge Theory: Mathematical Applications; Geometric 
Measure Theory; Geometric Phases; Hamiltonian Group 
Actions; Instantons: Topological Aspects; Intersection 
Theory; Riemann Surfaces; Several Complex Variables: 
Basic Geometric Theory; Several Complex Variables: 

Introduction


Since the late 1970s, particular attention in the
theory of integrability has been paid to systems
admitting more than one Hamiltonian representation.
The first examples belonged to the class of
infinite-dimensional systems (i.e., partial differential
equations), like the Korteweg-de Vries (KdV)
equation, the Ablowitz-Kaup-Newell-Segur
system, and many other soliton equations (see
Bi-Hamiltonian Methods in Soliton Theory). It
was soon realized that finite-dimensional integrable
systems are also likely to possess a
bi-Hamiltonian representation. Moreover, a geometric
setting for the study of bi-Hamiltonian
systems was established, with the introduction of
the so-called bi-Hamiltonian manifolds. They are
Poisson manifolds with an additional Poisson
structure, fulfilling a suitable compatibility condition
with the initial Poisson bracket. An
important program for the study and the classification
of (finite-dimensional) bi-Hamiltonian
manifolds was started in the 1990s by Gelfand
and Zakharevich. They pointed out that the
geometry of such manifolds is extremely rich
and complicated.

In this article we present the basic facts 
concerning the bi-Hamiltonian geometry and its 
relations with the theory of integrable systems, 
referring to Recursion Operators in Classical 
Mechanics in this encyclopedia for the connections 
with separable systems of Jacobi. In the first 
section we give the definitions of bi-Hamiltonian 
manifold and bi-Hamiltonian system, and we 
present some properties of the former. The next 
section contains three concrete examples (the Euler 
top, the open Toda lattice, and a stationary KdV 
flow) and two important classes of bi-Hamiltonian 
manifolds, both related to Lie algebras. This is 
followed by a discussion of the iterative construc- 
tion of first integrals in involution for a given 
bi-Hamiltonian system. This procedure is particularly
efficient in the case of Poisson-Nijenhuis
manifolds, that is, those bi-Hamiltonian manifolds
whose second Poisson structure can be obtained by 
composing the first one with a suitable recursion 
operator. 

Bi-Hamiltonian Systems


First of all, we recall some fundamental definitions
from the theory of Poisson manifolds, which are the
natural setting for the study of Hamiltonian systems.
Let $M$ be a finite-dimensional $C^\infty$-differentiable
manifold and let $C^\infty(M)$ be the space of $C^\infty$-functions
from $M$ to $\mathbb{R}$. A Poisson bracket on $M$ is
a skew-symmetric $\mathbb{R}$-bilinear map

\[ \{\cdot,\cdot\} : C^\infty(M) \times C^\infty(M) \to C^\infty(M) \]

fulfilling the Jacobi identity

\[ \{\{F,G\},H\} + \{\{H,F\},G\} + \{\{G,H\},F\} = 0 \]

and the Leibniz rule

\[ \{FG,H\} = F\{G,H\} + \{F,H\}G \]


A Poisson manifold is a differentiable manifold
endowed with a Poisson bracket. Starting from a
Poisson bracket, one can introduce a tensor field $P$
of type (2,0), which we consider as a map from
$T^{*}M$ to $TM$, defined by

\[ \langle dG, P\,dF \rangle = \{F,G\} \]

or, using coordinates on $M$, by $P^{ij} = \{x^i, x^j\}$. This
tensor field is called the Poisson tensor associated
with $\{\cdot,\cdot\}$. It is skew-symmetric, and its components
satisfy the cyclic condition

\[ P^{il}\frac{\partial P^{jk}}{\partial x^l} + P^{jl}\frac{\partial P^{ki}}{\partial x^l} + P^{kl}\frac{\partial P^{ij}}{\partial x^l} = 0 \]

(with summation over the repeated index $l$),
meaning that the Schouten bracket $[P,P]$ vanishes.

On a Poisson manifold, the vector field
$X_H = \{H,\cdot\} = P\,dH$ is called the Hamiltonian
vector field associated with $H$. In coordinates,
$X_H^i = \{H, x^i\} = P^{ji}\,\partial H/\partial x^j$. The Jacobi identity is equivalent
to the statement that the map $H \mapsto X_H$, assigning
to a function $H$ its Hamiltonian vector field $X_H$, is
a Lie algebra homomorphism:

\[ X_{\{F,G\}} = [X_F, X_G] \qquad [1] \]

A Casimir function is a function $H$ such that
$X_H = 0$, that is, a function which is in involution
with any other function on $M$. In terms of the
Poisson tensor, a Casimir is a function whose
differential belongs to the kernel of $P$.

The most famous class of Poisson manifolds is 
certainly that of symplectic manifolds. They can be 
seen as nondegenerate Poisson manifolds. Indeed, if 
a Poisson tensor P is invertible, then its inverse 
defines a closed nondegenerate 2-form (i.e., a 
symplectic form). Moreover, any Poisson manifold 
turns out to be foliated in symplectic leaves. 

Let us now introduce the bi-Hamiltonian manifolds,
which can be considered as a geometric setting for the
study of integrable Hamiltonian systems. A manifold
$M$ endowed with two Poisson brackets, $\{\cdot,\cdot\}$ and $\{\cdot,\cdot\}'$,
is said to be bi-Hamiltonian if the brackets are
compatible, that is, if any linear combination (with
constant coefficients) of them is still a Poisson bracket.
Such a linear combination automatically satisfies all
properties of a Poisson bracket except the Jacobi
identity. This is fulfilled if and only if the following
compatibility condition holds:

\[ \{F,\{G,H\}\}' + \{H,\{F,G\}\}' + \{G,\{H,F\}\}' + \{F,\{G,H\}'\} + \{H,\{F,G\}'\} + \{G,\{H,F\}'\} = 0 \qquad [2] \]

for any triple $(F,G,H)$ of functions on $M$. This
amounts to saying that the sum of the two Poisson
brackets is also a Poisson bracket. In this case the
two (compatible) Poisson brackets are said to form a
Poisson pair.

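There are some interesting equivalent forms of the
compatibility condition [2]. First of all, in terms of
the components of the Poisson tensors $P$ and $P'$, it
reads

\[ P^{il}\frac{\partial P'^{jk}}{\partial x^l} + P^{jl}\frac{\partial P'^{ki}}{\partial x^l} + P^{kl}\frac{\partial P'^{ij}}{\partial x^l} + P'^{il}\frac{\partial P^{jk}}{\partial x^l} + P'^{jl}\frac{\partial P^{ki}}{\partial x^l} + P'^{kl}\frac{\partial P^{ij}}{\partial x^l} = 0 \]

that is, the Schouten bracket $[P,P']$ vanishes. Moreover,
if $X_F = P\,dF$ is the Hamiltonian vector field
associated with $F \in C^\infty(M)$ by means of $P$ and
$Y_F = P'\,dF$ is the one obtained by $P'$, the compatibility
condition takes the form

\[ [X_F, Y_G] + [Y_F, X_G] = X_{\{F,G\}'} + Y_{\{F,G\}} \qquad \forall F,G \in C^\infty(M) \qquad [3] \]

to be compared with [1]. Moreover, in terms of Lie
derivatives we have the equivalent condition

\[ L_{X_F}P' + L_{Y_F}P = 0 \qquad \forall F \in C^\infty(M) \qquad [4] \]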


Now we turn our attention to special vector fields
that can be selected on a bi-Hamiltonian manifold
$M$. Let $P$ and $P'$ be the Poisson tensors associated
with the (compatible) Poisson brackets of $M$. A
vector field $X$ on $M$ is said to be bi-Hamiltonian if it
is Hamiltonian with respect to both Poisson structures,
that is, if there exist two functions $H_0$ and $H_1$
such that

\[ X = P\,dH_1 = P'\,dH_0 \qquad [5] \]

We will see in the following that such vector fields
are likely to have a number of first integrals in
involution, and thus they are good candidates for a
geometric description of integrable systems. The next
section is devoted to examples of bi-Hamiltonian
(and multi-Hamiltonian) systems.


Examples 


The first example is the Euler top, that is, the free
motion of a rigid body with a fixed point. The
equations of motion are

\[ \dot\Gamma_1 = \frac{I_2 - I_3}{I_2 I_3}\,\Gamma_2\Gamma_3 \]

and its cyclic permutations. They define a vector
field in $\mathbb{R}^3$, which is well known to be Hamiltonian
with respect to the Lie-Poisson structure on the
(dual of the) Lie algebra of $3 \times 3$ skew-symmetric
matrices. This means that

\[ \dot\Gamma_j = \{H, \Gamma_j\}, \qquad j = 1,2,3 \]

where

\[ H = \frac{1}{2}\left(\frac{\Gamma_1^2}{I_1} + \frac{\Gamma_2^2}{I_2} + \frac{\Gamma_3^2}{I_3}\right) \]

is the kinetic energy and the bracket $\{\cdot,\cdot\}$ is defined
by $\{\Gamma_1,\Gamma_2\} = \Gamma_3$ and its cyclic permutations. Another
Hamiltonian representation is given by

\[ \dot\Gamma_j = \{K, \Gamma_j\}', \qquad j = 1,2,3 \]

where

\[ K = \frac{1}{2}\left(\Gamma_1^2 + \Gamma_2^2 + \Gamma_3^2\right) \]

and the new bracket $\{\cdot,\cdot\}'$ is defined by $\{\Gamma_1,\Gamma_2\}' = -\Gamma_3/I_3$
and its cyclic permutations. Any linear
combination of the two brackets has the form of
the second one, and it is very easy to show that the
Jacobi identity is satisfied for such a bracket.
Therefore, the Euler top is a bi-Hamiltonian system.
Let us also notice that

\[ \{K, \Gamma_j\} = \{H, \Gamma_j\}' = 0, \qquad j = 1,2,3 \]

that is, $K$ is a Casimir function for the Lie-Poisson
bracket and $H$ is a Casimir function for the new
Poisson bracket. Hence, we have the following
(recursion) relations:

\[ \{K,\Gamma_j\} = 0, \qquad \{H,\Gamma_j\} = \{K,\Gamma_j\}', \qquad 0 = \{H,\Gamma_j\}' \qquad [6] \]

From a geometrical point of view, the situation is as
follows. The symplectic leaves of $\{\cdot,\cdot\}$ are the level
surfaces of $K$, that is, spheres, while the symplectic
leaves of $\{\cdot,\cdot\}'$ are the ellipsoids $H = $ constant. Their
intersections are Lagrangian submanifolds for both
symplectic leaves (in the compact case they are the
Arnol'd-Liouville tori of the integrable systems, that
in this case coincide with the trajectories).
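The two Hamiltonian representations of the Euler top can be compared at a sample point. The sketch below is only an illustration: the moments of inertia and the point $(\Gamma_1,\Gamma_2,\Gamma_3)$ are arbitrary hypothetical values, the bracket matrices encode $\{\Gamma_1,\Gamma_2\}=\Gamma_3$ and $\{\Gamma_1,\Gamma_2\}'=-\Gamma_3/I_3$ (with cyclic permutations), and the helper names are not from the article. It evaluates $\{H,\Gamma_j\}$ and $\{K,\Gamma_j\}'$ with exact rational arithmetic and also checks the two Casimir statements.

```python
from fractions import Fraction as Fr

# Hypothetical sample data (assumptions, chosen only for illustration).
I = (Fr(1), Fr(2), Fr(3))      # principal moments of inertia I_1, I_2, I_3
G = (Fr(2), Fr(-1), Fr(5))     # a point (Gamma_1, Gamma_2, Gamma_3)

def lie_poisson(G):
    """B[i][j] = {Gamma_i, Gamma_j}, with {Gamma_1, Gamma_2} = Gamma_3 and cyclic."""
    g1, g2, g3 = G
    return [[0, g3, -g2], [-g3, 0, g1], [g2, -g1, 0]]

def second_bracket(G, I):
    """B[i][j] = {Gamma_i, Gamma_j}', with {Gamma_1, Gamma_2}' = -Gamma_3/I_3 and cyclic."""
    a1, a2, a3 = G[0] / I[0], G[1] / I[1], G[2] / I[2]
    return [[0, -a3, a2], [a3, 0, -a1], [-a2, a1, 0]]

def flow(dF, B):
    """Components {F, Gamma_j} = sum_i (dF)_i * B[i][j]."""
    return [sum(dF[i] * B[i][j] for i in range(3)) for j in range(3)]

dH = [G[i] / I[i] for i in range(3)]   # gradient of H = (1/2) sum Gamma_i^2 / I_i
dK = list(G)                            # gradient of K = (1/2) sum Gamma_i^2

X_from_H = flow(dH, lie_poisson(G))         # {H, Gamma_j}
X_from_K = flow(dK, second_bracket(G, I))   # {K, Gamma_j}'

# Euler equations written out directly (first line and its cyclic permutations).
euler = [
    (1 / I[2] - 1 / I[1]) * G[1] * G[2],
    (1 / I[0] - 1 / I[2]) * G[2] * G[0],
    (1 / I[1] - 1 / I[0]) * G[0] * G[1],
]

casimir_K = flow(dK, lie_poisson(G))        # {K, Gamma_j}
casimir_H = flow(dH, second_bracket(G, I))  # {H, Gamma_j}'
```

Both `X_from_H` and `X_from_K` should coincide with `euler`, and the two Casimir flows should vanish identically, which is the content of the recursion relations [6].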

Let us now consider the (three-particle) open
Toda lattice. It consists of three particles (with
masses equal to 1) moving on the line under a
nearest-neighbor interaction of exponential type.
The Hamiltonian is given by

\[ H = \frac{1}{2}\left(p_1^2 + p_2^2 + p_3^2\right) + \exp(q_1 - q_2) + \exp(q_2 - q_3) \]

and the system is of course Hamiltonian with
respect to the canonical Poisson structure of $\mathbb{R}^6$,

\[ P = \begin{pmatrix}
0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0
\end{pmatrix} \]

But the Toda vector field can also be written as
$P'\,dK$, where $K = p_1 + p_2 + p_3$ is the total momentum
and

\[ P' = \begin{pmatrix}
0 & 1 & 1 & -p_1 & 0 & 0 \\
-1 & 0 & 1 & 0 & -p_2 & 0 \\
-1 & -1 & 0 & 0 & 0 & -p_3 \\
p_1 & 0 & 0 & 0 & e^{q_1 - q_2} & 0 \\
0 & p_2 & 0 & -e^{q_1 - q_2} & 0 & e^{q_2 - q_3} \\
0 & 0 & p_3 & 0 & -e^{q_2 - q_3} & 0
\end{pmatrix} \]

is a Poisson tensor, which turns out to be compatible
with $P$. The generalization to an arbitrary number of
particles is straightforward. Hence, the open Toda
lattice is a bi-Hamiltonian system. In the next section
we will show that this property can be used to
construct a maximal set of integrals of motion for the
Toda lattice, which are automatically in involution.
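The identity $P\,dH = P'\,dK$ for the three-particle Toda lattice can be checked numerically. The following sketch makes several assumptions that should be read as such: the matrix entries are taken as $P^{ij}=\{x^i,x^j\}$ in the coordinates $(q_1,q_2,q_3,p_1,p_2,p_3)$, the vector field of a function $F$ has components $\sum_i \partial_i F\,P^{ij}$ (the convention $\langle dG, P\,dF\rangle=\{F,G\}$ used above), and the sample point is arbitrary.

```python
import math

# Hypothetical sample point (assumption, chosen only for illustration).
q = (0.3, -0.2, 0.5)
p = (1.0, -0.5, 0.25)
a = math.exp(q[0] - q[1])   # exp(q1 - q2)
b = math.exp(q[1] - q[2])   # exp(q2 - q3)

# Canonical Poisson tensor, entries P[i][j] = {x^i, x^j}.
P = [
    [0, 0, 0, -1, 0, 0],
    [0, 0, 0, 0, -1, 0],
    [0, 0, 0, 0, 0, -1],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

# Second Poisson tensor of the Toda lattice at the sample point.
P2 = [
    [0, 1, 1, -p[0], 0, 0],
    [-1, 0, 1, 0, -p[1], 0],
    [-1, -1, 0, 0, 0, -p[2]],
    [p[0], 0, 0, 0, a, 0],
    [0, p[1], 0, -a, 0, b],
    [0, 0, p[2], 0, -b, 0],
]

def hamiltonian_field(dF, T):
    """Components X^j = sum_i (dF)_i * T[i][j]."""
    return [sum(dF[i] * T[i][j] for i in range(6)) for j in range(6)]

dH = [a, b - a, -b, p[0], p[1], p[2]]   # gradient of H
dK = [0, 0, 0, 1, 1, 1]                 # gradient of K = p1 + p2 + p3

X_H = hamiltonian_field(dH, P)    # P dH
X_K = hamiltonian_field(dK, P2)   # P' dK

# Toda equations: dq_i/dt = p_i, dp_i/dt = -dH/dq_i.
toda = [p[0], p[1], p[2], -a, a - b, b]
```

At any point the three lists `X_H`, `X_K`, and `toda` should agree up to rounding, which is the bi-Hamiltonian property stated above.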

The third example, a stationary reduction of the
KdV equation, comes from the field of soliton
equations. Let us recall that the first members of the
KdV hierarchy are

\[
\begin{aligned}
\frac{\partial u}{\partial t_1} &= u_x \\
\frac{\partial u}{\partial t_2} &= \frac{1}{4}\left(u_{xxx} - 6uu_x\right) \\
\frac{\partial u}{\partial t_3} &= \frac{1}{16}\left(u_{xxxxx} - 10uu_{xxx} - 20u_xu_{xx} + 30u^2u_x\right)
\end{aligned}
\qquad [7]
\]

It is well known how to find finite-dimensional
reductions for the KdV equation, giving rise to explicit
solutions. Indeed, the set of singular points of a given
vector field of the hierarchy is a finite-dimensional
manifold which is invariant under the flows of the
other vector fields, due to the fact that the flows
commute. The (finite-dimensional) systems obtained
by restricting the KdV hierarchy to such invariant
manifolds are called the stationary reductions of KdV.
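Let us consider explicitly the reduction corresponding
to the third vector field of the hierarchy. The set of its
critical points is given by

\[ u_{xxxxx} - 10uu_{xxx} - 20u_xu_{xx} + 30u^2u_x = 0 \qquad [8] \]

and its dimension is 5, since we can use the values of
$u$, $u_x$, $u_{xx}$, $u_{xxx}$, and $u_{xxxx}$ at a fixed point $x_0$ (i.e., the
Cauchy data) as global coordinates. For the sake of
simplicity, we set

\[ u_0 = u(x_0), \quad u_1 = u_x(x_0), \quad u_2 = u_{xx}(x_0), \quad u_3 = u_{xxx}(x_0), \quad u_4 = u_{xxxx}(x_0) \]

In order to compute the reduced equations of the
first flow of [7], we have to take its $x$-derivatives and
to use the constraint [8] and its differential
consequences to eliminate all the derivatives of
order higher than 4. We obtain the equations

\[
\frac{\partial u_0}{\partial t_1} = u_1, \quad
\frac{\partial u_1}{\partial t_1} = u_2, \quad
\frac{\partial u_2}{\partial t_1} = u_3, \quad
\frac{\partial u_3}{\partial t_1} = u_4, \quad
\frac{\partial u_4}{\partial t_1} = 10u_0u_3 + 20u_1u_2 - 30u_0^2u_1
\qquad [9]
\]

In the same way, for the KdV equation we get

\[
\begin{aligned}
\frac{\partial u_0}{\partial t_2} &= \frac{1}{4}\left(u_3 - 6u_0u_1\right) \\
\frac{\partial u_1}{\partial t_2} &= \frac{1}{4}\left(u_4 - 6u_0u_2 - 6u_1^2\right) \\
\frac{\partial u_2}{\partial t_2} &= \frac{1}{4}\left(4u_0u_3 + 2u_1u_2 - 30u_0^2u_1\right) \\
\frac{\partial u_3}{\partial t_2} &= \frac{1}{4}\left(4u_0u_4 + 6u_1u_3 + 2u_2^2 - 30u_0^2u_2 - 60u_0u_1^2\right) \\
\frac{\partial u_4}{\partial t_2} &= \frac{1}{4}\left(10u_1u_4 + 10u_0^2u_3 + 10u_2u_3 - 100u_0u_1u_2 - 60u_1^3 - 120u_0^3u_1\right)
\end{aligned}
\qquad [10]
\]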


There are two compatible Poisson structures giving
a bi-Hamiltonian formulation of both systems. The
corresponding Poisson tensors are

\[ P = \begin{pmatrix}
0 & 0 & 0 & 2 & 0 \\
0 & 0 & -2 & 0 & -20u_0 \\
0 & 2 & 0 & 20u_0 & 20u_1 \\
-2 & 0 & -20u_0 & 0 & -140u_0^2 - 20u_2 \\
0 & 20u_0 & -20u_1 & 140u_0^2 + 20u_2 & 0
\end{pmatrix} \]

and

\[ P' = \begin{pmatrix}
0 & \tfrac{1}{2} & 0 & 3u_0 & 6u_1 \\
-\tfrac{1}{2} & 0 & -3u_0 & -3u_1 & -4u_2 - 15u_0^2 \\
0 & 3u_0 & 0 & u_2 + 15u_0^2 & u_3 + 30u_0u_1 \\
-3u_0 & 3u_1 & -u_2 - 15u_0^2 & 0 & u_4 - 40u_0u_2 + 30u_1^2 - 60u_0^3 \\
-6u_1 & 4u_2 + 15u_0^2 & -u_3 - 30u_0u_1 & -u_4 + 40u_0u_2 - 30u_1^2 + 60u_0^3 & 0
\end{pmatrix} \]

In fact, if we call $X_1$ and $X_2$ the vector fields given
by [9] and [10], then the following recursion
relations hold:

\[
\begin{aligned}
P\,dH_0 &= 0 \\
X_1 = P\,dH_1 &= P'\,dH_0 \\
X_2 = P\,dH_2 &= P'\,dH_1 \\
0 &= P'\,dH_2
\end{aligned}
\qquad [11]
\]

where

\[
\begin{aligned}
H_0 &= -u_4 + 10u_0u_2 + 5u_1^2 - 10u_0^3 \\
H_1 &= \frac{1}{4}\left(2u_0u_4 - 2u_1u_3 + u_2^2 - 20u_0^2u_2 + 15u_0^4\right) \\
H_2 &= \frac{1}{16}\left(2u_2u_4 - 6u_0^2u_4 - u_3^2 + 12u_0u_1u_3 - 16u_0u_2^2 - 12u_1^2u_2 + 60u_0^3u_2 - 36u_0^5\right)
\end{aligned}
\]

Therefore, the vector fields $X_1$ and $X_2$ are
bi-Hamiltonian. The geometry of this bi-Hamiltonian
manifold is similar to the one of the first example. The
symplectic leaves of both Poisson structures
have dimension 4, and the Lagrangian foliation
(given by the level submanifolds of $H_0$, $H_1$, and $H_2$)
is contained in the intersections of such leaves. This
Lagrangian foliation is called by Gelfand and Zakharevich
the "axis" of the bi-Hamiltonian manifold.

We also notice that the relations [11] can be
collected in the statement that the function
$H(\lambda) = H_0\lambda^2 + H_1\lambda + H_2$ is a Casimir of the Poisson
pencil $P_\lambda = P' - \lambda P$, that is,

\[ P_\lambda\, dH(\lambda) = 0 \]

The importance of the stationary reductions of
the KdV hierarchy lies in the fact that (as noticed
in the early works on the subject) the reduced
equations can be solved by means of the classical
method of separation of variables. We mention
that the separability of these systems is a particular
instance of a general result, which is
valid for quite a wide class of bi-Hamiltonian
manifolds.
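The recursion relations [11] for the stationary KdV reduction lend themselves to a direct check. The sketch below is an illustration under stated assumptions: the tensors and Hamiltonians are entered in the form $P^{ij}=\{u_{i-1},u_{j-1}\}$ with $H_0=-u_4+10u_0u_2+5u_1^2-10u_0^3$, $H_1$ and $H_2$ as polynomials with prefactors $1/4$ and $1/16$, the vector field of $F$ is taken to have components $\sum_i \partial_iF\,P^{ij}$, the gradients are written out by hand, and the sample point is arbitrary. Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction as Fr

# Hypothetical sample point (u0, u1, u2, u3, u4), chosen only for illustration.
u0, u1, u2, u3, u4 = Fr(1, 2), Fr(-3), Fr(2), Fr(5, 3), Fr(-7, 4)

c = 140 * u0**2 + 20 * u2
P = [
    [0, 0, 0, 2, 0],
    [0, 0, -2, 0, -20 * u0],
    [0, 2, 0, 20 * u0, 20 * u1],
    [-2, 0, -20 * u0, 0, -c],
    [0, 20 * u0, -20 * u1, c, 0],
]

d = u4 - 40 * u0 * u2 + 30 * u1**2 - 60 * u0**3
h = Fr(1, 2)
P2 = [
    [0, h, 0, 3 * u0, 6 * u1],
    [-h, 0, -3 * u0, -3 * u1, -4 * u2 - 15 * u0**2],
    [0, 3 * u0, 0, u2 + 15 * u0**2, u3 + 30 * u0 * u1],
    [-3 * u0, 3 * u1, -u2 - 15 * u0**2, 0, d],
    [-6 * u1, 4 * u2 + 15 * u0**2, -u3 - 30 * u0 * u1, -d, 0],
]

def field(dF, T):
    """Components X^j = sum_i (dF)_i * T[i][j]."""
    return [sum(dF[i] * T[i][j] for i in range(5)) for j in range(5)]

# Gradients of H_0, H_1, H_2 with respect to (u0, ..., u4), computed by hand.
dH0 = [10 * u2 - 30 * u0**2, 10 * u1, 10 * u0, 0, -1]
dH1 = [Fr(1, 4) * x for x in (
    2 * u4 - 40 * u0 * u2 + 60 * u0**3, -2 * u3, 2 * u2 - 20 * u0**2, -2 * u1, 2 * u0)]
dH2 = [Fr(1, 16) * x for x in (
    -12 * u0 * u4 + 12 * u1 * u3 - 16 * u2**2 + 180 * u0**2 * u2 - 180 * u0**4,
    12 * u0 * u3 - 24 * u1 * u2,
    2 * u4 - 32 * u0 * u2 - 12 * u1**2 + 60 * u0**3,
    -2 * u3 + 12 * u0 * u1,
    2 * u2 - 6 * u0**2)]

# Reduced flows [9] and [10].
X1 = [u1, u2, u3, u4, 10 * u0 * u3 + 20 * u1 * u2 - 30 * u0**2 * u1]
X2 = [Fr(1, 4) * x for x in (
    u3 - 6 * u0 * u1,
    u4 - 6 * u0 * u2 - 6 * u1**2,
    4 * u0 * u3 + 2 * u1 * u2 - 30 * u0**2 * u1,
    4 * u0 * u4 + 6 * u1 * u3 + 2 * u2**2 - 30 * u0**2 * u2 - 60 * u0 * u1**2,
    10 * u1 * u4 + 10 * u0**2 * u3 + 10 * u2 * u3
    - 100 * u0 * u1 * u2 - 60 * u1**3 - 120 * u0**3 * u1)]

zero = [0] * 5
checks = {
    "P dH0 = 0": field(dH0, P) == zero,
    "X1 = P dH1": field(dH1, P) == X1,
    "X1 = P' dH0": field(dH0, P2) == X1,
    "X2 = P dH2": field(dH2, P) == X2,
    "X2 = P' dH1": field(dH1, P2) == X2,
    "P' dH2 = 0": field(dH2, P2) == zero,
}
```

Every entry of `checks` should be `True`; together they are equivalent to $P_\lambda\,dH(\lambda)=0$ for $H(\lambda)=H_0\lambda^2+H_1\lambda+H_2$ at the chosen point.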


Next we present an important class of
bi-Hamiltonian manifolds. We recall that the
dual $\mathfrak{g}^{*}$ of a finite-dimensional Lie algebra $\mathfrak{g}$
possesses a canonical Poisson structure, called the
Lie-Poisson structure. It is defined as

\[ \{F,G\}(X) = \langle X, [dF(X), dG(X)] \rangle \qquad [12] \]

where $F,G \in C^\infty(\mathfrak{g}^{*})$ and their differentials at $X \in \mathfrak{g}^{*}$
are seen as elements of $\mathfrak{g}$. If $X_0$ is a fixed element in
$\mathfrak{g}^{*}$, the constant Poisson bracket

\[ \{F,G\}'(X) = \langle X_0, [dF(X), dG(X)] \rangle \qquad [13] \]

is compatible with the Lie-Poisson bracket. In fact, the
Poisson pencil $\{\cdot,\cdot\}_\lambda = \{\cdot,\cdot\}' - \lambda\{\cdot,\cdot\}$ is obtained from
$\{\cdot,\cdot\}$, up to the constant factor $-\lambda$, by applying (for $\lambda \neq 0$)
the translation $X \mapsto X - X_0/\lambda$;
hence, it is a Poisson bracket for every value of the
constant $\lambda$. The method of translation of the argument,
due to Manakov, provides a lot of bi-Hamiltonian
vector fields for this bi-Hamiltonian manifold. One has
to consider an $\mathrm{Ad}^{*}$-invariant function on $\mathfrak{g}^{*}$, that is, a
function $H \in C^\infty(\mathfrak{g}^{*})$ such that

\[ \langle X, [dH(X), x] \rangle = 0 \qquad \forall x \in \mathfrak{g},\ X \in \mathfrak{g}^{*} \]

It is clearly a Casimir function for the Lie-Poisson
bracket, and this implies that the function
$X \mapsto H(X - X_0/\lambda)$ is a Casimir of the Poisson pencil.
If this function can be developed as a Laurent series
in $\lambda$, its coefficients $H_k$ fulfill the recursion relations

\[ \{\cdot, H_{k+1}\} = \{\cdot, H_k\}' \qquad [14] \]

and thus give rise to a sequence of bi-Hamiltonian
vector fields.

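The last example is a generalization of the
previous one. For the sake of simplicity, we consider
a Lie algebra $\mathfrak{g}$ of matrices such that the trace of the
product is nondegenerate, and the space $M = \mathfrak{g}^2 = \mathfrak{g} \times \mathfrak{g}$.
If $F \in C^\infty(M)$, its differential at a point
$(x_0,x_1)$ can be identified with the element $(\partial F/\partial x_0, \partial F/\partial x_1)$
of $M$ given by

\[ \frac{d}{d\epsilon}\Big|_{\epsilon=0} F(x_0 + \epsilon v_0, x_1 + \epsilon v_1) = \mathrm{tr}\left(\frac{\partial F}{\partial x_0}\,v_0 + \frac{\partial F}{\partial x_1}\,v_1\right) \]

for all $v_0, v_1 \in \mathfrak{g}$. The manifold $M$ has a
three-dimensional family of pairwise compatible Poisson
brackets: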
OF OG 
Ox, ax 1 
o 


{F, Gh (xo, x1) = v H ma) 


OF OG 
US, G}5(x0,%1) = vx b. ce Oa e 


as 2E BU]. [2E GG 
'\ [8x ' Oxo Oxy Ox 


Notice that the first two brackets restrict to the 
submanifolds xọ= constant and give rise to the 
bi-Hamiltonian structure presented in the previous 
example (via the identification between $\mathfrak{g}$ and $\mathfrak{g}^{*}$
given by the trace of the product). This example can
be generalized to an arbitrary number $n$ of copies of
$\mathfrak{g}$. In this case there is an $(n+1)$-dimensional family
of pairwise compatible Poisson brackets, which can
be shown to be Lie-Poisson brackets with respect to
suitable Lie algebra structures on $\mathfrak{g}^n$. According to
Reyman and Semenov-Tian-Shansky, these brackets
can also be cast in the R-matrix formalism.

Also in this case, the Ad-invariant functions on $\mathfrak{g}$
give rise to functions in involution on our
multi-Hamiltonian manifold. For example, if $H_k^{(a)}$ denotes
the $\lambda^k$-coefficient of $\mathrm{tr}(x_1\lambda + x_0)^a$, then the
recursion relations


(a) Q 
(1H, orl = IH, thes 13 


hold, and they imply the existence of tri-Hamiltonian 
vector fields on M. 

Finally, we mention that the bi-Hamiltonian
structure of the stationary flow of KdV discussed
above can be obtained as a suitable reduction of
the multi-Hamiltonian structure on $\mathfrak{g}^2$, where
$\mathfrak{g} = \mathfrak{sl}(2,\mathbb{R})$. A similar statement holds for the other
stationary flows of the Gelfand-Dickey hierarchies.


i F, G}o(x0, x1) = EC 


RS 0, I-9,1 


Iterative Properties and Integrability 


In this section we show how to use the bi- 
Hamiltonian formulation of a given system to explain 
its integrability. In the cases similar to the open Toda 
lattice, where one of the Poisson structures is 
nondegenerate, one can introduce a recursion opera- 
tor and employ its powers in order to generate a 
chain of integrals of motion in involution. In the 
other examples, where the bi-Hamiltonian structure 
is degenerate, the conserved quantities turn out to be 
the coefficients of Casimir functions of the Poisson 
pencil. 

If $(M, \{\cdot,\cdot\}, \{\cdot,\cdot\}')$ is a bi-Hamiltonian manifold,
we call bi-Hamiltonian hierarchy a sequence $\{H_k\}_{k \ge 0}$
of functions on $M$ fulfilling the recursion relations

\[ \{\cdot, H_{k+1}\} = \{\cdot, H_k\}', \qquad k \ge 0 \qquad [15] \]

In terms of Poisson tensors we have that
$P\,dH_{k+1} = P'\,dH_k$. A bi-Hamiltonian hierarchy clearly
gives rise to an infinite sequence of bi-Hamiltonian
vector fields,

\[ X_k = P\,dH_k = P'\,dH_{k-1}, \qquad k \ge 1 \qquad [16] \]

The functions $H_k$ are in involution with respect to
both Poisson brackets. Indeed, for $k > j$, one has

\[ \{H_j, H_k\} = \{H_j, H_{k-1}\}' = \{H_{j+1}, H_{k-1}\} = \cdots = \{H_k, H_j\} \]

so that $\{H_j, H_k\} = 0$ for all $j,k \ge 0$, and therefore
$\{H_j, H_k\}' = 0$ for all $j,k \ge 0$. If $\{H_j\}_{j \ge 0}$ and $\{K_j\}_{j \ge 0}$ are
two bi-Hamiltonian hierarchies, then all functions
are in (bi-)involution provided that one of the two
hierarchies starts from a Casimir of $\{\cdot,\cdot\}$. In fact,
suppose that $H_0$ is such a Casimir. Then

\[ \{H_j, K_k\} = \{H_{j-1}, K_k\}' = \{H_{j-1}, K_{k+1}\} = \cdots = \{H_0, K_{k+j}\} = 0 \]

and

\[ \{H_j, K_k\}' = \{H_j, K_{k+1}\} = 0 \]

We observe that these proofs of the involutivity do 
not use the compatibility condition [2] between the 
Poisson structures. The point is that this condition is 
important for the existence of bi-Hamiltonian hier- 
archies. Indeed, the problem of the existence and the 
construction of bi-Hamiltonian hierarchies is quite 
delicate. We tackle it first in the case of a particular 
class of bi-Hamiltonian manifolds, the so-called 
Poisson-Nijenhuis manifolds. In turn, they are a 
generalization of nondegenerate  bi-Hamiltonian 
manifolds. 

Let $(M, P, P')$ be a bi-Hamiltonian manifold such
that $P$ is invertible. Then we can introduce the
tensor field $N = P'P^{-1}$, which is of type (1,1) and
will always be dealt with as an endomorphism of the
tangent bundle $TM$. This tensor field possesses some
remarkable properties. First of all, its Nijenhuis
torsion $T(N)$ vanishes; this means that

\[ T(N)(X,Y) = [NX, NY] - N[X,Y]_N = 0 \]

for any pair $(X,Y)$ of vector fields on $M$, where

\[ [X,Y]_N = [NX, Y] + [X, NY] - N[X,Y] \]

Sometimes a tensor field with vanishing Nijenhuis
torsion is called a recursion operator. Since $P$ defines
a symplectic structure on $M$, such a bi-Hamiltonian
manifold is called an $\omega N$ manifold.

The tensor field $N$ satisfies two compatibility
conditions with $P$. The first one is simply the
skew-symmetry of $P'$ and reads $NP = PN^{*}$, while
the second one is a restatement of [3],

\[ [X_F, X_G]_N = X_{\{F,G\}'} \qquad \forall F,G \in C^\infty(M) \]

A manifold is said to be a Poisson-Nijenhuis manifold
(briefly, a PN manifold) if it is endowed with a Poisson
tensor $P$ and a torsionless (1,1) tensor field $N$ which
are compatible, in the sense that the two above-mentioned
conditions hold. We have just seen that
every nondegenerate bi-Hamiltonian manifold (i.e.,
such that one of the two Poisson tensors is invertible) is
a PN manifold. On the other hand, if $(M, P, N)$ is a
PN manifold, then it can be shown that $P' = NP$ is
a Poisson tensor, which is compatible with $P$. In
other words, PN manifolds are particular examples of
bi-Hamiltonian manifolds. Moreover, one has that
$P^{(j)} = N^jP$ and $P^{(k)} = N^kP$ are, for every $j,k \ge 0$,
compatible Poisson tensors.

Let us now consider a function $H_0$, on a PN
manifold $(M, P, N)$, such that $N^{*}dH_0 = dH_1$ is
exact, where $N^{*} : T^{*}M \to T^{*}M$ is the adjoint of the
recursion operator $N$. This implies that

\[ X = P\,dH_1 = PN^{*}\,dH_0 = P'\,dH_0 \qquad [17] \]

is a bi-Hamiltonian vector field. By means of $N^{*}$ we can
define the 1-forms $\alpha_j = (N^{*})^j\,dH_0$, which can be shown
to be all closed. If they are exact, that is, $\alpha_j = dH_j$, then
the functions $H_j$ form a bi-Hamiltonian hierarchy and
thus are in involution. This shows that on a (simply
connected) PN manifold every bi-Hamiltonian vector
field of the form [17], with $N^{*}dH_0 = dH_1$, belongs to a
bi-Hamiltonian hierarchy and that its first integrals (in
involution) can be iteratively constructed with the
recursion operator. (The integrability of this vector
field clearly depends on the number of independent
integrals of motion.) Moreover, the vector field
$X_k = P\,dH_k = P'\,dH_{k-1}$ of the hierarchy is Hamiltonian
with respect to all Poisson structures $P^{(j)}$ with $j \le k$,
because $X_k = P^{(j)}\,dH_{k-j}$.

The example of the Toda lattice presented earlier
can be cast in the PN (more precisely, $\omega N$)
framework. One can introduce the recursion operator
$N$ and, in the three-particle case, one can define
the third integral of motion as $dJ = N^{*}dH$. Since $K$,
$H$, and $J$ belong to a bi-Hamiltonian hierarchy, they
are in involution, and this (along with their
functional independence) proves the integrability of
the Toda lattice.

In this example something more happens: the
integrals of motion are (up to multiplicative constants)
the traces of the powers of the recursion
operator $N$. This is a general fact, since the
vanishing of the torsion of $N$ implies that $N^{*}dI_k = dI_{k+1}$,
where $I_k = (1/k)\,\mathrm{tr}\,N^k$.
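For the three-particle Toda lattice the statement about traces can be illustrated numerically. The sketch below assumes the canonical tensor $P$ and the second tensor $P'$ in the coordinates $(q_1,q_2,q_3,p_1,p_2,p_3)$, forms $N = P'P^{-1}$ as a plain matrix product (traces of powers do not depend on whether $N$ or its transpose is used), and evaluates $I_1=\mathrm{tr}\,N$ and $I_2=\tfrac12\mathrm{tr}\,N^2$ at an arbitrary sample point. The helper names and the sample point are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical sample point (assumption, chosen only for illustration).
q = (0.3, -0.2, 0.5)
p = (1.0, -0.5, 0.25)
a = math.exp(q[0] - q[1])
b = math.exp(q[1] - q[2])

# Inverse of the canonical Poisson tensor P = [[0, -1], [1, 0]] (3x3 blocks).
P_inv = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [-1, 0, 0, 0, 0, 0],
    [0, -1, 0, 0, 0, 0],
    [0, 0, -1, 0, 0, 0],
]

# Second Poisson tensor P' of the Toda lattice at the sample point.
P2 = [
    [0, 1, 1, -p[0], 0, 0],
    [-1, 0, 1, 0, -p[1], 0],
    [-1, -1, 0, 0, 0, -p[2]],
    [p[0], 0, 0, 0, a, 0],
    [0, p[1], 0, -a, 0, b],
    [0, 0, p[2], 0, -b, 0],
]

N = matmul(P2, P_inv)            # recursion operator N = P' P^{-1}
N2 = matmul(N, N)

I1 = trace(N)                    # I_1 = tr N
I2 = 0.5 * trace(N2)             # I_2 = (1/2) tr N^2

K = sum(p)                                        # total momentum
H = 0.5 * sum(x * x for x in p) + a + b           # Toda Hamiltonian
```

With these conventions one finds $I_1 = 2K$ and $I_2 = 2H$, so the first two integrals are recovered from the traces up to the multiplicative constant 2; the third integral would come from $\tfrac13\mathrm{tr}\,N^3$ in the same way.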

Next we deal with the case where the
bi-Hamiltonian manifold $(M, P, P')$ is not of the
Poisson-Nijenhuis type, that is, both $P$ and $P'$ are
degenerate. Let us suppose that their symplectic
leaves have codimension 1. We also want to discuss
in this case an iteration problem, namely the
problem of constructing a bi-Hamiltonian hierarchy
starting from a Casimir $H_0$ of $P$. Let us consider the
Hamiltonian vector field $X_1 = P'\,dH_0 = Y_{H_0}$ (using
the notations introduced earlier). Thanks to the
form [4] of the compatibility condition between $P$
and $P'$, we have that

\[ L_{X_1}P = L_{Y_{H_0}}P = -L_{X_{H_0}}P' = 0 \]

meaning that $X_1$ is an infinitesimal symmetry of $P$.
Moreover, $X_1$ is tangent to the symplectic leaves of $P$,
since $\langle dH_0, X_1 \rangle = \langle dH_0, P'\,dH_0 \rangle = 0$. Under some
suitable topological assumptions, we can conclude that
there exists a function $H_1$ such that $X_1 = P\,dH_1$, that
is, $X_1$ is a bi-Hamiltonian vector field. Now the
procedure can be iterated, that is, in the same way one
can show that, if $X_2 = P'\,dH_1 = Y_{H_1}$, then there exists
a function $H_2$ such that $X_2 = P\,dH_2$, and so on. Thus,
one obtains a bi-Hamiltonian hierarchy $\{H_k\}_{k \ge 0}$,
which can either be infinite or end with a Casimir of
$P'$. In any case, the function $H(\lambda) = \sum_{k \ge 0} H_k\lambda^{-k}$ is a
Casimir of the Poisson pencil $P_\lambda = P' - \lambda P$. As seen
earlier, the typical situation is that the chain terminates
with a Casimir $H_n$ of $P'$, where $\dim M = 2n + 1$. In
other words, there is a Casimir of the Poisson pencil
which is a polynomial of degree $n$ in the parameter $\lambda$.

As a general procedure for constructing
bi-Hamiltonian hierarchies, one can look for the
Casimir functions $H(\lambda)$ of the Poisson pencil which
are deformations of Casimir functions of $P$, but it is
not clear when such a deformation does exist in the
case where the corank of the bi-Hamiltonian structure
is greater than 2. Nevertheless, suppose that
$H(\lambda) = \sum_{k \ge 0} H_k\lambda^{-k}$ is a Casimir of $P_\lambda$, that is, that $\{H_k\}_{k \ge 0}$ is
a bi-Hamiltonian hierarchy. Then, for all $\lambda$, the
bi-Hamiltonian vector fields $X_{k+1} = P\,dH_{k+1} = P'\,dH_k$
are Hamiltonian with respect to $P_\lambda$, with Hamiltonian
function $H^{(k)}(\lambda) = \sum_{j=0}^{k} H_j\lambda^{k-j}$,

\[ X_{k+1} = P_\lambda\, dH^{(k)}(\lambda) \]

Therefore, the vector fields $X_k$ are not only
bi-Hamiltonian, but they are Hamiltonian with
respect to any Poisson bracket of the pencil.


In this article we have described some basic 
properties of bi-Hamiltonian systems, defined on 
manifolds possessing a Poisson pair. There are other 
important vector fields on these manifolds (more 
precisely, on $\omega N$ manifolds). They are called cyclic
systems of Levi-Civita, and they give an intrinsic 
description of the separable systems of Jacobi. We 
refer to the article Recursion Operators in Classical 
Mechanics in this encyclopedia for these topics. 


See also: Bi-Hamiltonian Methods in Soliton Theory; 
Classical r-Matrices, Lie Bialgebras, and Poisson Lie 
Groups; Integrable Systems and Algebraic Geometry; 
Integrable Systems and Recursion Operators on 

Introduction: Multiple-Scale
and Multiscale Approaches 


The multiscale, or more precisely multiple-scale,
method is a technique of perturbation theory
based on the introduction of additional rescaled
variables, say time variables, formally considered as
independent variables, each describing a different
timescale (for the sake of simplicity, we will
mainly consider a dynamic framework and timescales;
everything can be transposed to spatial dependences
and scales). It was first developed to handle
singular situations in which dynamic regimes of
different characteristic scales coexist and intermingle
in such a way that straightforward perturbation
expansions are not uniformly convergent in time
(hence of limited relevance and use) due to the
so-called secular terms growing unbounded with
time; the freedom introduced together with the
extra variables indeed allows one to impose conditions
preventing these secular divergences and improving
the convergence of the perturbation series. It yields
a global perturbation solution describing jointly the
behavior at small and large scales. This technique
belongs to the far more wide-ranging class of
multiscale approaches; these can be divided into
four main subclasses:


1. Mean-field techniques exploiting scale separation 
between fast and slow components of the 
dynamics. The influence of the slow variables 
onto the fast dynamics, if any, is treated in a 
decoupled way within a parametric approxima- 
tion, allowing an adiabatic elimination of fast 
variables (see the section “Slow/fast variables"). 

2. Singular perturbations, in which individual fast 
components ultimately give rise to slow trends 
and influence the large-scale features. Scale 
separation here breaks down at long times and 
multiple-scale method is then a method of choice 
(see the next section). 

3. Matched expansions when regimes of different 
scales succeed (boundary-layer singularity; see 
the section “Boundary layers and matched 
expansions”). 

4. Renormalization techniques, in systems exhibiting
some kind of universality in the relations between 
their behaviors at different scales, for example, 
scale invariance (see the section “Renormalization: 
an iterated multiscale approach”). 


We will first present the principles of multiple- 
scale method, detail its technical implementation on 
simple abstract examples and cite some typical 
applications. Then we will articulate this technique 
with more general multiscale methods in a brief 
overview (see the section “A brief overview of 
multiscale approaches”). The range of multiscale 
approaches and technical tools will then be illus- 
trated and compared in the context of diffusion, 
Brownian motion, and transport phenomena (see the
section “Summary: the exemplary case of
diffusion”).


Multiple-Scale Method: Principles 


Context: Singular Perturbations and Secular 
Divergences 


Multiple-scale methods have been developed to
handle situations in which the dynamics involves a
small parameter $\epsilon$ (e.g., the ratio of the masses of
different subsystems, the strength of an additional
interaction, the amplitude of an applied field)
directly controlling the separation between the
different characteristic timescales of the evolution
and, specifically, such that the behavior for $\epsilon = 0$ is
qualitatively different from the behavior for $\epsilon$ small
($\epsilon \ll 1$ but finite); in other words, when a weak
influence, of strength controlled by $\epsilon \ll 1$, does not
have only weak consequences. Typically, this occurs
when $\epsilon$ represents the strength of a weak coupling
between otherwise independent subsystems or when
a vanishing value $\epsilon = 0$ changes a characteristic time,
the sign of a friction coefficient, the order of the
highest time derivative in the case of ordinary differential
equations (turning points), or the type of
partial differential equations in the case of spatially
extended systems. Accordingly, a naive perturbative
approach with respect to $\epsilon$, that is, an expansion
taking as a basic approximation the behavior for
$\epsilon = 0$, cannot bridge the qualitative gap with
behaviors observed for $\epsilon > 0$. It thus fails to give a
full account of the system evolution at all times: one
speaks of singular perturbation.

A historical example arose in celestial mechanics, in the celebrated nonintegrable three-body problem, involving the Sun, a big planet and a smaller one, of respective masses $m_1$, $m_2 < m_1$ and $m_3 \ll m_2$. The straightforward approach would be to consider the presence of the small planet as a small perturbation of the integrable two-body problem for the masses $m_1$ and $m_2$. But when one tries to determine the solution as a series in powers of the mass ratio $\epsilon = m_3/m_2$, unbounded terms appear, the so-called secular terms, increasing without bounds as fast as $t$, hence of ill-defined order and impairing the very consistency of the perturbation approach at long times $t > 1/\epsilon$. Accordingly, the perturbation expansion is not uniformly convergent in time, which prevents using it to investigate asymptotics and determine the fate of the three-body system: the influence of the small planet on the motion of the bigger one, although seemingly a weak perturbation, might ultimately modify its trajectory around the Sun, at least in some resonant cases.

The origin of secular terms lies in a phenomenon of resonance, which is best explained on an example: the Duffing oscillator $\ddot{x} + x = -\epsilon x^3$ with $\epsilon \ll 1$. When looking for a solution in the form $x(t) = \sum_n \epsilon^n x_n(t)$, each component $x_n(t)$ has to be bounded in order to get a consistent perturbation expansion, in which the hierarchy of terms of different orders remains valid forever: $\epsilon x_{n+1}(t) \ll x_n(t)$. These components should satisfy the following sequence of equations:

$$\ddot{x}_0 + x_0 = 0, \qquad \ddot{x}_1 + x_1 = -x_0^3 \qquad (\text{linearized operator } Lx = \ddot{x} + x) \qquad [1]$$

It gives $x_0(t) = a e^{it} + \text{c.c.}$, from which follows a secular contribution $(3i/2)\, a|a|^2\, t e^{it}$ in $x_1(t)$. In general, solving perturbatively $\dot{z} = f(z, \epsilon)$ for an expansion $z(\epsilon, t) = \sum_n \epsilon^n z_n(t)$ yields a hierarchical sequence of equations of the form $\dot{z}_n = L z_n + Y_n(z_0, z_1, \ldots, z_{n-1})$ for $n \ge 1$, where $L = Df(z_0, \epsilon = 0)$ comes from the linearization in $z_0$ of the unperturbed evolution law. A secular divergence arises in $z_n$ as soon as $Y_n$ contains an additive contribution which is an eigenvector of $L$ (part of a mathematical result known as the Fredholm alternative). The appearance of secular terms reflects a singular feature of the dynamics: the fact that the limits as $\epsilon \to 0$ and $t \to \infty$ do not commute. As a rule, such noninversion is associated with generalized secular divergences: the fast, short-term dynamics finally contributes to the slow, long-term behavior. This feature is a clue towards using the multiple-scale method.
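The growth of the secular term can be checked numerically. The sketch below is only an illustration under stated assumptions: it integrates the Duffing oscillator with a hand-written Runge-Kutta scheme (the function names, step size, and parameter values are choices made here, not taken from the text), and compares the result with the naive first-order expansion $x_0 + \epsilon x_1$, where $x_1$ is the solution of eqn [1] for $x(0) = A$, $\dot{x}(0) = 0$, and with the leading-order frequency-shifted solution $A\cos[(1 + 3\epsilon A^2/8)t]$ obtained when the secular term is cancelled.

```python
import math

def rk4_duffing(eps, amp, t_end, dt=0.01):
    """Integrate x'' + x = -eps*x**3 with x(0)=amp, x'(0)=0 (classical RK4)."""
    def rhs(x, v):
        return v, -x - eps * x**3
    n = int(round(t_end / dt))
    x, v = amp, 0.0
    ts, xs = [0.0], [x]
    for i in range(n):
        k1x, k1v = rhs(x, v)
        k2x, k2v = rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = rhs(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        ts.append((i + 1) * dt)
        xs.append(x)
    return ts, xs

def naive_first_order(eps, amp, t):
    """x0 + eps*x1 from eqn [1]; the term proportional to t*sin(t) is secular."""
    x0 = amp * math.cos(t)
    x1 = (-(3 * amp**3 / 8) * t * math.sin(t)
          + (amp**3 / 32) * (math.cos(3 * t) - math.cos(t)))
    return x0 + eps * x1

def frequency_shifted(eps, amp, t):
    """Leading-order solution once the secular term is removed (frequency shift)."""
    return amp * math.cos((1 + 3 * eps * amp**2 / 8) * t)

def max_errors(eps=0.1, amp=1.0, t_end=60.0):
    """Maximal deviation from the numerical solution over [0, t_end]."""
    ts, xs = rk4_duffing(eps, amp, t_end)
    err_naive = max(abs(naive_first_order(eps, amp, t) - x) for t, x in zip(ts, xs))
    err_shift = max(abs(frequency_shifted(eps, amp, t) - x) for t, x in zip(ts, xs))
    return err_naive, err_shift

if __name__ == "__main__":
    print(max_errors())
```

With these illustrative values the time window extends to $t = 6/\epsilon$: the naive expansion drifts away from the numerical solution by an amount of order one, whereas the frequency-shifted solution remains close to it.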


Technical Principles 


The first step is to perform rescalings leading to dimensionless variables and functions, which bring out a small control parameter $\epsilon$, related to scale separation and providing a natural parameter for a perturbation approach. The basic principle of the multiple-scale method is to introduce additional independent time variables $t_1, t_2, \ldots, t_n$ such that the physical situation corresponds in this extended time-variable space to the line

$$t_0 = t, \quad t_1 = \epsilon t, \quad t_2 = \epsilon^2 t, \ \ldots; \qquad \frac{d}{dt} = \frac{\partial}{\partial t_0} + \epsilon \frac{\partial}{\partial t_1} + \epsilon^2 \frac{\partial}{\partial t_2} + \cdots \qquad [2]$$


It thus amounts to a perturbation expansion of the time-derivative operator. This method can be traced back to the Lindstedt-Poincaré technique, where the time variable $t$ is expanded according to $t = s(1 + \epsilon\omega_1 + \epsilon^2\omega_2 + \cdots)$ and the evolution described in terms of the new variable $s$ and unknown frequencies $(\omega_i)_{i \ge 1}$ to be determined self-consistently (Nayfeh 1973). By contrast, the multiple-scale approach puts on a par $t_0 = t$ and the additional variables $(t_i)_{i \ge 1}$. The perturbation approach is then carried out as usual, plugging eqn [2] for $d/dt$ and the expansion $z(\epsilon, t) = \sum_{n \ge 0} \epsilon^n z_n(t_0, t_1, t_2, \ldots)$ into the evolution equation and identifying term-wise the coefficients of the successive powers of $\epsilon$. The additional freedom thus introduced when considering $(t_i)_{i \ge 0}$ as independent variables will be compensated in the course of the computation, by imposing “solubility conditions” ensuring the vanishing of secular terms and the consistency of the perturbation method. In particular, it is possible to freely choose boundary conditions outside the physical line $t_1 = \epsilon t_0, \ldots, t_n = \epsilon^n t_0$. The resulting set of equations contains exactly the same information as the original one, only expressed in a different way: by construction, terms depending, say, on $t_0$, describe a fast component with no emerging slow trends that would intermix with the $t_1$-dependence; fast variables contribute only to fast modes. At the end, one restricts to the physical line, thus turning back to the single “real” variable $t$. The benefit of the method is to provide a joint access to dependences at different scales, now expressed as dependences on the different time variables $t_0, t_1, \ldots, t_n$. One introduces as many new variables as necessary to circumvent secular divergences. We have implicitly supposed above that the behavior at timescale $\Delta t = O(1)$ corresponds to the fastest timescale of the evolution. If it were not the case, the rescaled time variables would be $t_0 = t/\epsilon$, $t_1 = \epsilon t_0 = t$, ... if the fastest timescale is $\Delta t = O(\epsilon)$. More general time-derivative expansions, associated with rescaled variables $t_n = \epsilon^{\alpha_n} t$, might be considered to better account for the hierarchy of characteristic timescales of the dynamics.


Multiple-Scale Method: Abstract Examples 


Let us first consider the simplest possible example $\dot{x} = a(1 + \epsilon)x$, for which the exact solution is trivially known, making it possible to appreciate the validity of the multiscale approach compared to the straightforward perturbation expansion. In the latter case, one looks for a solution $x(t) = x_0(t) + \epsilon x_1(t) + O(\epsilon^2)$ and identifies term-wise the powers of $\epsilon$. At order 0, $\dot{x}_0 = a x_0$ yields $x_0(t) = c_0 e^{at}$. At order 1, $\dot{x}_1 - a x_1 = a x_0(t)$ leads to a secular divergence: $x_1(t) = c_1 e^{at} + c_0\, a t\, e^{at}$. Carrying on the perturbation analysis yields the following expansion:

$$x(t) = c\, e^{at}\left(1 + \epsilon a t + \frac{\epsilon^2 a^2 t^2}{2} + \cdots\right) \qquad [3]$$

which is not uniformly convergent: for $t = O(1/\epsilon)$, all terms are of the same magnitude. Using this recursive method to obtain a finite-order approximate solution (e.g., stopping, as here, after two steps of the perturbation method) is only relevant at short times $t \ll 1/\epsilon$. The straightforward perturbation analysis captures the behavior of the exact solution only if all terms are computed and taken into account (in less trivial examples, the straightforward perturbation series might even be divergent). In the multiple-scale approach, one introduces two rescaled variables $t_0 = t$ and $t_1 = \epsilon t$ and looks for a solution of the form $x(t) = x_0(t_0, t_1, \ldots) + \epsilon x_1(t_0, t_1, \ldots) + O(\epsilon^2)$. At order 0, $\partial_{t_0} x_0 = a x_0$ yields $x_0(t_0, t_1, \ldots) = c_0(t_1, \ldots)\, e^{a t_0}$. At order 1, we get $\partial_{t_0} x_1 + \partial_{t_1} x_0 = a x_0 + a x_1$. The solubility condition reads $a c_0 - \partial_{t_1} c_0 = 0$, which allows us to avoid secular divergence and suppresses the artificial freedom introduced with the additional time variable $t_1$, yielding $c_0 = c\, e^{a t_1}$. The equation $(\partial_{t_0} - a) x_1 = 0$ is here superfluous, but in less simple situations, it remains at this stage a nontrivial equation for $x_1$. One thus directly gets the solution, uniformly valid at all times:

$$x(t) = c\, e^{a t_1} e^{a t_0} = c\, e^{a(1 + \epsilon)t} \qquad [4]$$

As a rule in singular perturbation methods, the difficulty here originates in the noncommuting limits $\epsilon \to 0$ and $t \to \infty$; indeed, denoting $y_\epsilon(t) = x_\epsilon(t)\, e^{-at}$, one has (for $a > 0$) $\lim_{t \to \infty} \lim_{\epsilon \to 0^+} y_\epsilon(t) = c$, whereas $\lim_{\epsilon \to 0^+} \lim_{t \to \infty} y_\epsilon(t) = \infty$.
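The loss of uniformity of the truncated expansion can be made concrete with a few lines of arithmetic. The following sketch is illustrative only (function names and parameter values are assumptions made here): it evaluates the exact solution of $\dot{x} = a(1 + \epsilon)x$, the straightforward expansion truncated at a given order, and the multiple-scale result written as a product of a fast and a slow factor.

```python
import math

def exact(c, a, eps, t):
    """Exact solution of x' = a*(1 + eps)*x with x(0) = c."""
    return c * math.exp(a * (1 + eps) * t)

def naive_truncated(c, a, eps, t, order=1):
    """Straightforward expansion c*e^{at}*sum_k (eps*a*t)^k/k!, cut at 'order'."""
    partial = sum((eps * a * t) ** k / math.factorial(k) for k in range(order + 1))
    return c * math.exp(a * t) * partial

def multiple_scale(c, a, eps, t):
    """Multiple-scale result c*e^{a*t1}*e^{a*t0} restricted to t0 = t, t1 = eps*t."""
    t0, t1 = t, eps * t
    return c * math.exp(a * t1) * math.exp(a * t0)

def relative_error(approx, reference):
    return abs(approx - reference) / abs(reference)

if __name__ == "__main__":
    c, a, eps = 1.0, 1.0, 0.01
    for t in (1.0, 1.0 / eps, 5.0 / eps):
        ref = exact(c, a, eps, t)
        print(t,
              relative_error(naive_truncated(c, a, eps, t), ref),
              relative_error(multiple_scale(c, a, eps, t), ref))
```

For $t \ll 1/\epsilon$ the two-term expansion is accurate, while at $t = 1/\epsilon$ its relative error is $1 - 2/e$, about 26%, and it keeps growing afterwards; the multiple-scale expression coincides with the exact solution up to rounding.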

Other training examples are the weakly damped linear oscillator $\ddot{x} + x = -2\epsilon\dot{x}$, solved with multiple scales $t_0 = t$, $t_1 = \epsilon t$, $t_2 = \epsilon^2 t$, or with the more specific variables $\theta = \sqrt{1 - \epsilon^2}\, t$, $\tau = \epsilon t$; the Duffing oscillator $\ddot{x} + x = -\epsilon x^3$ introduced above, whose multiple-scale resolution requires three variables $t_0 = t$, $t_1 = \epsilon t$, $t_2 = \epsilon^2 t$; and the Van der Pol oscillator $\ddot{x} + x = \epsilon(1 - x^2)\dot{x}$.
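As an illustration of the last training example, the sketch below integrates the Van der Pol oscillator numerically and compares the radius $\sqrt{x^2 + \dot{x}^2}$ with the slowly varying amplitude $A(t_1)$ predicted at leading order. The amplitude equation $dA/dt_1 = (A/2)(1 - A^2/4)$ used here is the standard textbook outcome of the solubility condition; it is not derived in the text above and is quoted as an assumption, as are the function names and parameter values.

```python
import math

def simulate_van_der_pol(eps, x0, t_end, dt=0.01):
    """RK4 integration of x'' + x = eps*(1 - x**2)*x' with x(0)=x0, x'(0)=0."""
    def rhs(x, v):
        return v, -x + eps * (1 - x * x) * v
    n = int(round(t_end / dt))
    x, v = x0, 0.0
    ts, xs, vs = [0.0], [x], [v]
    for i in range(n):
        k1x, k1v = rhs(x, v)
        k2x, k2v = rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = rhs(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        ts.append((i + 1) * dt)
        xs.append(x)
        vs.append(v)
    return ts, xs, vs

def envelope(eps, a0, t):
    """Solution of dA/dt1 = (A/2)*(1 - A**2/4) with A(0)=a0, evaluated at t1 = eps*t."""
    return 2.0 / math.sqrt(1.0 + (4.0 / a0**2 - 1.0) * math.exp(-eps * t))

def max_envelope_mismatch(eps=0.1, x0=0.5, t_end=100.0):
    """Largest gap between the numerical radius and the predicted amplitude."""
    ts, xs, vs = simulate_van_der_pol(eps, x0, t_end)
    return max(abs(math.hypot(x, v) - envelope(eps, x0, t))
               for t, x, v in zip(ts, xs, vs))

if __name__ == "__main__":
    print(max_envelope_mismatch())
```

The amplitude grows on the slow timescale $1/\epsilon$ and saturates near 2, the radius of the limit cycle, while the numerical radius stays within a distance of order $\epsilon$ of the predicted envelope.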


An Illustration: Classical Lorentz Electron Gas in a Weak Field


As a less abstract, hence more convincing, illustration of the strength of the multiple-scale method, let us consider the dynamics of a classical Lorentz electron gas acted upon by an external electric field (associated acceleration $\mathbf{a}$). This model considers the electrons as charged hard spheres whose motion results from the superimposition of a driven classical motion in the field and elastic collisions on immobile scatterers (the atoms). It is implemented within a kinetic-theoretic framework, based upon a Boltzmann-like equation for the electron velocity distribution:

$$\left(\frac{\partial}{\partial t} + \mathbf{a} \cdot \frac{\partial}{\partial \mathbf{v}}\right) f(\mathbf{v}, t) = -\frac{v}{\lambda}\, \mathcal{O} f(\mathbf{v}, t) \qquad [5]$$

where $v = |\mathbf{v}|$, and $\lambda$ is the mean free path of the electrons. $\mathcal{O} f = f - f_{sph}$ is a projector accounting for the effect of collisions through the deviation of the distribution $f$ from spherical symmetry, namely through the discrepancy between $f$ and its isotropic counterpart $f_{sph}(v) = (1/4\pi) \int f(\mathbf{v}, t)\, d\hat{\mathbf{v}}$ obtained as an average over the velocity directions $\hat{\mathbf{v}}$. The relevant small parameter is $\epsilon = ma\lambda/kT$, measuring the ratio of the work $ma\lambda$ done by the field over the mean free path to the thermal energy $kT$ in the initial state. The condition $\epsilon \ll 1$ ensures the separation of the characteristic timescales of the two mechanisms experienced by an electron: the thermal motion and the field-induced deterministic motion. Denoting by $v_{th} = \sqrt{kT/m}$ the thermal velocity of the electrons, we have indeed $\epsilon = (t_{th}/t_{acc})^2$, where $t_{th} = \lambda/v_{th}$ is the mean time between two successive collisions with the scatterers and $t_{acc} = \sqrt{\lambda/a}$ is the acceleration time required for the field to move the electron over the mean free path $\lambda$ starting from rest. The result of the plain weak-field expansion is to evidence its own failure: it shows that the perturbation is singular insofar as the asymptotic state will be fully dominated by the field, with no memory of the initial temperature. The multiple-scale method is here implemented with respect to the time variable, introducing new independent variables $(\tau_i)_{i \ge 0}$ such that the physical situation corresponds to the line

$$\tau_0 = t\, v_{th}/\lambda, \quad \tau_1 = \epsilon \tau_0, \quad \tau_2 = \epsilon^2 \tau_0, \ \ldots, \ \tau_n = \epsilon^n \tau_0 \qquad (\epsilon = ma\lambda/kT) \qquad [6]$$


The time-derivative expansion [2] is supplemented with an expansion of the velocity distribution:

$$f(\mathbf{v}, t) = \sum_{i \ge 0} \epsilon^i f^{(i)}(\mathbf{v}; \tau_0, \tau_1, \ldots) \qquad [7]$$

The procedure is conducted as described in the general case. Identifying term-wise the coefficients of the expansion yields a hierarchy of equations for the $(f^{(i)})_{i \ge 0}$, each supplemented with a solubility condition preventing the appearance of secular divergences. A detailed presentation can be found in Piasecki (1993). The benefit of the multiple-scale method is to yield jointly the different stages of the gas evolution, starting from thermal equilibrium and switching on the field at $t = 0$:


• at times $\tau = O(1)$, an initial transient with a drift velocity $\langle v_x \rangle(t) = at - C_1 a t^2 v_{th}/\lambda + \cdots$ in the direction of the applied field (denoting $C_1$ some numerical constant);

• at times $\tau = O(1/\epsilon)$, a linear-response regime with a steady drift velocity $\langle v_x \rangle \sim a\lambda/v_{th}$; and

• at times $\tau = O(1/\epsilon^2)$, a long-time field-dominated heating of the gas, where the velocity distribution is no longer Maxwellian, and the kinetic energy of the electrons grows without bounds as $t^{2/3}$, whereas the drift velocity slowly vanishes asymptotically: $\langle v_x \rangle \sim (\lambda^2 a/t)^{1/3}$.
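To get a feeling for the orders of magnitude, the timescales entering this illustration can be evaluated for sample values. The sketch below is only indicative: the field strength, mean free path and temperature are hypothetical numbers chosen here for the example, and the function name is an assumption. It checks the identity $\epsilon = (t_{th}/t_{acc})^2$ and lists the physical times at which the three regimes $\tau = O(1)$, $O(1/\epsilon)$ and $O(1/\epsilon^2)$ set in.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant (J/K)
M_E = 9.1093837015e-31   # electron mass (kg)
Q_E = 1.602176634e-19    # elementary charge (C)

def lorentz_gas_scales(field, mean_free_path, temperature):
    """Timescales of the Lorentz electron gas for given (hypothetical) inputs.

    field: electric field strength (V/m); mean_free_path: lambda (m);
    temperature: initial temperature (K).
    """
    a = Q_E * field / M_E                        # acceleration due to the field
    v_th = math.sqrt(K_B * temperature / M_E)    # thermal velocity
    t_th = mean_free_path / v_th                 # mean time between collisions
    t_acc = math.sqrt(mean_free_path / a)        # acceleration time over lambda
    eps = M_E * a * mean_free_path / (K_B * temperature)
    return {
        "eps": eps,
        "t_th": t_th,
        "t_acc": t_acc,
        "t_transient": t_th,              # tau = O(1)
        "t_linear_response": t_th / eps,  # tau = O(1/eps)
        "t_heating": t_th / eps**2,       # tau = O(1/eps^2)
    }

if __name__ == "__main__":
    # Hypothetical values: 100 V/m, lambda = 0.1 micrometre, 300 K.
    for name, value in lorentz_gas_scales(100.0, 1e-7, 300.0).items():
        print(name, value)
```

For these sample values $\epsilon$ is of order $10^{-4}$, so the three regimes are separated by several orders of magnitude in time.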


Domains of Application of the Multiple-Scale 
Method 


The multiple-scale method was first developed in nonlinear mechanics. It is fruitful, and even required, in any instance where the plain perturbation expansion is not uniformly convergent, and more generally when it is necessary to account jointly for variations at different timescales: resonant wave interactions, for example, in plasmas, or oscillations with slowly varying coefficients. The multiple-timescale method was applied, around 1960, to derive kinetic equations (closed equations for the one-particle distribution) from molecular dynamics (Liouville equation) for dilute gases and plasmas, or to establish a microscopic theory of Brownian motion from the molecular dynamics of a hard-sphere system (see the section “Microscopic theory of Brownian motion”). In the same spirit, it makes it possible to relate constructively different mesoscopic descriptions, for example, in the case of Brownian motion, to relate the Kramers equation for the distribution $P(\mathbf{r}, \mathbf{v}, t)$ to the Smoluchowski equation for $P(\mathbf{r}, t)$ (see the section “Mesoscopic theory of Brownian motion”). Other examples are the determination of transport coefficients (friction, viscosity) from a kinetic description or, at macroscopic scale, the determination of eddy viscosity and eddy diffusivity (see the section “Effective diffusivity for a passively advected scalar”). A last domain of application concerns systems where relaxation processes at different scales superimpose, so that different time dependences have to be handled jointly. The multiple-scale method then displays the physics of the relaxation process and its associated hierarchical structure (e.g., the application to the adiabatic piston problem discussed in this Encyclopedia by Gruber and Lesne; see Adiabatic Piston; see also the section “Some typical applications”).


A Brief Overview of Multiscale Approaches


Different Scales and Regimes 


Common to all multiscale approaches is the focus on the very existence of different scales, exploited through the use of rescaled variables, which makes explicit the presence of a small parameter $\epsilon$ controlling the dynamics, responsible for the existence of different timescales and related to the scale separation. Technically, the first, very simple but essential, step is to replace the variables, fields, and parameters by their dimensionless counterparts. In so doing, small parameters reflecting scale separation (in time, space, energies, amplitudes, ...) will naturally appear. Although it is thus possible to estimate the order of the different terms, it should be underlined that this gives no clue about their actual contribution to the long-term behavior: in singular situations, precisely those where multiscale approaches have to be developed, small terms can have a noticeable influence at all scales. As illustrated in the following sections, different rescalings of variables and functions allow us to discriminate features at different scales and to capture different regimes. More specifically, the techniques for handling the joint contributions of several regimes at different timescales depend on the way these regimes intermix. They can be:


• superimposed regimes, when fast and slow dependences intermingle in the evolution of the same variable. This is the framework of multiple-scale analysis. The solution typically reads $x(t, \epsilon t, \epsilon^2 t, \ldots)$; or

• coexisting regimes, namely a coexistence of fast and slow evolutions. One might focus either on the fast evolution and use a quasistatic approximation (or parametric approximation) for the slow evolution, or on the slow evolution and use a quasistationary approximation or an averaging of the fast evolution. The solution typically reads $[X_{fast}(t), X_{slow}(\epsilon t)]$ (or $[X_{fast}(\tau/\epsilon), X_{slow}(\tau)]$ if the observation takes place at long timescales, with a relevant time variable $\tau = \epsilon t$); or

• successive regimes, when initial conditions, bulk behavior and asymptotics are not of the same order with respect to $\epsilon$; this is a boundary-layer-like issue, and the solution typically reads $x_{layer}(t/\epsilon)$ for $0 \le t \le t_0$, then $x_{bulk}(t)$ for $t > t_0$, with $t_0 = O(1)$.




Applications are innumerable; the most typical and most investigated ones are the climate (from “hours” for the observed weather to “thousands of years” for eras), population dynamics, coasts and sand dunes (from “grains” to “country” scales), protein folding (the vibration of covalent bonds occurs at the scale of femtoseconds, while the whole folding may require up to a few seconds), or trading markets (from seconds to years). Let us finally give two typical examples for the parameter $\epsilon$:


• The weak-damping and high-friction limits, best explained on an example. The damped oscillator $m\ddot{x} + \gamma\dot{x} + V'(x) = 0$ appears as a Hamiltonian dynamics $m\ddot{x} + V'(x) = 0$ as soon as the damping can be neglected, when the characteristic time $\theta = [m/V''(0)]^{1/2}$ of the undamped oscillator is far smaller than the damping time $\tau = m/\gamma$. The weak-damping limit is thus defined as $\epsilon \to 0$, where $\epsilon = \theta/\tau = [\gamma^2/mV''(0)]^{1/2}$. It leads to a singular behavior when investigating the asymptotics, as in the Duffing oscillator and weakly damped oscillator mentioned in the last section. On the contrary, the evolution appears as a dissipative gradient dynamics $\dot{x} = -V'(x)/\gamma$ as soon as $\tau \ll \theta$. This leads to the high-friction limit: $\tau/\theta = [mV''(0)/\gamma^2]^{1/2} \to 0$. This example somehow reconciles conservative and dissipative dynamics, showing that they might coexist in the same system.

• The hydrodynamic limit involved in the derivation of hydrodynamic equations (namely incompressible Navier-Stokes equations) from the kinetic Boltzmann equation. It reads $\epsilon = \lambda/L \to 0$, where $\epsilon$ is the so-called Knudsen number, defined as the ratio of the mean free path $\lambda$ (the average distance traveled by a fluid molecule between two successive collisions) to a characteristic spatial scale $L$ of the system (e.g., the size of an obstacle).
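The high-friction limit of the first example can be checked on the harmonic case $V(x) = \kappa x^2/2$, for which everything is explicit. The sketch below is an illustration under that assumption (the quadratic potential, the symbol `curvature` for $V''(0) = \kappa$, and the numerical values are choices made here): it computes the ratio $\theta/\tau$ and compares the slowest decay rate of the full damped oscillator with the rate $\kappa/\gamma$ of the gradient dynamics $\dot{x} = -V'(x)/\gamma$.

```python
import math

def timescale_ratio(m, gamma, curvature):
    """theta/tau with theta = sqrt(m/V''(0)) and tau = m/gamma."""
    theta = math.sqrt(m / curvature)
    tau = m / gamma
    return theta / tau

def slow_decay_rate(m, gamma, curvature):
    """Slowest decay rate of m*x'' + gamma*x' + curvature*x = 0 (overdamped case)."""
    disc = gamma**2 - 4 * m * curvature
    if disc < 0:
        raise ValueError("underdamped: the high-friction picture does not apply")
    return (gamma - math.sqrt(disc)) / (2 * m)

def gradient_decay_rate(gamma, curvature):
    """Decay rate of the gradient dynamics x' = -curvature*x/gamma."""
    return curvature / gamma

if __name__ == "__main__":
    m, gamma, curvature = 1.0, 20.0, 1.0
    print("theta/tau =", timescale_ratio(m, gamma, curvature))
    print("full dynamics, slow rate =", slow_decay_rate(m, gamma, curvature))
    print("gradient dynamics, rate  =", gradient_decay_rate(gamma, curvature))
```

With $\tau/\theta = 0.05$ the two rates differ by a fraction of a percent, the relative discrepancy being of order $(\tau/\theta)^2$.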


Bridging the Scales: Mean-Field, Singular 
and Scaling Approaches 


The aim of multiscale approaches is to bridge 
different scales, through the determination of the 
large-scale behavior of the solution, or by establish- 
ing a constructive relation between the initial model 
and an effective model at higher scale. We have 
mentioned in the introduction a first classification of 
multiscale systems and associated approaches: they 
might exhibit (1) scale decoupling, (2) some singu- 
larity in the relation between the different scales, or 
(3) scale invariance. 


Mean-field approaches. In case of scale decoupling, mean-field approaches apply. Let us briefly recall, within its usual spatial formulation, that a mean-field approach amounts to identifying the local environment, which is a priori fluctuating and spatially inhomogeneous (e.g., the local magnetic field generated by neighboring spins in a spin lattice model) with the average one, expressed as a function of the average order parameter (spatial average or equivalently a statistical average in the limit as the system size tends to infinity). Mean-field approaches can be implemented in time (averaging), in real space (homogenization, coarse-graining), or in phase space (aggregation and projection techniques).

In the present context, the best example of a mean-field approach is provided by homogenization procedures. They can be traced back to the method of Lagrange to solve the three-body problem. The issue is to describe the motion of a light body $B_2$ experiencing the gravitational attraction of the Sun and a heavier body $B_1$. The mass of $B_2$ is supposed to be small enough to neglect its influence on the Sun and $B_1$ (the so-called restricted three-body problem); $B_1$ will thus obey the Keplerian laws of motion. The method of Lagrange applies when $B_2$ is far more distant from the Sun than $B_1$ ($r_2 \gg r_1$), which implies (due to the third law of Kepler: $\omega^2 r^3 = \text{const.}$) that the angular velocity $\omega_1$ of $B_1$ is far larger than $\omega_2$: the large body $B_1$ moves faster than $B_2$ around the Sun. In first approximation, Lagrange replaced the rapidly oscillating influence of $B_1$ on the motion of $B_2$ by the influence of a constant distribution of mass, obtained by spreading the mass $m_1$ of $B_1$ all over its orbit. The Gauss theorem thus states that this influence can be accounted for by simply adding the total mass of this distribution to the mass of the Sun. The stability of the system would follow: $B_2$ will remain trapped in the neighborhood of the pair composed of the Sun and $B_1$.


Singular perturbations. A typical instance of singular multiscale behavior is associated with asymptotic expansions

$$x(t) = \sum_{r=0}^{n-1} \epsilon^r x_r(t) + R_n(\epsilon, t) \qquad [8]$$

which are not convergent: $\lim_{n \to \infty} R_n(\epsilon, t) \neq 0$ at fixed $\epsilon$, but $\lim_{\epsilon \to 0} \epsilon^{-(n-1)} R_n(\epsilon, t) = 0$ at fixed $n$ and $t$. Asymptotic expansions are ubiquitous in multiscale approaches: the coexistence of different timescales, superimposed and nontrivially coupled to give rise to the observed phenomenon, prevents one from obtaining uniformly convergent perturbative expansions; it is only in this latter regular case that the above-mentioned mean-field approaches and homogenization techniques apply.


Scale invariance, scaling theories and renormalization. Self-similarity and associated criticality prevent scale decoupling, but allow us to develop scaling theories and renormalization methods. In contrast to scale-separation arguments, the guiding principle is now to focus on the links relating one scale to the others (scaling transformations, renormalization transformations). The problem complexity is thus reduced in a somewhat “transverse” way, by retaining only scale-invariant features. We shall present in the section “Renormalization: an iterated multiscale approach” further links between multiscale approaches and renormalization methods, beyond the restricted scope of scale-invariant systems: in many instances, renormalization can be seen as an iterated multiscale approach.


Scaling Limits 


Let us mention a specific instance of multiscale approach, which is associated with scaling limits. Scaling limit refers to a joint limiting procedure, in which several independent variables jointly converge towards given limits, with prescribed relative behaviors; this latter condition is a key point in the frequent case when the different limits do not commute, and we shall see later that it is an essential ingredient of renormalization methods. Let us cite two acknowledged examples:


• The thermodynamic limit for a system of $N$ particles in a volume $V$; it amounts to letting $N \to \infty$, $V \to \infty$, while $N/V = n = \text{const.}$ (constant average number density). It is a prerequisite to derive standard thermodynamic behavior from the statistical-mechanical description; it supports the use of asymptotic results given by the law of large numbers and the central-limit theorem provided the correlations between the particles remain short-range.

• The Boltzmann-Grad limit for a system of $n$ hard spheres of radius $\epsilon$ per unit volume. In dimension $d$, it reads $\epsilon \to 0$, $n \to \infty$ (thus differing from the thermodynamic limit) while $n\epsilon^{d-1} = z$ remains constant. This limit is involved in kinetic theory as a limiting instance where the Boltzmann ansatz applies (identifying the two-particle distribution function with the product of the corresponding one-particle distributions). Indeed, the occupied volume fraction, of order $n\epsilon^d = z\epsilon$, tends to 0 so that recollisions and ensuing long-term correlations can be neglected (rarefied gas). On the other hand, the mean free path of a particle remains finite, so that numerous collisions and associated molecular chaos further support the Boltzmann decorrelation ansatz.
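The two competing statements of the Boltzmann-Grad limit (vanishing volume fraction, finite mean free path) can be followed numerically along a sequence of decreasing radii. The sketch below is restricted to $d = 3$ and relies on assumptions made here: the mean free path is estimated by the elementary formula $1/(n\sigma)$ with collision cross-section $\sigma = \pi(2\epsilon)^2$, numerical prefactors of order one being ignored, and the function name and the value of $z$ are arbitrary.

```python
import math

def boltzmann_grad_sequence(z, radii):
    """For each radius eps, keep n*eps**2 = z fixed (d = 3) and report
    the number density, the occupied volume fraction and the mean free path."""
    rows = []
    for eps in radii:
        n = z / eps**2                                    # density grows as eps -> 0
        volume_fraction = n * (4.0 / 3.0) * math.pi * eps**3
        cross_section = math.pi * (2.0 * eps) ** 2        # hard-sphere cross-section
        mean_free_path = 1.0 / (n * cross_section)        # elementary estimate
        rows.append((eps, n, volume_fraction, mean_free_path))
    return rows

if __name__ == "__main__":
    for row in boltzmann_grad_sequence(z=0.1, radii=[1e-1, 1e-2, 1e-3, 1e-4]):
        print("eps=%.0e  n=%.3e  volume fraction=%.3e  mean free path=%.4f" % row)
```

Along the sequence the density diverges and the volume fraction decreases linearly with $\epsilon$, while the mean free path keeps the constant value $1/(4\pi z)$.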


Stochastic Multiscale Approaches 


Multiscale approaches are far less developed for stochastic processes. Let us mention the case of a Markov process. Scale separation is reflected in a spectral gap in the transition matrix generating the dynamics. Identification of fast and slow modes is then straightforward: slow modes are associated with quasidegenerate eigenvalues ($\lambda \approx 0$ in a time-continuous setting), whereas fast dynamics is associated with damped modes and negative eigenvalues ($\lambda < 0$, $|\lambda| \gg 1$) (Gaveau et al. 1999). A basic difficulty in extending methods developed in a deterministic context is the fact that the reduction (or projection) of a Markov process is a priori no longer Markovian. Closure relations and approximations should be introduced to circumvent memory effects, for example, supported by arguments of decorrelation and ensuing fast temporal self-averaging of the fast dynamics.

It should be noted that the behavior upon rescaling of a stochastic process differs from the transformation of a deterministic evolution. The basic relation is the scaling, upon a time rescaling $\theta = \epsilon t$, of the white noise involved in stochastic differential equations and defined from the Wiener process $W(t)$ through the relation $dW(t) = \eta(t)\, dt$. It follows from the definition of $W$, now considered as a function of the rescaled time, that $dW(\theta) = \sqrt{\epsilon}\, dW(t)$. At this point, it is important to notice the difference with respect to the behavior of a plain deterministic function $f$, for which $df(\theta) = \epsilon\, df(t)$. Using the fact that $\delta(t) = \epsilon\, \delta(\theta)$ and the definition $dW(\theta) = \eta(\theta)\, d\theta$, we obtain that $\eta(\theta)$ is a white noise with respect to the rescaled time $\theta$, that is, a stationary Gaussian process defined by its first two moments

$$\langle \eta(\theta) \rangle = 0, \qquad \langle \eta(\theta)\, \eta(\theta') \rangle = \delta(\theta - \theta') \qquad [9]$$
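The different scaling of stochastic and deterministic increments can be observed on a simulated path. The following sketch is a rough numerical illustration, not part of the original argument: it builds a discretized Wiener path from independent Gaussian steps (step size, sample sizes, seed and function names are assumptions made here), measures the standard deviation of its increments over a time lag and over the same lag multiplied by $\epsilon$, and compares the ratio with the corresponding ratio for the increments of a smooth function.

```python
import math
import random

def wiener_path(n_steps, h, seed=0):
    """Discretized Wiener path: cumulative sum of independent N(0, h) steps."""
    rng = random.Random(seed)
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(h)))
    return path

def increment_std(path, lag):
    """Standard deviation of non-overlapping increments over 'lag' steps."""
    incs = [path[i + lag] - path[i] for i in range(0, len(path) - lag, lag)]
    mean = sum(incs) / len(incs)
    return math.sqrt(sum((x - mean) ** 2 for x in incs) / (len(incs) - 1))

def stochastic_ratio(eps=0.25, lag=100, n_steps=200000, h=1e-3):
    """Ratio of increment sizes over the lags eps*lag and lag; expected sqrt(eps)."""
    path = wiener_path(n_steps, h)
    return increment_std(path, int(round(eps * lag))) / increment_std(path, lag)

def deterministic_ratio(eps=0.25, t0=0.3, dt=1e-3, f=math.sin):
    """Same ratio for a smooth function; expected eps up to O(dt) corrections."""
    return (f(t0 + eps * dt) - f(t0)) / (f(t0 + dt) - f(t0))

if __name__ == "__main__":
    print("Wiener increments :", stochastic_ratio(), "vs sqrt(eps) =", math.sqrt(0.25))
    print("smooth increments :", deterministic_ratio(), "vs eps =", 0.25)
```

The Wiener increments shrink as $\sqrt{\epsilon}$ when the time lag is multiplied by $\epsilon$, up to sampling fluctuations, whereas the increments of the smooth function shrink as $\epsilon$.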


Slow/Fast Variables 
Slow/Fast Decomposition 


Dynamics of systems made of many interacting 
elements, for example, chemical reactions, or popu- 
lation dynamics, typically involves far too many 
degrees of freedom to be handled at the level of 
individual units, and requires a drastic reduction to 
make sense of it. A natural way of reduction is based 
upon the phenomenology, taking as relevant degrees 
of freedom those describing the slow evolution 
observed at macroscopic scales. Scale separation 
between microscopic and macroscopic worlds has to 
be turned into a constructive and quantitative 
argument to achieve this reduction.
Solving this typical multiscale issue first requires identifying and constructing explicitly the slow variables, for example, collective variables obtained through aggregation or coarse-graining. The second step is to eliminate, or rather integrate, the fast dynamics into a closed system of effective equations describing the large-scale evolution. The closure requirement generically involves an approximation, neglecting the remaining dynamic coupling between fast and slow variables. It is precisely here that scale-separation arguments and the very choice of the slow variables are crucial, ensuring that the influence of fast dynamics is essentially accounted for in its effective or average contribution to the slow dynamics; remaining fluctuating influences can be either neglected or included in a noise term, required to be fully determined as a function of the slow variable only (otherwise the whole procedure would be neither consistent nor useful). In the following subsections, we shall briefly present the main techniques for achieving this program, considering the simple abstract system:

$$\frac{dX}{dt} = f(X, Y), \qquad \frac{dY}{dt} = \epsilon\, g(X, Y) \qquad (\epsilon \ll 1) \qquad [10]$$

Although involving only two variables for simplicity, it exhibits the typical multiscale structure: whereas $X$ varies on scales $O(1)$, $Y$ appears as a slow variable of characteristic timescale $O(1/\epsilon)$.


Parametric Approximation 


The preliminary step of the reduction is to get some knowledge on the fast dynamics, at least to choose the proper multiscale technique. A plain but nevertheless fruitful remark is that a parameter $p$ can always be seen as a variable that does not evolve: $dp/dt = 0$ in a deterministic setting, or $W_{p \to q} = \delta(p - q)$ in a stochastic one (transition probability $W$). Conversely, a slow variable can be transiently treated as a mere parameter in the fast dynamics. Supported by timescale separation, this parametric approximation (or quasistatic approximation) decouples the fast dynamics from the slow variable evolution, investigating the fast dynamics asymptotics ($t \to \infty$) while considering that the slow variable remains constant, $Y(t) = y$. In the following, we shall distinguish two cases: (1) the fast dynamics oscillates with a period $T \ll 1/\epsilon$, and (2) the fast dynamics relaxes to a stable equilibrium point $X^*(y)$ slaved to the slow variable.


Amplitude Equations 


A ubiquitous technique to account for slowly modulated oscillations was first introduced by Fresnel for light propagation and optical phenomena. The basic idea is to take benefit from the scale separation between the fundamental oscillation (frequency $\omega$, wavelength $\lambda = 2\pi/k$) and a superimposed slow variation of the wave amplitude

$$A(\mathbf{r}, t) = \tilde{A}(\mathbf{r}, t)\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}, \qquad K = |\nabla \tilde{A}/\tilde{A}| \ll k, \qquad \Omega = |\partial_t \tilde{A}/\tilde{A}| \ll \omega \qquad [11]$$

The evolution can be rewritten in terms of the slowly varying amplitude $\tilde{A}$; by construction, it is ruled by terms involving the small parameter $\epsilon \sim K/k \sim \Omega/\omega \ll 1$, but the resulting equation is now devoid of small or large parameters. Such a technique has been successfully applied and further developed, for example, in various situations involving electromagnetic waves (e.g., diffraction of Hertzian waves), in plasma physics (resonant interaction between electromagnetic waves and acoustic modes) and in quantum mechanics, to investigate the deformation of a wave packet in a potential.


Averaging 


Let us discuss further, in a general setting, the case when the fast dynamics is an oscillation of period $T$ (either linear modes as in the last subsection or a stable limit cycle). It is a context where averaging techniques apply. We refer to the associated entry in this Encyclopedia by Neishtadt (see the article Averaging Methods) and only mention here the main principle: to exploit scale separation and the self-averaging property of the fast dynamics to replace $X(t)$ by an average value

$$X_{av}(t) = \frac{1}{T} \int_t^{t+T} X(s)\, ds$$

The underlying idea is that averaging cancels out most of the fast variations so that $X_{av}(t)$ is now slowly varying. In case the fast dynamics is influenced by the slow variable $Y$, its value is kept constant in the averaging (see the section “Parametric approximation”). The resulting average behavior $X_{av}[Y(t), t]$ is reinjected in the evolution of the slow component, leading to a closed equation,

$$\frac{dY}{dt} = \epsilon\, g(X_{av}[Y(t), t], Y) \quad \text{or rather} \quad \frac{dY}{d\tau} = g(X_{av}[Y(\tau), \tau], Y) \qquad [12]$$

in terms of the more relevant rescaled time variable $\tau = \epsilon t$, with $Y(\tau) \equiv Y(t)$. Denoting $\bar{Y}(\tau)$ the solution of this approximate equation, and $Y_\epsilon(\tau)$ the exact solution expressed in the rescaled time, the validity of the averaging procedure is assessed by theorems giving conditions ensuring that $\lim_{\epsilon \to 0} Y_\epsilon(\tau) = \bar{Y}(\tau)$. Note that such theorems (quite unusually) state the convergence, for a vanishing value of the perturbation parameter $\epsilon$, of the exact solutions towards the approximate one (solution of the averaged equations).
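The averaging principle can be tried on a toy instance of system [10]. The example below is entirely hypothetical and chosen here for its simplicity: the fast variable is a harmonic oscillation of period $2\pi$ around the value 1 (written as the pair $x$, $p$), and the slow variable obeys $dY/dt = \epsilon(x - Y)$, so that $g$ is linear in the fast variable and replacing $X$ by $X_{av}$ is legitimate. The code computes $X_{av}$ over one period, solves the averaged equation in closed form, and measures its distance to the numerical solution of the full system.

```python
import math

def fast_average(n=2000):
    """X_av = (1/T) * integral over one period T = 2*pi of X(s) = 1 + cos(s)."""
    period = 2.0 * math.pi
    h = period / n
    return sum(1.0 + math.cos(i * h) for i in range(n)) * h / period

def integrate_full(eps, y0=0.0, dt=0.01):
    """RK4 for x' = p, p' = -(x - 1), Y' = eps*(x - Y), up to t = 5/eps."""
    def rhs(s):
        x, p, y = s
        return (p, -(x - 1.0), eps * (x - y))
    state = (2.0, 0.0, y0)  # x(0)=2, p(0)=0 gives x(t) = 1 + cos(t)
    n = int(round(5.0 / eps / dt))
    ts, ys = [0.0], [y0]
    for i in range(n):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        ts.append((i + 1) * dt)
        ys.append(state[2])
    return ts, ys

def averaged_solution(eps, t, y0, x_av):
    """Solution of the averaged equation dY/dtau = x_av - Y at tau = eps*t."""
    return x_av + (y0 - x_av) * math.exp(-eps * t)

def max_mismatch(eps, y0=0.0):
    """Largest gap between the full and the averaged slow evolution."""
    x_av = fast_average()
    ts, ys = integrate_full(eps, y0)
    return max(abs(y - averaged_solution(eps, t, y0, x_av)) for t, y in zip(ts, ys))

if __name__ == "__main__":
    for eps in (0.05, 0.025):
        print(eps, max_mismatch(eps))
```

In this toy case the gap between the exact slow variable and the averaged one is of order $\epsilon$ over the whole window $\tau \le 5$, and it shrinks when $\epsilon$ is reduced, in line with the convergence statement above.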

To conclude, let us notice that one speaks of averaging in a temporal context and of homogenization in spatial or spatio-temporal contexts, when averaging is performed over space; as discussed in the section “Bridging the scales: mean-field, singular and scaling approaches,” averaging and homogenization belong to the general class of mean-field approximations.


Quasistationary Approximation 


Let us now consider the case when the fast dynamics 
converges at fixed Y towards a stable fixed point 
X*(Y). Focusing on the slow dynamics, the relevant 
time variable is t=et, which turns the evolution 
[10] into 


dX dY | 
eS =f(%Y), =X Y) — (3 


(for the sake of simplicity, we use the same notation 
X for both X(t) and X(7)). It is solved in two steps, by 
noticing that at lowest order in e, the fast dynamics 
reduces to the asymptotic regime f (X, Y) = 0, slaved to 
the slow variable Y. The corresponding stable state 
X*(Y) is then plugged into the slow dynamics to get a 
closed equation for Y(7): 
e. - XQ). Y] = GQ) 14 
This achieves the desired dimensional reduction. It 
works equally well when X is a string of variables 
Me [py EN UA 
There is seemingly a paradox here, ubiquitous in 
many multiscale approaches: in order to determine 
the evolution of the slow variable Y, it is considered 
a constant! The solution lies in scale separation: the 
trick is to consider the ensuing approximate decoupling 
as an exact one (which it would be in the limit 
$\epsilon \to 0$). In other words, the constancy of Y is 
considered over a time length which is long at the 
level of fast dynamics ($\Delta t \gg 1$), long enough for X 
to reach its equilibrium state $X^*(Y)$, but short at 
the macroscopic level ($\epsilon \Delta t = \Delta\tau \ll 1$). As in the 
so-called “quasistatic evolutions” encountered in 
thermodynamics, the large-scale evolution will be 
composed of a continued succession of local 
equilibrium states: at each time $\tau$, X takes its 
instantaneous equilibrium value, slaved to $Y(\tau)$. 
Here one speaks equivalently of quasistationary 
approximation, quasisteady-state approximation, or 
adiabatic elimination of fast variables. 
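
The two-step procedure can be tried numerically. The sketch below is only an illustration under stated assumptions: the toy choice $f(X, Y) = Y^2 - X$ and $g(X, Y) = -X$ (so that $X^*(Y) = Y^2$ and $G(Y) = -Y^2$), the value of $\epsilon$, and all function names are hypothetical and do not come from the text. It integrates the full system [13] and the reduced equation [14] in the slow time $\tau$ and compares them.

```python
EPS = 0.01  # scale-separation parameter (assumed value)

def rk4_step(rhs, state, h):
    """Advance a list-valued state by one classical Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(rhs, state, tau_end, h):
    """Integrate from tau = 0 to tau_end with a fixed step h."""
    for _ in range(round(tau_end / h)):
        state = rk4_step(rhs, state, h)
    return state

def full_rhs(state):
    # eps dX/dtau = f(X, Y) = Y^2 - X,  dY/dtau = g(X, Y) = -X  (toy choice)
    x, y = state
    return [(y * y - x) / EPS, -x]

def reduced_rhs(state):
    # dY/dtau = G(Y) = g[X*(Y), Y] = -Y^2 with X*(Y) = Y^2
    (y,) = state
    return [-y * y]

H = EPS / 10  # step small enough to resolve the fast relaxation
x_full, y_full = integrate(full_rhs, [1.0, 1.0], 2.0, H)  # starts on X = X*(Y)
(y_reduced,) = integrate(reduced_rhs, [1.0], 2.0, H)

print("full    Y(2) =", y_full)
print("reduced Y(2) =", y_reduced)
print("slaving defect X - X*(Y) =", x_full - y_full ** 2)
```

With these choices the reduced equation has the exact solution $Y(\tau) = 1/(1 + \tau)$, and the discrepancy with the full dynamics is expected to remain of order $\epsilon$.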


Slow Invariant Manifolds 


In the previous subsections, the decomposition 
between fast variables X and slow variables Y was 
given. But in practice, only the whole dynamics of 
the system is known and a main part of the issue is 
to find and construct explicitly the slow variables. 

A geometrical viewpoint on the dynamics 
appears to be fruitful: if the system evolution is 
to be reducible to the evolution of a few degrees 
of freedom, it means that the flow essentially lives 
in a low-dimensional region of the phase space, 
which can be parametrized by these degrees of 
freedom up to some fuzziness of order $O(\epsilon)$. 
Mathematical investigations have been conducted 
to assess this point, leading to the concept of 
invariant slow manifold: a manifold M of the 
phase space, invariant upon the dynamics and 
describing the slow dynamics once the system has 
reached it (Gorban et al. 2004). Starting from an 
arbitrary point $z_0$, the trajectory first exhibits a 
fast transient bringing the system state close to M, 
up to some tolerance of order $O(\epsilon)$, then sticks to 
M. Its evolution on M is ruled by a reduced 
dynamics, far slower than the fast relaxation to 
M as soon as the system actually exhibits a 
timescale separation. This latter self-consistent 
assertion should be considered as a working 
hypothesis, to be validated by the explicit deter- 
mination of M and associated reduced dynamics. 
This can be done numerically, by exploiting the 
presumed convergence property of any trajectory 
reaching M after some intrinsic transients. In 
other words, if the dynamics possesses a slow 
invariant manifold, an operational way to find M 
is to let the system evolve, starting from a sample 
of initial conditions, and to observe its stabiliza- 
tion on M. 

This framework obviously embeds the quasistationary 
approximation presented in the last subsection: 
in this case, the slow invariant manifold is 
$M = \{z = (x, y),\ f(z) = 0\} = \{(x^*(y), y)\}$ and the dynamics 
restricted to M is the slow dynamics $dy/d\tau = G[y(\tau)]$, 
$x(\tau) = x^*[y(\tau)]$. Here the manifold is invariant 
upon the approximate dynamics (for all 
t, $f[z(t)] = 0$, hence $z(t) \in M$) but not upon the 
original one: some rigorous mathematical work has 
to be done to show that the actual dynamics keeps 
the trajectory in a proper neighborhood of M of 
width $O(\epsilon)$. In other words, one has to control the 
discrepancy between the exact trajectory and the 
trajectory slaved on M. 
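
The operational recipe described above (let a sample of initial conditions evolve and observe the stabilization on M) can be sketched as follows. The fast-slow toy system, the sample of initial conditions, and the function names are assumptions made for this illustration only; here the candidate manifold is $x = x^*(y) = y^2$.

```python
EPS = 0.02  # timescale separation (assumed value)

def rhs(x, y):
    # slow-time form: eps dx/dtau = y^2 - x (fast), dy/dtau = -x (slow)
    return (y * y - x) / EPS, -x

def evolve(x, y, tau_end, h=EPS / 20):
    """RK4 integration of the two-variable system over a slow-time span."""
    for _ in range(round(tau_end / h)):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = rhs(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y

def distance_to_manifold(x, y):
    """Defect with respect to the candidate manifold x = y^2."""
    return abs(x - y * y)

# sample of initial conditions, all well away from the manifold
samples = [(x0, y0) for x0 in (-1.0, 0.0, 2.0) for y0 in (0.5, 1.0)]
initial = [distance_to_manifold(x0, y0) for x0, y0 in samples]
final = [distance_to_manifold(*evolve(x0, y0, 0.3)) for x0, y0 in samples]

for (x0, y0), d0, d1 in zip(samples, initial, final):
    print(f"start ({x0:+.1f}, {y0:.1f}): defect {d0:.3f} -> {d1:.4f}")
```

After a slow-time span of a few multiples of $\epsilon$, every trajectory of the sample sits within a distance of order $\epsilon$ of the manifold, whatever its starting point.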


Central Manifold 


The notion of slow invariant manifold generalizes 
older results about central manifolds, exploited to 
reduce the dynamics near a bifurcation point. Let us 
consider a dynamical system $\dot{x} = f(x, \alpha)$ near a 
bifurcation point: at $\alpha = \alpha_c$, the fixed point $x_0$, 
stable for $\alpha < \alpha_c$, loses its stability. This reflects on 
the largest eigenvalue(s) of the stability matrix 
$Df(x_0, \alpha)$, namely $\lambda_1(\alpha) < 0$ for $\alpha < \alpha_c$, $\lambda_1(\alpha) > 0$ 
for $\alpha > \alpha_c$, and $\lambda_1(\alpha_c) = 0$. The small parameter is 
then $\epsilon = \lambda_1$. A main result was to show that, near the 
bifurcation point, slow modes coincide with 
unstable directions and fast modes with stable 
directions (Haken 1996). The decomposition into 
slow and fast variables is ruled by the central 
manifold theorem: the solutions can be expressed 
in terms of the amplitudes along the eigenvectors of 
the null space of the dynamics at $\epsilon = 0$; these 
amplitudes appear as the relevant order parameters 
near the bifurcation. This is referred to as the slaving 
principle. Compared to the setting presented in the 
subsection “Slow Invariant Manifolds,” the slow 
invariant manifold M is given here by the central 
manifold. 
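
A minimal numerical sketch of the slaving principle follows. The two-variable system used here is a textbook-style toy chosen for this illustration (it is not taken from the text): the mode x becomes unstable at $\alpha_c = 0$, the mode y relaxes at rate 1, and the central manifold is $y \approx x^2$ close to the bifurcation. The reduced amplitude equation $\dot{x} = \alpha x - x^3$ is compared with the full dynamics; all names and parameter values are assumptions.

```python
import math

ALPHA = 0.05  # control parameter, bifurcation assumed at alpha_c = 0

def full_rhs(x, y):
    # x: amplitude along the marginal direction, y: fast stable mode
    return ALPHA * x - x * y, -y + x * x

def reduced_rhs(x):
    # slaving y = x**2 inserted into the equation for x
    return ALPHA * x - x ** 3

def integrate_full(x, y, t_end, h=0.05):
    """RK4 integration of the full two-mode system."""
    for _ in range(round(t_end / h)):
        k1 = full_rhs(x, y)
        k2 = full_rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = full_rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = full_rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def integrate_reduced(x, t_end, h=0.05):
    """RK4 integration of the one-dimensional amplitude equation."""
    for _ in range(round(t_end / h)):
        k1 = reduced_rhs(x)
        k2 = reduced_rhs(x + 0.5 * h * k1)
        k3 = reduced_rhs(x + 0.5 * h * k2)
        k4 = reduced_rhs(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

x_full, y_full = integrate_full(0.05, 0.0, 400.0)
x_reduced = integrate_reduced(0.05, 400.0)

print("full    x, y :", x_full, y_full)
print("reduced x    :", x_reduced)
print("sqrt(alpha)  :", math.sqrt(ALPHA))
```

Both descriptions saturate at the amplitude $\sqrt{\alpha}$, and the fast mode ends up slaved to the order parameter ($y = x^2$ at the fixed point).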


Projection Techniques 


The methods presented in the previous subsections 
to eliminate fast variables and construct a reduced 
slow dynamics can be unified into a common 
framework: Mori-Zwanzig projection techniques. 
The full state (x, y) of the system is projected onto 
the slow variable y and the functions w(x, y) are 
projected onto their conditional expectation 

$$(Pw)(y) = \int w(x, y)\, \rho(x|y)\, dx$$  [15]

The core of the method lies in the choice of the 
conditional distribution $\rho(x|y)$, for instance, 
$\rho(x|y) = \delta(x - x^*(y))$ when there is an 
invariant manifold $x = x^*(y)$, or $\rho(x|y) = 1/2\pi$ in 
case of averaging over a rapidly varying phase x. We 
refer to Givon et al. (2004) for a review. 
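
The two choices of conditional distribution quoted above can be written down in a few lines. This is only a sketch: the test functions `w_phase` and `g_slow`, the manifold `x_star`, and the function names are hypothetical examples, and the phase average is evaluated with a plain midpoint rule.

```python
import math

def project_phase_average(w, y, n=2000):
    """(Pw)(y) for the uniform density 1/(2*pi) over a phase x in [0, 2*pi)."""
    dx = 2.0 * math.pi / n
    total = sum(w((i + 0.5) * dx, y) for i in range(n))  # midpoint rule in x
    return total * dx / (2.0 * math.pi)

def project_on_manifold(w, y, x_star):
    """(Pw)(y) for a Dirac density concentrated on x = x_star(y)."""
    return w(x_star(y), y)

def w_phase(x, y):
    # hypothetical function of a fast phase x and a slow variable y
    return y * math.cos(x) ** 2

def g_slow(x, y):
    # hypothetical slow vector field g(x, y)
    return -x

def x_star(y):
    # hypothetical invariant manifold x = x*(y)
    return y * y

averaged = project_phase_average(w_phase, 3.0)       # averaging over the phase
slaved = project_on_manifold(g_slow, 0.5, x_star)    # quasistationary closure

print(averaged, slaved)
```

The first projection reproduces the averaging method (the phase average of $y \cos^2 x$ is $y/2$), the second one the quasistationary closure $G(y) = g[x^*(y), y]$.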


Aggregation Techniques and Coarse-Grainings 


An intuitive guideline in the analysis of a multiscale 
dynamics is that collective variables or coherent 
states coincide with slow modes. The rationale is 
that numerous fast fluctuations at the level of agent 
dynamics self-average, so that only a slow trend is 
perceptible at large scale. Aggregation methods have 
been developed in this spirit to build reduced models 
governing the slow dynamics. Nevertheless, in 
generic situations, aggregation does not lead to 
closed equations for the collective variables and 
some level of approximation has to be introduced. 
Let us now consider a system of N coupled 
degrees of freedom, $[x_i(t)]_{i=1,\ldots,N}$ (e.g., a system of N 
interacting agents) evolving deterministically according 
to a two-scale dynamics (Auger and Bravo de la 
Parra 2000): 

$$\frac{dx_i}{dt} = f_i(x_1, \ldots, x_N) + \epsilon\, g_i(x_1, \ldots, x_N)$$  [16]


where $f_i$ describes a fast evolution due to the 
coupling between species and $g_i$ a slow evolution 
due to internal mechanisms. A natural choice for the 
slow variable is $Y(x_1, \ldots, x_N) = \sum_i x_i$, but we shall 
write below the general case. The self-consistent 
requirement of the method is that this variable Y 
reflect a global and slow behavior. Considering t as 
a fast time variable, this condition amounts to 
requiring a quasistatic behavior for Y at this timescale. 
In other words, the consistency condition requires 
that there exists a manifold $F_y$ such that 

$$\sum_i \frac{\partial Y}{\partial x_i} f_i = 0 \quad \text{on } F_y = \{Y(x_1, \ldots, x_N) = y\}$$  [17]

We, moreover, assume that the fast dynamics on this 
manifold $F_y$ leads to a stable equilibrium 
$(x_1^*(y), \ldots, x_N^*(y))$. We are then in a position to 
describe the slow evolution of the manifold itself, 
that is, the slow dynamics ruling the evolution of the 
aggregated variable y for $\epsilon$ small enough: 

$$\frac{dy}{dt} = \epsilon \sum_i \frac{\partial Y}{\partial x_i}\, g_i[x_1^*(y), \ldots, x_N^*(y)] + O(\epsilon^2)$$  [18]


Internal support of the procedure is to check the 
structural stability of this resulting aggregated 
dynamics. Compared to the quasistationary approx- 
imation and slaving principle presented earlier, here 
the slow variable is not given independently but 
constructed as a function of the fast variables 
(aggregated variable). The same principles can also 
be implemented for discrete-time models. 
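
The aggregation procedure can be illustrated on a two-patch toy model. Everything in the sketch below is an assumption made for illustration: fast linear migration between the two patches (rates `A`, `B`), slow linear growth in each patch (rates `R1`, `R2`), and the aggregated variable $y = x_1 + x_2$, which is conserved by the fast migration. It also assumes that the two-scale dynamics has the form $dx_i/dt = f_i + \epsilon g_i$.

```python
import math

EPS = 0.01             # ratio of slow to fast rates (assumed value)
A, B = 1.0, 2.0        # migration rates out of patch 1 and out of patch 2
R1, R2 = 1.0, -0.5     # slow local growth rates in patches 1 and 2

def full_rhs(x1, x2):
    f1 = -A * x1 + B * x2          # fast part: migration
    f2 = A * x1 - B * x2
    return f1 + EPS * R1 * x1, f2 + EPS * R2 * x2

def fast_equilibrium(y):
    """Stable equilibrium of the migration at fixed total y = x1 + x2."""
    return B * y / (A + B), A * y / (A + B)

def aggregated_rhs(y):
    # dy/dt = eps * sum_i (dY/dx_i) g_i(x*(y)), with dY/dx_i = 1 here
    x1s, x2s = fast_equilibrium(y)
    return EPS * (R1 * x1s + R2 * x2s)

def integrate_full(x1, x2, t_end, h=0.05):
    for _ in range(round(t_end / h)):
        k1 = full_rhs(x1, x2)
        k2 = full_rhs(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = full_rhs(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = full_rhs(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x1, x2

def integrate_aggregated(y, t_end, h=0.05):
    for _ in range(round(t_end / h)):
        k1 = aggregated_rhs(y)
        k2 = aggregated_rhs(y + 0.5 * h * k1)
        k3 = aggregated_rhs(y + 0.5 * h * k2)
        k4 = aggregated_rhs(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

x1, x2 = integrate_full(1.0, 0.0, 100.0)   # everything starts in patch 1
y_full = x1 + x2
y_agg = integrate_aggregated(1.0, 100.0)

print("full model      y =", y_full)
print("aggregated      y =", y_agg, "(exact:", math.exp(0.5), ")")
print("share in patch 1  =", x1 / y_full, "vs", B / (A + B))
```

The aggregated model here is a single linear equation with the averaged growth rate $(R_1 B + R_2 A)/(A + B)$; the full two-patch dynamics follows it up to corrections of order $\epsilon$.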

Coarse-graining can be seen as the spatial analog 
of aggregation techniques developed in the phase 
space: the real space is split into cells considered as 
elementary units at macroscopic scale, and all the 
small-scale physics is averaged over each cell, 
yielding the apparent state of each unit (described 
by a few “coarse-grained” variables) and the 
effective interactions between them. 

Let us cite two hydrodynamic examples. Eddy 
viscosity refers to an effective viscosity involved in 


coarse-grained hydrodynamics equations; the con- 
tribution of small-scale turbulent structures is 
accounted for in an integrated way in this para- 
meter, hence its name. It is typically lower than bare 
viscosity, even possibly reaching negative values at 
large enough Reynolds number, that is, at low 
enough bare viscosities. Cellular flows are space- 
periodic flows, thus exhibiting a natural spatial 
scale: the coarse-graining amounts to an intrinsic 
homogenization over each cell of the flow. 

Let us finally mention that coarse-grainings are 
involved in renormalization-group transformations 
once supplemented with the adequate rescalings (see 
the section “Renormalization: an iterated multiscale 
approach”). 

In conclusion, note that all these various 
multiscale approaches are closely related and can all 
be expressed as a specific projection technique in the 
extended phase space containing both fast and slow 
variables. For instance, aggregation techniques 
replacing the fast variables $(x_1, \ldots, x_N)$ by the slow 
collective variable $y = Y(x_1, \ldots, x_N)$ amount to the 
projection technique involving the slow invariant 
manifold $M = \{(x_1, \ldots, x_N, y) \mid y = Y(x_1, \ldots, x_N)\}$. 


Numerical Aspects 


In the community of applied mathematics, multi- 
scale methods refer specifically to numerical homo- 
genization, involving multigrid algorithms as, for 
instance, multiscale finite-element method, multigrid 
Monte Carlo, multigrid optimization, or annealing. 
Basically, the idea of numerical homogenization is 
to avoid the numerical cost of using a mesh of size 
$h \ll \epsilon$, where $\epsilon$ is the scale of the smallest-scale 
features of the dynamics, and to use jointly: 


* a fine mesh, to compute local quantities independently 
  (hence with a parallelized program); and 
* a coarse mesh, to compute global behavior using 
  effective parameters and homogenized quantities 
  determined in the prior fine-mesh computation. 


We refer to Gorban et al. (2004) for a review. 
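
The two-mesh idea can be sketched on the simplest possible case, steady one-dimensional diffusion $-(a(x/\epsilon)\,u')' = 0$ on [0, 1] with $u(0) = 0$, $u(1) = 1$, for which the effective coefficient of a cell is known to be the harmonic mean of $a$ over that cell. The oscillating coefficient, the mesh sizes, and the function names are assumptions of this illustration, not a method taken from the text.

```python
import math

EPS = 0.01  # period of the small-scale oscillation of the coefficient

def coefficient(x):
    """Rapidly oscillating diffusion coefficient a(x / eps) (toy choice)."""
    return 2.0 + math.sin(2.0 * math.pi * x / EPS)

def effective_coefficient(cell_start, cell_width, n_fine=200):
    """Fine-mesh step: harmonic mean of the coefficient over one coarse cell."""
    dx = cell_width / n_fine
    mean_inverse = sum(1.0 / coefficient(cell_start + (i + 0.5) * dx)
                       for i in range(n_fine)) / n_fine
    return 1.0 / mean_inverse

def coarse_flux(n_cells=100):
    """Coarse-mesh step: steady flux from the cell resistances in series."""
    width = 1.0 / n_cells
    # each cell is treated independently, so this loop could run in parallel
    resistance = sum(width / effective_coefficient(j * width, width)
                     for j in range(n_cells))
    return 1.0 / resistance

def resolved_flux(n_points=50_000):
    """Reference: flux computed on a single mesh resolving every oscillation."""
    dx = 1.0 / n_points
    resistance = sum(dx / coefficient((i + 0.5) * dx) for i in range(n_points))
    return 1.0 / resistance

flux_two_level = coarse_flux()
flux_reference = resolved_flux()
print(flux_two_level, flux_reference, math.sqrt(3.0))
```

For this coefficient the harmonic mean over one period equals $\sqrt{3}$, smaller than the arithmetic mean 2, which is why a naive coarse-mesh average of $a$ would overestimate the flux.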


Boundary Layers and Matched 
Expansions 


Purposes and Principles 


Multiscale approach to handle boundary layers was 
introduced in 1905 by Prandtl in fluid mechanics for 
situations where the solution of hydrodynamics 
equations far from the boundaries (“bulk” solution) 
does not match the conditions at the surface of the 
walls or obstacles. This typically originates in the 
presence of a multiplicative small factor $\epsilon$ in front of 
the highest-order derivative; accordingly, the flow 
exhibits two different scales in space: a thin 
boundary layer of width controlled by $\epsilon$ and the 
bulk domain. The idea is to perform two different 
perturbation methods in the layer and in the bulk, 
involving a different rescaling in order to focus on 
and give the ruling place to either the boundary 
conditions or the bulk dynamics (one also speaks of 
inner and outer expansions). Then these parallel 
perturbation expansions have to be bridged into a 
single global continuous solution. The matching 
principle is to identify the asymptotic behavior on 
the boundary side with the boundary condition of 
the bulk behavior (Nayfeh 1973): 


$$\lim_{r \to 0} X_{bulk}(r) = \lim_{\zeta \to \infty} X_{layer}(\zeta) \quad \text{with } \zeta = r/\epsilon$$  [19]


Boundary layers of hydrodynamics have numer- 
ous analogs: initial layers in chemical kinetics, skin 
layers in electrodynamics and edge layers in solid- 
state physics (Nayfeh 1973). Adaptation of this 
technique is to be developed to determine the 
complete dynamics in the slow-invariant-manifold 
approach, matching the fast relaxation towards the 
manifold with the slow motion onto the manifold. 
Let us finally note that the matched-expansion 
approach can benefit in each region of all the 
above-mentioned multiscale techniques. 


Time Analog: Implementation for Initial Layers 


We shall now work out the time analog of a 
boundary-layer problem on the abstract example 
encountered in [10], in the case when X rapidly 
evolves to a slaved equilibrium state $X^*(Y)$ but with 
initial conditions $Y(0) = y_0$ and $X(0) = x_0 \neq X^*(y_0)$. 
Obviously, the quasistationary approximation fails 
to describe the initial regime and its applicability 
has to be reconsidered. The general principle of 
boundary-layer analysis, namely the recourse to two 
different perturbation approaches, is implemented as 
follows: 


* For the initial regime, one solves the fast 
  dynamics with initial conditions $X(0) = x_0$ while 
  keeping $Y(t) = y_0$; this yields an approximate 
  solution $[X_{layer}(t), Y_{layer}(t)]$, satisfying the initial 
  conditions and valid at short times, as long as Y 
  has not evolved. 

* At longer times, the relevant variable is the 
  rescaled time $\tau = \epsilon t$ and the quasistationary 
  approximation described in the last section 
  applies. 


The consistency of the two perturbative 
approaches is ensured by the matching conditions 

$$\lim_{\tau \to 0} X_{bulk}(\tau) = \lim_{t \to \infty} X_{layer}(t), \qquad \lim_{\tau \to 0} Y_{bulk}(\tau) = \lim_{t \to \infty} Y_{layer}(t) = y_0$$  [20]

These conditions are actually satisfied since $X_{bulk}(\tau) = 
X^*[Y_{bulk}(\tau)]$, hence $\lim_{\tau \to 0} X_{bulk}(\tau) = X^*(y_0)$ and, by 
definition of $X^*$ (at fixed $Y(t) = y_0$), 
$\lim_{t \to \infty} X_{layer}(t) = X^*(y_0)$. 
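
As a worked illustration of the matching conditions, consider a linear toy example. It assumes that [10] has the form $dX/dt = f(X, Y)$, $dY/dt = \epsilon g(X, Y)$ (consistent with [13]) and chooses, purely for illustration, $f = Y - X$ and $g = -Y$, so that $X^*(Y) = Y$:

```latex
\begin{align*}
\text{layer } (Y \equiv y_0):\quad
  & X_{layer}(t) = y_0 + (x_0 - y_0)\, e^{-t}, \qquad Y_{layer}(t) = y_0, \\
\text{bulk } (\tau = \epsilon t):\quad
  & Y_{bulk}(\tau) = y_0\, e^{-\tau}, \qquad
    X_{bulk}(\tau) = X^*[Y_{bulk}(\tau)] = y_0\, e^{-\tau}, \\
\text{matching:}\quad
  & \lim_{\tau \to 0} X_{bulk}(\tau) = y_0 = \lim_{t \to \infty} X_{layer}(t), \\
\text{composite:}\quad
  & X(t) \approx X_{layer}(t) + X_{bulk}(\epsilon t) - y_0
    = (x_0 - y_0)\, e^{-t} + y_0\, e^{-\epsilon t}.
\end{align*}
```

The last line is the usual way of assembling a uniformly valid approximation in matched expansions: both solutions are added and their common limit, here $y_0$, is subtracted once.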


Some Typical Applications 


Enzymatic catalysis A matched singular perturbation 
approach is commonly encountered in chemical 
systems, for instance, in the derivation of the 
Michaelis-Menten kinetics for a single enzyme and 
the Hill cooperative kinetics for an allosteric 
enzyme (Murray 2002). Denoting by E the enzyme, 
by S the substrate, by ES the active complex, and by 
P the product, the single-enzyme catalytic transformation 
of S into P is described by the following 
scheme: 

$$S + E \underset{k'}{\overset{k}{\rightleftharpoons}} ES \overset{k_{cat}}{\longrightarrow} P + E$$  [21]

$$[S] = s, \quad [E] = e, \quad [ES] = c$$


where, as is well known, the enzyme is released at 
the end. Introducing dimensionless quantities 
(with $e_0$ the total enzyme concentration and $s_0$ the 
initial substrate concentration) 

$$\bar{t} = k e_0 t, \quad \bar{s} = \frac{s}{s_0}, \quad \bar{c} = \frac{c}{e_0}, \quad K_m = \frac{k' + k_{cat}}{k}, \quad \bar{K}_m = \frac{K_m}{s_0}, \quad \lambda = \frac{k_{cat}}{k s_0}, \quad \epsilon = \frac{e_0}{s_0}$$  [22]

the corresponding chemical kinetic equations can be 
written as 

$$\frac{d\bar{s}}{d\bar{t}} = -\bar{s} + \bar{c}\,(\bar{s} + \bar{K}_m - \lambda), \qquad \epsilon \frac{d\bar{c}}{d\bar{t}} = \bar{s} - \bar{c}\,(\bar{s} + \bar{K}_m)$$  [23]

Noticing that $\epsilon \ll 1$ (the enzyme is present in 
very small quantities compared to the substrate), 
a quasistationary approximation applies for the 
variable $\bar{c}$: it means that the intermediary species 
ES rapidly reaches a local equilibrium state 
$\bar{c} = \bar{c}^*(\bar{s}) = \bar{s}/(\bar{s} + \bar{K}_m)$. 
This yields the substrate evolution 

$$\frac{d\bar{s}}{d\bar{t}} = -\frac{\lambda\, \bar{s}}{\bar{s} + \bar{K}_m}$$  [24]

The initial condition is set only on the substrate: 
$s(0) = s_0$, that is, $\bar{s}(0) = 1$. It yields the well-known 
expression of the velocity $V = |ds/dt|_{t=0}$ as a 
function of the initial substrate concentration: 
$V(s_0) = e_0 k_{cat} s_0/(s_0 + K_m)$ (with a maximal value 
$V_{max} = e_0 k_{cat}$). The quasistationary value for the 
complex (dimensionless) concentration $\bar{c}^*(\bar{s} = 1) = 
1/(1 + \bar{K}_m)$ at $t = 0$ obviously differs from the actual 
initial condition $c(0) = 0$: besides, it is quite foreseeable 
that the transients leading the complex ES to its 
stationary value cannot be described using a 
quasistationary approximation. At short times, the 
relevant time variable is the fast rescaled time 
$\theta = \bar{t}/\epsilon$, leading to the equation describing the initial 
regime when supplemented with the actual initial 
condition $\bar{c}(0) = 0$, $\bar{s}(0) = 1$. The analysis is 
straightforwardly carried over, exactly as in the general 
abstract case, with a matching condition 
$\lim_{\theta \to \infty} \bar{c}_{layer}(\theta) = \lim_{\bar{t} \to 0} \bar{c}_{bulk}(\bar{t}) = 1/(1 + \bar{K}_m)$. 
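
The two regimes can be checked numerically. The sketch below assumes that the dimensionless kinetics takes the standard form given by Murray (2002), $ds/dt = -s + c\,(s + K - \lambda)$ and $\epsilon\, dc/dt = s - c\,(s + K)$ with $s(0) = 1$, $c(0) = 0$; the parameter values and all names (`K_M`, `LAM`, `layer_complex`, ...) are hypothetical. It compares the full dynamics with the quasistationary substrate equation at long times and with the initial-layer solution at short times.

```python
import math

EPS, K_M, LAM = 0.01, 1.0, 0.5   # assumed dimensionless parameter values

def full_rhs(s, c):
    return -s + c * (s + K_M - LAM), (s - c * (s + K_M)) / EPS

def qssa_rhs(s):
    # substrate equation obtained with c = c*(s) = s / (s + K_M)
    return -LAM * s / (s + K_M)

def integrate_full(t_end, h=EPS / 20):
    """RK4 integration of the full kinetics from s = 1, c = 0."""
    s, c = 1.0, 0.0
    for _ in range(round(t_end / h)):
        k1 = full_rhs(s, c)
        k2 = full_rhs(s + 0.5 * h * k1[0], c + 0.5 * h * k1[1])
        k3 = full_rhs(s + 0.5 * h * k2[0], c + 0.5 * h * k2[1])
        k4 = full_rhs(s + h * k3[0], c + h * k3[1])
        s += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        c += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return s, c

def integrate_qssa(t_end, h=EPS / 20):
    """RK4 integration of the quasistationary substrate equation from s = 1."""
    s = 1.0
    for _ in range(round(t_end / h)):
        k1 = qssa_rhs(s)
        k2 = qssa_rhs(s + 0.5 * h * k1)
        k3 = qssa_rhs(s + 0.5 * h * k2)
        k4 = qssa_rhs(s + h * k3)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return s

def layer_complex(theta):
    """Initial-layer solution for c, with the substrate frozen at s = 1."""
    return (1.0 - math.exp(-(1.0 + K_M) * theta)) / (1.0 + K_M)

s_long, c_long = integrate_full(1.0)     # bulk regime, t = 1
s_qssa = integrate_qssa(1.0)
s_short, c_short = integrate_full(EPS)   # initial layer, theta = t / eps = 1

print("t = 1  : s full", s_long, " s QSSA", s_qssa)
print("t = 1  : c full", c_long, " c*(s)", s_long / (s_long + K_M))
print("t = eps: c full", c_short, " c layer", layer_complex(1.0))
```

The layer solution tends to $1/(1 + K)$ for large $\theta$, which is the value of $c^*(s)$ at $s = 1$, in line with the matching of the two expansions.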


Kinetic theory Time-matched expansions have 
been developed in kinetic theory, for instance, to 
describe the fate of a tagged particle within a gas. In 
a first, short stage (kinetic stage) following the 
injection of the particle in the thermally equilibrated 
gas, the velocity distribution of the particle rapidly 
evolves due to collisions with gas molecules and 
associated momentum transfer. This stage lasts a 
few mean-free-times and it ends when the tagged- 
particle distribution is almost Maxwellian. Then, in 
a second stage (hydrodynamic stage), the distribu- 
tion slowly relaxes towards a spatially uniform 
distribution, ultimately equal to the equilibrium 
Maxwell-Boltzmann distribution; at each time, the 
velocity distribution is almost Maxwellian. The 
particle dynamics is described at the level of its 
distribution function by the Boltzmann equation, 
and the resolution (the so-called Chapman-Enskog 
method) is based on the above general principles. 


The adiabatic-piston problem A matched two- 
timescale perturbation approach has been developed 
for the adiabatic piston problem: an isolated cylinder 
filled with an ideal gas (noninteracting light particles 
of mass m) is separated in two compartments by a 
moving piston, of mass M, adiabatic in the sense that 
it has no internal degrees of freedom and does not 
conduct heat when fixed. The small parameter is the 
mass ratio $\epsilon = 2m/(M + m)$. It quantifies the efficiency 
of energy transfer between the gas particles 
and the piston upon elastic collisions, and the 
strength of the indirect coupling of the two gas 
compartments through the collisions of their particles 
with one and the same piston. The matched 
perturbation approach gives access both to a fast 
deterministic relaxation towards mechanical equili- 
brium, at timescales O(1), with no heat transfer 
between the compartments, and a slow fluctuation- 
driven evolution towards thermal equilibrium, where 
the heat transfer is achieved by the collision-induced 


coupling between the gas and the piston fluctuating 
motion, thus occurring at timescales O(M/m) (see 
Adiabatic Piston). 


Renormalization: An Iterated 
Multiscale Approach 


It is not the place to expose or even summarize the 
implementation of renormalization techniques, for 
which we refer to the associated entries in this 
Encyclopedia. Here we will only stress the natural 
relations between renormalization group (RG) and 
multiscale approaches. The RG approach indeed 
shares many steps and guiding principles: joint 
rescalings, coarse-grainings and local averaging, 
effective parameters and effective terms, relevant 
and irrelevant contributions, with a focus on large-scale 
behavior. Moreover, far beyond the scope of 
the study of critical phenomena, RG has been 
extended into an iterated multiscale approach 
that makes it possible to determine, in a systematic and 
constructive way, the effective equation describing the 
universal large-scale features and asymptotics of a 
multiscale system (see, e.g., Chen et al. (1996) and 
Mazzino et al. (2004)). 

It should first be underlined that different meanings 
are associated with the term “renormalization,” 
corresponding to very different statuses for the 
associated renormalization procedures. 

A renormalized quantity can be plainly a rescaled 
quantity (normalized, dimensionless or put to the 
scale of the considered sample): here arises a first 
connection with multiscale approaches, both involv- 
ing rescalings as an essential preliminary step. 

A renormalized quantity can be an effective 
quantity accounting in an integrated way for complicated 
underlying mechanisms (e.g., the renormalized 
mass of a body moving in a fluid, accounting 
for hydrodynamic effects); here arises another 
central notion of multiscale approaches: effective 
parameters or effective equations (following, e.g., 
from averaging or homogenization). 

Renormalization is also a mathematical technique 
developed first in celestial mechanics, and then 
mainly in quantum electrodynamics to regularize 
divergent expansions and perturbation series. It 
might proceed by means of resummation; the idea, 
implemented by Rayleigh in 1917, is to sum up 
correlations and interactions into a redefinition of 
the parameters. It might alternatively rely on the 
introduction of a cutoff in the space, time, and energy scales, 
then accounting in an effective way for the host 
of contributions at smaller space and time scales 
$\Delta x < \Lambda$, $\Delta t < \theta$ (or, equivalently, larger momentum 
and frequency scales: $k > 2\pi/\Lambda$, $\omega > 2\pi/\theta$) so as to 
take advantage of the physical cancellation of 
mathematical divergences. In any case, it turns the 
bare parameters of the original singular expansion 
into renormalized parameters and yields a renormalized 
regular expansion. Writing that the resulting 
large-scale behavior does not depend on the chosen 
cutoff $(\Lambda, \theta)$ yields renormalization equations, 
expressing quantitatively the very consistency of 
the procedure (“renormalizability” of the expansion). 
Renormalization provides alternative technical 
tools in instances treated above with the multiple-scale 
method. Its main advantage is its recursive 
structure: introducing a sequence $(\Lambda_n, \theta_n)_n$ of cutoffs 
(what is called momentum-shell RG), the whole 
procedure can be iterated to integrate recursively the 
influence of small-scale features on the asymptotic 
behavior, allowing us to handle situations exhibiting 
a hierarchy or even a continuum of scales. 

Renormalization also refers to an asymptotic 
analysis allowing us to classify critical behaviors, to 
determine quantitatively the critical exponents and to 
handle the associated divergences. Indeed, the 
above-mentioned multiscale approaches fail near bifurcation 
points or critical points. In this case, scale separation is 
replaced by scale invariance. The key idea underlying 
RG techniques is to shift the focus onto the scaling 
procedure itself. The basic point is to construct a 
renormalization transformation, consisting in joint 
coarse-grainings and rescalings, thus relating the two 
models describing the same phenomenon at different 
scales (Lesne 1998); it puts forward their self-similar 
properties and associated scaling laws, while eliminat- 
ing specific small-scale details having no consequences 
on the asymptotic, large-scale behavior. The set of 
renormalization transformations has a semigroup 
structure with respect to the rescaling factor (or plainly 
with respect to iteration), which justifies speaking of RG. It 
generates a flow in the space of models, whose fixed 
points correspond either to trivial or to critical 
situations according to their stability. It can be shown 
that the linear analysis of the renormalization trans- 
formation around a critical fixed point gives access to 
the critical exponents. Moreover, this analysis allows 
us to split the space of models into universality classes, 
each associated to the basin of attraction of a critical 
fixed point. Let us emphasize that scale invariance 
leads to a deep change in the modeling and investigations, 
shifting from a “physics focusing on the 
prediction of amplitudes” to a “physics of the 
exponents,” focusing on less specific, but more 
universal and, above all, more intrinsic features. 

Far more generally, RG is associated with a 
qualitative change in the questioning, since the 
study takes place in a space of models. Generalized 
renormalization transformations can be designed to 
extract not only self-similarity properties but any 
large-scale feature from a more microscopic model. 
In particular, RG can be specially designed to 
discriminate between essential and inessential terms 
in a model: the latter do not modify the asymptotics 
of the RG flow, meaning that they are of no 
consequence at large scales. In other words, generic 
properties of the renormalization flow in this space of 
models yield universal large-scale scaling properties. 
RG is thus essentially a multiscale approach, insofar 
as it only retains the relations between the different 
levels of descriptions, somehow ignoring the details at 
each given scale. It is actually designed to capture 
universal features of the multiscale organization. 


Summary: The Exemplary Case 
of Diffusion 


Bridging the Scales 


Our aim in this section is to present the whole range 
of multiscale approaches in use, allowing both to 
bridge models devised at different scales and to 
predict the large-scale features of the phenomenon 
they account for. We choose the context of diffu- 
sion, Brownian motion, and transport phenomena, 
where such a bridge is essential and has been much 
investigated. Indeed, transport coefficients are 
defined through phenomenological equations; it is 
thus necessary to relate such macroscopic equations 
with smaller-scale theories, so as to get an expres- 
sion of the coefficients in terms of the microscopic 
ingredients and to justify the validity of the 
phenomenological description. 

The exposition in the various subsections below, 
following increasing scales, will mark out the path- 
way from reversible molecular dynamics to macro- 
scopic diffusion equations. We shall thus come 
across the multiple-scale analysis of the Liouville 
equation describing at microscopic scales a Brown- 
ian grain suspended in a thermal bath of water 
molecules (see the next subsection) leading to the 
mesoscopic Kramers equation for the grain distribu- 
tion function P(r,v,t). Next, involving higher but 
still mesoscopic scales, we see that another multiple- 
scale analysis leads to the reduced Smoluchowski 
equation for its spatial distribution P(r,t). Random 
walks offer alternative mesoscopic models, involving 
effective diffusion coefficients in order to take into 
account underlying features like persistence length 
or other short-range correlations. Scaling limits or 
more systematic renormalization methods in real 
space make it possible to bridge discrete random-walk 
models with continuous descriptions. Another RG, based on 
a path-integral formulation in the framework of 
field theory, makes it possible to handle the case of 
self-avoiding walks with infinite memory. Homogenization 
is illustrated on the case of diffusion in a 
regular porous medium, whereas diffusion pro- 
cesses in fractal substrates provide a counterexam- 
ple, singular enough to exhibit anomalous scaling 
behavior. The issue of reducing the dynamics of the 
diffusion process to a simpler effective one is 
encountered in many other macroscopic instances, 
among which we shall mention diffusion in a 
periodic medium, lending itself to space averaging, and 
advection of a passive scalar field in a two-scale 
velocity field, where a multiple-scale analysis yields 
the effective diffusivity at large scale. We shall give 
further technical guidelines for constructing these 
steps climbing from molecular up to large macroscopic 
scales, thus providing additional illustrations 
of the multiscale approaches introduced in the 
previous sections on more general and abstract 
grounds. 


Microscopic Theory of Brownian Motion 


The first theoretical account of Brownian motion, 
namely the erratic movement of a micron-sized 
pollen grain suspended in a thermal bath, for 
example, water, dates back to 1905 and the famous 
paper by Einstein. It took almost 60 years before a 
microscopic theory was achieved; this theory has 
been further worked out using multiple-scale 
techniques (Cukier and Deutsch 1969). The chal- 
lenge is to start from the complete deterministic 
reversible dynamics of the system, described within 
a probabilistic framework by the Liouville equation 
$\partial\rho/\partial t = L\rho$ for the distribution of probability $\rho$ in 
the whole phase space (positions and velocities of 
the grain, of mass M, and all water molecules, of 
mass $m \ll M$). The small parameter is the mass 
ratio $\epsilon = \sqrt{m/M}$ measuring the efficiency of the 
energy transfer upon collisions between the grain 
and the bath particles, assuming a binary interaction 
potential $U = \sum_i u(|r_i - r|)$. The Liouville 
operator is decomposed into $L = L_0 + \epsilon L_1$, and 
one introduces rescaled time variables $\tau_n = \epsilon^n t$, 
where $\tau_0 = t$ is the timescale of the fluid particle 
dynamics. The multiple-scale method is carried out 
according to the general scheme, leading to the 
so-called Kramers equation, 


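$$\left(\frac{\partial}{\partial t} + v \cdot \frac{\partial}{\partial r}\right) P(r, v, t) = \zeta\, \frac{\partial}{\partial v} \cdot \left(v + \frac{kT}{M} \frac{\partial}{\partial v}\right) P(r, v, t)$$  [25]

where the friction coefficient is explicitly given as 

$$\zeta = \frac{1}{3MkT} \int_0^\infty \langle F_0 \cdot F_t \rangle\, dt$$  [26]

where $F_0 = -\nabla_r U$ and $F_t$ is this force evolved over a 
time t under the unperturbed dynamics $L_0$. 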


We refer to the original, although very pedagogical, 
paper by Cukier and Deutsch (1969) for a thorough 
exposition and discussion of this derivation. 


Mesoscopic Theory of Brownian Motion 


The multiple-scale method is also of relevance to 
determine the high-friction limit of the above 
Kramers equation. Standard perturbation technique 
with respect to the inverse of friction, $1/\zeta$, fails to 
describe the asymptotic regime: there is not enough 
freedom to fulfill all the solubility conditions 
required to avoid the appearance of secular divergences 
(Bocquet 1997). By contrast, the multiple-scale 
technique yields a uniform expansion of the evolution 
equation still valid at long times, thus making it 
possible to bridge two mesoscopic levels of description, 
namely the Kramers equation and the Smoluchowski 
equation for the spatial density $\rho(r, t)$ of the 
Brownian particle: 

$$\frac{\partial}{\partial t}\, \rho(r, t) = \frac{kT}{M\zeta}\, \frac{\partial}{\partial r} \cdot \frac{\partial}{\partial r}\, \rho(r, t)$$  [27]


Introducing dimensionless variables $\tau = t v_{th}/l$, $R = r/l$, 
$V = v/v_{th}$, where l is the size and $v_{th} = \sqrt{kT/M}$ 
the thermal velocity of the grain, the relevant small 
parameter appears to be the dimensionless inverse of 
the friction coefficient, $\epsilon = v_{th}/(l\zeta)$; hence, 

$$\left(\frac{\partial}{\partial \tau} + V \cdot \frac{\partial}{\partial R}\right) P(R, V, \tau) = \frac{1}{\epsilon}\, \frac{\partial}{\partial V} \cdot \left(V + \frac{\partial}{\partial V}\right) P(R, V, \tau)$$  [28]

If the friction is high (i.e., $\epsilon \ll 1$), the velocity 
relaxes very rapidly towards the equilibrium Maxwell 
distribution, and it is then enough to describe 
the (slow) evolution of the spatial distribution 
$\rho(r, t)$. Nevertheless, the relaxation stage is essential 
and accordingly the $\epsilon$-dependence is singular, as a 
rule when the small perturbation parameter multiplies 
the time derivative. 

According to the general procedure exposed in the 
section “Multiple-scale method: principles,” we introduce 
rescaled variables $\tau_0 = \tau$, $\tau_1 = \epsilon\tau$, $\tau_2 = \epsilon^2\tau, \ldots$ 
considered as independent variables and look for a 
solution of the Kramers equation of the form 
$P = P^{(0)} + \epsilon P^{(1)} + \epsilon^2 P^{(2)} + \cdots$, where the arguments 
of all the components $P^{(i)}$ are $(R, V, \tau_0, \tau_1, \tau_2, \ldots)$. 
Identifying term-wise the successive powers of $\epsilon$ yields 
a hierarchy of equations. At order 0, we obtain 
$P^{(0)} = \Phi(R, \tau_0, \tau_1, \tau_2, \ldots)\, e^{-V^2/2}$. The following equations, 
for the $[P^{(i)}]_{i \geq 1}$, involve the linearized operator 
$\mathcal{L} = \partial_V \cdot (V + \partial_V)$. For each of them, there appears a 
solubility condition, requiring that none of the additive 
contributions in the equation is an eigenvector of $\mathcal{L}$; 
involving the components $P^{(j)}$ with $j < i$, it prevents the 
appearance of a secular divergence in $P^{(i)}$. At order 1, 
the solubility condition is $\partial\Phi/\partial\tau_0 = 0$, thus determining 
the (trivial) $\tau_0$-dependence of $P^{(0)}$. In a similar way, 
the solubility condition at order 2 determines 
the $\tau_1$-dependence of $P^{(0)}$. This bridges the Kramers 
and Smoluchowski equations in the high-friction limit, 
when retaining only the first-order term in $\epsilon$. We refer 
to Bocquet (1997) for a pedagogical account of the 
derivation and discussion of its relation with the time- 
derivative expansion involved in the so-called Chap- 
man-Enskog solution of the Boltzmann equation. 
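
The first two orders of this hierarchy can be checked by hand. The short verification below is a sketch under two assumptions: the dimensionless Kramers equation is taken in the form $\epsilon\,(\partial_\tau + V \partial_R) P = \mathcal{L} P$ with $\mathcal{L} = \partial_V (V + \partial_V)$, and it is written in one velocity dimension for brevity; $\Phi$ denotes any prefactor independent of V.

```latex
\begin{align*}
\text{order } \epsilon^0:\quad
  & \mathcal{L} P^{(0)} = 0, \qquad
    \left(V + \frac{\partial}{\partial V}\right) e^{-V^2/2}
      = V e^{-V^2/2} - V e^{-V^2/2} = 0
    \;\Rightarrow\; P^{(0)} = \Phi\, e^{-V^2/2}, \\
\text{order } \epsilon^1:\quad
  & \mathcal{L} P^{(1)}
      = \left(\frac{\partial}{\partial \tau_0} + V \frac{\partial}{\partial R}\right) P^{(0)}, \\
\text{integrating over } V:\quad
  & 0 = \int \mathcal{L} P^{(1)}\, dV
      = \sqrt{2\pi}\, \frac{\partial \Phi}{\partial \tau_0}
        + \frac{\partial \Phi}{\partial R} \int V e^{-V^2/2}\, dV
      = \sqrt{2\pi}\, \frac{\partial \Phi}{\partial \tau_0}.
\end{align*}
```

The left-hand side vanishes because $\mathcal{L} P^{(1)}$ is a total derivative in V, so the order-1 equation can only be solved if $\partial\Phi/\partial\tau_0 = 0$: order 0 fixes the Maxwellian velocity dependence, and the dependence of $\Phi$ on R and on the slower times is left to the higher-order solubility conditions.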


Random-Walk Model and Weakly 
Correlated Diffusion 


Random walks are discrete-time mesoscopic models, 
accounting for the diffusing motion of a particle 
through the statistical properties of its successive 
steps, when observed at a given timescale $\tau$. The 
basic model (ideal random walk) assumes isotropic, 
independent and identically distributed steps of 
variance $a^2$. The central-limit theorem straightforwardly 
gives the time dependence of the mean-square 
displacement $R^2(t) \equiv \langle |r(t) - r(0)|^2 \rangle = a^2 t/\tau$, showing 
that the motion is a normal diffusion, with diffusion 
coefficient $D = a^2/(2d\tau)$ in dimension d. Note (see 
also the next subsection) that D depends on $\tau$ and a, but in 
a joint manner. Actually, the diffusion coefficient 
associated with a diffusive motion observed at scale a 
and modeled by a random walk on a lattice of 
parameter a can be written as $D = \alpha a^2$, where the rate 
$\alpha$ depends on a (effective rate at spatial resolution a): 
this is a sort of renormalization that accounts for the 
rate $\alpha(a)$ of all microsteps backward and forward of 
length far smaller than a. 

In case of short-range correlations between the
successive steps (namely if $\sum_t |C(t)| < \infty$, where
$C(t)$ is the statistical correlation function between
elementary steps separated by a time length $t$), direct
computations support a time-average-like result:
the asymptotic behavior is still described by a
normal diffusion law $R^2(t) \sim 2 d D_{\rm eff}\, t$, with $D_{\rm eff} =
D \sum_{t=-\infty}^{+\infty} C(t)$. When $C(t) = e^{-|t|/\tau}$,


$D_{\rm eff} = D\, \frac{1 + e^{-1/\tau}}{1 - e^{-1/\tau}}$


hence $D_{\rm eff} \approx 2\tau D$ if $\tau \gg 1$.
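
The closed form above can be checked numerically. The following is a minimal sketch, assuming that time is counted in integer numbers of steps and that the correlation sum runs over both positive and negative lags; the function names are illustrative and not taken from the text.

```python
import math

def correlation_sum(tau, t_max=None):
    """Truncated sum over t = -t_max..t_max of exp(-|t|/tau)."""
    if t_max is None:
        # Terms beyond this lag are smaller than about exp(-60).
        t_max = int(60 * tau) + 1
    return 1.0 + 2.0 * sum(math.exp(-t / tau) for t in range(1, t_max + 1))

def closed_form(tau):
    """(1 + e^{-1/tau}) / (1 - e^{-1/tau}), i.e. the ratio D_eff / D."""
    q = math.exp(-1.0 / tau)
    return (1.0 + q) / (1.0 - q)

# Compare the direct sum, the closed form, and the large-tau estimate 2*tau.
for tau in (0.5, 2.0, 20.0):
    print(tau, correlation_sum(tau), closed_form(tau), 2.0 * tau)
```

The last column approaches the first two only when the correlation time is much larger than one step, which is the regime in which the estimate $2\tau D$ applies.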


Renormalization Analysis in Case 
of Markovian Diffusion 


Trying to bridge lattice random walks with a
continuous description brings out the following
difficulty: as the step size $a$ goes to 0, one obviously has to
decrease the duration $\tau$ accordingly, but
by what amount is not so obvious, since the walker
velocity is ill-defined (it depends on the observation
scale). The proper joint rescaling
can be guessed from knowledge about the system obtained by
other means; alternatively, it can
be obtained in a systematic way, thanks to RG
methods. Let us explain the basic principle.

Let us denote by $P_{a,\tau}(x, y, t)$ the transition probability
governing the random walk, namely the
probability density to jump from $x$ to $y$ in time
$t$, where $x, y$ are restricted to the lattice $(a\mathbb{Z})^d$ and
time to $\tau\mathbb{N}$. The renormalization transformation
$\mathcal{R}_{k,\alpha}$ should express the consequence for $P_{a,\tau}$ of a
joint rescaling of space (by a factor of $k$) and time
(by a factor of $k^\alpha$). Taking into account the Markov
character of the walks, we are thus led to define


$[\mathcal{R}_{k,\alpha} P_{a,\tau}](x, y, t) = k^d\, P_{a,\tau}(kx, ky, k^\alpha t)$ in dimension $d$   [29]


The proper value of $\alpha$ is to be determined self-consistently
in order that the limit $\lim_{k\to\infty} \mathcal{R}_{k,\alpha} P_{a,\tau}$
exists (it is then a continuous transition probability
$P^*(x, y, t)$ defined on $\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}$). The root-mean-square
displacement


$R(P, t) = \left[ \sum_{x,y} |x - y|^2\, P(x, y, t) \right]^{1/2}$


is transformed according to

$R(\mathcal{R}_{k,\alpha} P, t) = k^{-1} R(P, k^\alpha t)$   [30]


Accordingly, it yields the diffusion law associated
with the fixed point $P^*$:

for any $k$, $R(P^*, t) = k^{-1} R(P^*, k^\alpha t)$, hence $R(P^*, t) \sim t^{1/\alpha}$   [31]


It is anomalous except if $\alpha = 2$. In the case of ideal
random walks, the proper exponent leading to a
nontrivial limit is $\alpha = 2$; this limit $P^*$ is the transition
probability of a Wiener process:


$W_D(x, y, t) = [4\pi D t]^{-d/2}\, e^{-(x-y)^2/4Dt}$ with $D = a^2/2d\tau$   [32]


This shows that all ideal lattice random walks
belong to the same universality class, that of the
Wiener process. This approach has been fruitfully
applied to diffusion in disordered systems, the issue
being to determine whether or not the disorder,
accounted for as a noise term in the transition
probabilities, modifies the normal diffusion law
obtained in the unperturbed situation. Similar
reasoning can also be implemented for self-similar
anomalous diffusion processes, like fractional Brownian
motions and Lévy flights (Lesne 1998).
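
The role of the exponent in the joint rescaling can be illustrated on the simplest case. The sketch below is only an illustration under stated assumptions: it takes a symmetric nearest-neighbour walk in dimension 1, computes its position distribution exactly (no sampling), and evaluates the rescaled root-mean-square displacement $k^{-1} R(P, k^\alpha t)$ of eqn [30]. The function names are hypothetical.

```python
import math

def walk_distribution(n_steps):
    """Exact distribution after n_steps of a symmetric +/-1 lattice walk.

    Returns {site index: probability}, with sites in units of the spacing a.
    """
    dist = {0: 1.0}
    for _ in range(n_steps):
        new = {}
        for site, p in dist.items():
            new[site - 1] = new.get(site - 1, 0.0) + 0.5 * p
            new[site + 1] = new.get(site + 1, 0.0) + 0.5 * p
        dist = new
    return dist

def rms_displacement(n_steps, a=1.0):
    """Root-mean-square displacement R(P, t) after n_steps steps."""
    dist = walk_distribution(n_steps)
    return a * math.sqrt(sum(p * site ** 2 for site, p in dist.items()))

def rescaled_rms(n_steps, k, alpha, a=1.0):
    """k^{-1} R(P, k^alpha t): the right-hand side of eqn [30]."""
    return rms_displacement(int(round(k ** alpha * n_steps)), a) / k

# With alpha = 2 the rescaled displacement does not depend on k.
for k in (2, 4, 8):
    print(k, rescaled_rms(16, k, 2))
```

With any other exponent the rescaled displacement drifts to zero or grows without bound as $k$ increases, so no nontrivial limit is reached; this is the self-consistency requirement that singles out the diffusive exponent for ideal walks.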


Renormalization Analysis for Self-Avoiding Walks 


Let us only mention, for the sake of completeness, the 
renormalization techniques developed for determining 
the conformational statistics of linear polymer chains, 
whose three-dimensional shape can be represented as 
the trajectory of a self-avoiding random walk. These 
techniques belong to the RG corpus developed in 
statistical mechanics for critical phase transitions, 
within a field-theoretic framework. A formal but 
exact analogy can actually be worked out between 
self-avoiding walks and a spin lattice system with
$n \to 0$, where $n$ is the number of spin components.

The multiscale nature of the system is so marked 
here that it should rather be qualified as an absence of 
characteristic scale. In this respect, standard RG 
methods developed for critical phenomena lie at the 
very boundary of multiscale approaches. Scale decoup- 
ling is replaced by scale invariance, which is somehow 
the conjugate situation: homogeneity in real space is 
replaced by homogeneity in the conjugate space (space 
of characteristic scales). Scale invariance is reflected here
in the self-similar property $R(N) \sim N^\nu$, relating the
end-to-end distance $R$ of the chain to the number $N$ of
elementary steps (the monomers), with an anomalous
exponent $\nu$ (the Flory exponent $\nu \approx 3/5$ in dimension
$d = 3$) originating from the infinite memory of the
nonoverlapping chain. We refer to Lesne (1998) and
references therein for a more detailed exposition of the
concepts and techniques only alluded to here.


Effective Diffusion in a Porous Medium 
(Homogenization) 


Describing the diffusion in a porous medium appears
as a formidable task at the pore level: it would
require us to account for all the boundary conditions
at the border of the hollow domain $\mathcal{V} \subset \mathcal{V}_0$ actually
accessible to diffusion. When the pores have a finite
characteristic size $a$, a homogenization approach can
be developed at scales far larger than $a$. It allows one to
account for the slowing down of the motion due to
obstacles through an effective diffusion coefficient (in plain
words, the black and white medium made of matter
and holes of size $a$ appears as a grey homogeneous
medium at larger scales). More specifically, a diffusing
tracer of random trajectory $r(t)$ experiences a
varying coefficient $D[r(t)]$ (it equals $D$ inside the
pores, whereas it vanishes in the nonaccessible region
$\mathcal{V}_0 - \mathcal{V}$). The idea is to replace this fluctuating
realization of the transport coefficient by its spatial
average (independent of the trajectory), as far as
macroscopic properties are concerned:


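$D_{\rm eff} = \int_{\mathcal{V}_0} D\, \chi_{\mathcal{V}}(r)\, d^d r = \int_{\mathcal{V}} D[r]\, d^d r$
(where $\chi_{\mathcal{V}}(r) = 1$ iff $r \in \mathcal{V}$)   [33]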


Rigorous mathematical theorems ensure that the 
large-scale motion can actually be described by a 
Fick law and associated plain diffusion equation 
(Bensoussan et al. 1978). 


Anomalous Diffusion in a Fractal Medium


The above homogenization for diffusion in a porous
medium works well only if the pores have a finite
characteristic size; by contrast, diffusion in a fractal
substrate (e.g., a porous medium with pores of all
sizes) generically leads to anomalous diffusion, associated
with a time dependence of the mean-square
displacement $R^2(t) \sim t^\gamma$ with $\gamma < 1$. In a fractal
substrate, the existence of obstacles and pores of all
sizes introduces spatial fluctuations at all scales and
long-range correlations in the spatial dependence of $D$.
This case corresponds to a critical situation and
homogenization fails to give a relevant description of
the macroscopic behavior, in the same way as mean-field
methods fail to account for critical phase transitions.
This is reflected in the anomalous exponent $\gamma < 1$ of the
diffusion law, which can be related to the fractal
characteristics of the substrate ($\gamma = d_s/d_f$, where $d_s$ is
the spectral dimension and $d_f$ the fractal dimension).
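
As a worked instance of the relation between the exponent and the two dimensions, the sketch below evaluates it for the two-dimensional Sierpinski gasket. This example is not taken from the text; the standard values $d_f = \ln 3/\ln 2$ and $d_s = 2\ln 3/\ln 5$ for that lattice are assumed.

```python
import math

def anomalous_exponent(d_s, d_f):
    """Exponent gamma in R^2(t) ~ t^gamma, given spectral and fractal dimensions."""
    return d_s / d_f

# Assumed standard values for the two-dimensional Sierpinski gasket.
d_f = math.log(3) / math.log(2)       # fractal dimension
d_s = 2 * math.log(3) / math.log(5)   # spectral dimension

gamma = anomalous_exponent(d_s, d_f)  # equals 2 ln 2 / ln 5
print(d_f, d_s, gamma)
```

The exponent comes out below 1, so the walker spreads more slowly than in normal diffusion, as stated above for fractal substrates.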


Effective Diffusion in a Periodic Potential 
(Averaging Method) 


In case of a periodic medium, where $D[r(t)]$ oscillates
with a small spatial period, an averaging procedure
can be developed as in the subsection “Effective
diffusion in a porous medium (homogenization),” to
determine an effective diffusion equation accounting
for the large-scale motion. Explicit computations
within a multiple-scale approach yield


$D_{\rm eff} = \frac{1}{\langle 1/D \rangle}$   [34]

where $\langle \cdot \rangle$ denotes a space average over the
elementary cell (Givon et al. 2004).

Let us rather detail the case of diffusion of a
Brownian particle in a periodic potential $U$, with
$U(x + L) = U(x)$ for any $x$ (restricting to dimension
1 for simplicity), at equilibrium at temperature $T$.
Let $D$ be the diffusion coefficient of this particle in the
absence of the potential. At large scales $\delta x \gg L$,
the substrate appears to be spatially uniform. The
influence of the periodic bias exerted by the
potential on the diffusive motion (superimposition
of a modulated deterministic drift) can be described
in an average way. The result is a normal diffusion
with a reduced effective diffusion coefficient


$D_{\rm eff}(U) = D\, \inf_f \int_0^L [1 - f'(x)]^2\, dm_U(x)$, with $dm_U(x) = \frac{e^{-U(x)/kT}\, dx}{\int_0^L e^{-U(y)/kT}\, dy}$   [35]

where the infimum is taken over the set of smooth
periodic functions of period $L$ and the average involves
the equilibrium distribution $m_U$ of the particle in the
potential landscape $U(\cdot)$. In particular, one sees
that no oriented motion can arise at
equilibrium, even if $U$ is asymmetric. The procedure
extends to dimension $d$ with only technical differences.
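
The variational formula [35] can be made concrete. Assuming that the equilibrium distribution is the normalized Boltzmann weight $e^{-U/kT}$ over one period (an assumption, with $kT$ the thermal energy), carrying out the minimization by an Euler-Lagrange argument, a step not spelled out in the text, gives $D_{\rm eff}/D = L^2 / [\int_0^L e^{-U/kT} dx \int_0^L e^{U/kT} dx]$. The sketch below evaluates this ratio with a midpoint rule; the function name and the sample potentials are illustrative only.

```python
import math

def effective_diffusion_ratio(U, L, kT, n=20000):
    """D_eff / D = L^2 / (int e^{-U/kT} dx * int e^{+U/kT} dx) over one period."""
    h = L / n
    xs = [(i + 0.5) * h for i in range(n)]   # midpoint rule on [0, L]
    z_minus = h * sum(math.exp(-U(x) / kT) for x in xs)
    z_plus = h * sum(math.exp(U(x) / kT) for x in xs)
    return L * L / (z_minus * z_plus)

L, kT = 1.0, 1.0
flat = lambda x: 0.0
cosine = lambda x: 2.0 * math.cos(2 * math.pi * x / L)
# An asymmetric (ratchet-like) potential and its mirror image.
ratchet = lambda x: math.sin(2 * math.pi * x / L) + 0.5 * math.sin(4 * math.pi * x / L)
mirrored = lambda x: ratchet(-x)

print(effective_diffusion_ratio(flat, L, kT))
print(effective_diffusion_ratio(cosine, L, kT))
print(effective_diffusion_ratio(ratchet, L, kT), effective_diffusion_ratio(mirrored, L, kT))
```

The ratio never exceeds 1 (by the Cauchy-Schwarz inequality), and it is the same for a potential and its mirror image, in line with the absence of oriented motion at equilibrium.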


Effective Diffusivity for a Passively 
Advected Scalar 


Still another fruitful implementation of the multiple-scale
method is encountered in the context of
diffusion and transport phenomena, in the study of
the advection by a given incompressible velocity
field $v(r, t)$ of a passive scalar field $\theta(r, t)$, for
example, the density of small inert “tracer” particles
advected by the fluid flow without acting back on it.
We consider the case when the fluid motion
can be decomposed into a large-scale, slowly varying
component and a small-scale, rapidly varying fluctuation:
$v(r, t) = U(r, t) + \lambda u(r, t)$. The parameter $\lambda$
controls the relative strength of these components.
Another small parameter $\epsilon$ is involved in this
problem: the ratio $\epsilon = l/L \ll 1$ of the typical length
scales $L$ and $l$ of $U$ and $u$, respectively. Here the
issue is to bridge two macroscopic descriptions: the
full hydrodynamic equation describing the evolution
of the scalar field $\theta(r, t)$


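$\partial_t \theta(r, t) + v(r, t) \cdot \nabla\theta(r, t) = D\, \Delta\theta(r, t)$   [36]

and a large-scale effective transport equation for an
average scalar field $\theta_L(r, t)$,

$\partial_t \theta_L(r, t) + U(r, t) \cdot \nabla\theta_L(r, t) = \sum_{i,j} \partial_i \left[ D^{\rm eff}_{ij}(r, t)\, \partial_j \theta_L(r, t) \right]$   [37]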


This procedure, which amounts to accounting in an average
way for the small-scale contributions to the
complete hydrodynamic description, relies on a
spatio-temporal generalization of the multiple-scale
method: it involves rescaled space and time variables,
$X = \epsilon x$, $\tau = \epsilon t$, $T = \epsilon^2 t$. The different characteristic
scales of the velocity components are directly
reflected in their arguments: $u(x, t)$ and $U(X, T)$. The
passive scalar field is now written $\theta(x, t, X, \tau, T)$ and
it is expanded as $\theta = \theta^{(0)} + \epsilon\theta^{(1)} + \epsilon^2\theta^{(2)}$. The standard
multiple-scale procedure leads one to introduce an
auxiliary field $\chi$:


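$\partial_t \chi_j + [(u + \lambda U) \cdot \partial]\chi_j - D\, \partial^2 \chi_j = -u_j$   [38]


yielding the effective diffusivity tensor (where $\langle\,\rangle$ is a
space average)


$D^{\rm eff}_{ij} = D\, \delta_{ij} + D \sum_p \langle \partial_p \chi_i\, \partial_p \chi_j \rangle$   [39]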


Advection enhances transport, and eddy diffusivity
is larger than molecular diffusivity. In realistic cases,
there is a continuum of scales, $u = \sum_n u_n$, where
$u_n$ has a characteristic scale $l_n \sim 2^{-n} l_0$. The multiple-scale
method is then to be iterated into an RG analysis,
achieving a recursive integration of the small and
fast scales into $D^{\rm eff}$, starting from the smallest and
fastest ones.


Conclusions 


Multiscale approaches allow one to predict the large-scale
behavior generated by a given model; even more,
they offer constructive tools to bridge models at
different scales for the same phenomenon. They
provide systematic and mathematically well-controlled
tools to turn faithful but intractable
models into effective reduced ones, thus lying at
the core of statistical mechanics, many-body dynamical
systems, and, more generally, all issues of
the still-in-progress complex systems science. Indeed,
in a complex system (that might be their very 
definition), levels are so interrelated that it is 
essential to investigate jointly all the scales, from 
elementary units up to the whole system, and its 
emergent properties; neither theoretical nor numer- 
ical approaches can alone consider all the levels 
together, showing the relevance, if not the necessity, 
of multiscale approaches. 

Basic preliminary issues are to determine the
proper elementary level, the proper collective variables,
and the relevant small parameters. Let us
remark that the implementation of a multiscale
technique rapidly faces the fundamental issue of
defining a macroscopic variable; it offers some clues,
indicating that a macroscopic variable might be a
phenomenological quantity observable at our scale,
a slow mode, or collective variable.

Multiscale approaches take advantage of the separation
of scales involved in the different mechanisms
at work in the phenomenon under consideration.
The basic idea, seen above at work in various
instances and different ways, is to somehow decouple
the different scales and to solve several simpler
single-scale problems. Any multiscale implementation
actually involves, at some stage and more or
less explicitly, a limiting process in which the scale
separation ratio $1/\epsilon$ tends to $\infty$: this limiting process
has to be carefully controlled in order that the
method can be applied to real situations. Finally, to
be successful, multiscale approaches should achieve
a trade-off between:


• accuracy (minimizing the loss of information
involved in the reduction or projection technique),

• efficiency and tractability (this is, e.g., one of the
major successes of hydrodynamics),

• robustness of the resulting reduced model (to be
checked a posteriori),

• flexibility (extending to heterogeneous systems
involving different components), and

• scope (bridging many different levels in order to
capture the whole hierarchical structure).


Let us conclude by emphasizing a most fruitful
benefit of multiscale approaches: they allow one to
investigate the structural stability of a model, in particular
to identify the relevant parameters and essential
mechanisms controlling large-scale features. In this
respect, they lead beyond the (necessarily restricted)
scope of a specific model and give an explicit account
of the observer's biased view, related to the scale of
observation. They hence contribute to a more
complete and controlled understanding of real
physical systems.

Finally, a short bibliographic guide to multiscale
approaches may be useful. Technical details
and several applications of multiscale perturbative
expansions, in particular the multiple-timescale
method, with references to the original papers,
can be found in Nayfeh (1973). Applications of the
multiple-scale method, fully worked out in a very
pedagogical way, can be found in the work of
Cukier and Deutsch (1969), Piasecki (1993),
Bocquet (1997), and Mazzino et al. (2004). An
acknowledged reference on homogenization techniques
and multiscale analysis in periodic media
is Bensoussan et al. (1978); see also the monographs
by Lochak and Meunier (1988) and
Berdichevsky et al. (1999). Two recent review
papers on multiscale approaches and reduction
techniques are Givon et al. (2004) and Gorban et al.
(2004). Basic principles and technical aspects of
scaling theories and RG approaches from a multiscale
viewpoint can be found in Lesne (1998).


See also: Adiabatic Piston; Averaging Methods; 
Bifurcations in Fluid Dynamics; Boltzmann Equation 
(Classical and Quantum); Central Manifolds, Normal 
Forms; Interacting Particle Systems and Hydrodynamic 
Equations; Korteweg-de Vries Equation and Other 
Modulation Equations; Localization for Quasiperiodic 
Potentials; Singularity and Bifurcation Theory; Stability 
Problems in Celestial Mechanics; Stationary Phase 
Approximation; Universality and Renormalization.


Introduction


The concept of negative refraction has caused a
revolution in classical optics and electromagnetic
theory in the past few years (Pendry 2004,
Ramakrishna 2005). If a material has negative
dielectric permittivity ($\varepsilon$) and negative magnetic
permeability ($\mu$) simultaneously at a given frequency
$\omega$, then it can be said to have a negative refractive
index defined as


$n = -\sqrt{\varepsilon\mu}$   [1]


Several peculiar consequences of Maxwell’s equations 
for the propagation of radiation in such a material 
were originally pointed out by Veselago (1968). But 
the lack of such natural materials failed to create much 
enthusiasm until recently when composite structured 
photonic materials have been shown to have negative 
refractive index (Smith et al. 2000, Shelby et al. 2001). 

The question then boils down to this: what constitutes
a material with negative $\varepsilon$ and $\mu$? Where the structure
varies spatially on a scale much less than the
wavelength of the incident radiation, composite
electromagnetic materials can be regarded effectively
as homogeneous media. A set of effective response
functions, the effective permittivity $\varepsilon_{\rm eff}$ and the
effective permeability $\mu_{\rm eff}$, can then be ascribed to
these materials. To develop a homogeneous view of
the electromagnetic properties of a medium composed
of discrete atoms and molecules was the
motivation for defining a permittivity $\varepsilon$ and permeability
$\mu$. The simplicity provided by such a description
cannot be overstated. Provided the radiation
cannot resolve the underlying structure, replicating
the atoms of a material with structure on a larger
scale therefore represents a straightforward extension
of the original concept.


If we consider arrays of structures defined by a
unit cell of dimension $d$, then our effective
description of the response of the medium to
electromagnetic radiation of angular frequency $\omega$
will be valid provided that


$d \ll \lambda = 2\pi c_0/\omega$   [2]


This restriction ensures that the underlying structure 
of the medium will merely refract and not scatter the 
incident radiation, in which case an effective 
permittivity and permeability for the medium 
become valid. The above inequality defines 
the long wavelength or effective medium limit 
(Garland and Tanner 1978). Maxwell’s equations, 
written in the absence of free charges and external 
currents, 


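$\nabla \cdot D = 0, \quad \nabla \times E = -\frac{\partial B}{\partial t}$   [3]

$\nabla \cdot B = 0, \quad \nabla \times H = \frac{\partial D}{\partial t}$   [4]

together with the constitutive relations:

$B(\omega) = \mu_0\, \mu_{\rm eff}(\omega)\, H(\omega)$   [5]

$D(\omega) = \varepsilon_0\, \varepsilon_{\rm eff}(\omega)\, E(\omega)$   [6]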

then provide us with a complete description of the 
electromagnetic properties of the material over the 
frequency range of interest. Note that the effective- 
medium parameters are a function of the frequency 
as the material polarization response depends on the 
time history of the applied fields (Landau et al. 
1984). These effective parameters were then general- 
ized to analytic complex functions to account for 
absorption, and to second-ranked tensors to describe 
anisotropic responses. 

The real parts of these effective material parameters
can be negative; there is nothing
fundamentally wrong about that. Provided that they
are dispersive, that is, they vary as a function of
frequency, and dissipative as a consequence of the
famous Kramers–Kronig relations (Landau et al.
1984), such materials are causally possible. Simultaneously
negative values of $\varepsilon_{\rm eff}$ and $\mu_{\rm eff}$ change the
nature of electromagnetic radiation in these media.
For example, the wave vector in such isotropic
media points opposite to the Poynting vector and
gives rise to many new interesting effects such as
modified refraction, negative Doppler shifts, etc.
Such materials can support a variety of surface 
electromagnetic modes, which can have dramatic 
effects such as the possibility of a perfect lens which 
has unlimited image resolution (Pendry 2000) and is 
not subject to the traditional diffraction limit. 
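
The sign choice in eqn [1] can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' derivation: it adds a small absorption to $\varepsilon$ and $\mu$ (positive imaginary parts, for an $e^{-i\omega t}$ time dependence) and builds the index as the product of the two principal square roots, which keeps the imaginary part of $n$ non-negative, so that waves decay in a passive medium. The function name is hypothetical.

```python
import cmath

def refractive_index(eps, mu):
    """n = sqrt(eps) * sqrt(mu), using principal square roots.

    Assumes a passive medium: Im(eps) >= 0 and Im(mu) >= 0 with an
    exp(-i*omega*t) time dependence, so that Im(n) >= 0.
    """
    return cmath.sqrt(eps) * cmath.sqrt(mu)

loss = 0.01j  # small absorption, assumed for illustration
print(refractive_index(1.0 + loss, 1.0 + loss))    # ordinary medium
print(refractive_index(-1.0 + loss, 1.0 + loss))   # only eps negative
print(refractive_index(-1.0 + loss, -1.0 + loss))  # both negative
```

With both real parts negative the real part of the index comes out negative, while a single negative parameter gives a predominantly imaginary index, that is, an evanescent wave.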

New artificial electromagnetic composite structures,
often referred to as “meta-materials,” allow
us to access values of these material parameters
which are not found in naturally occurring materials.
We will show here how to obtain negative
values of $\varepsilon_{\rm eff}$ and $\mu_{\rm eff}$ in meta-materials using a
variety of resonance phenomena. Then we will
look at the problem of imaging with subdiffraction
resolution using negative refractive index
materials.


Artificial Plasmas 


From the electromagnetic viewpoint, a plasma can
be represented as a medium with dielectric permittivity
whose real part is negative. The Coulomb
force and the finite mass of the electrons combine to
give an ideal plasma a dispersion in the relative
permittivity, $\varepsilon(\omega)$, given by


$\varepsilon(\omega) = 1 - \frac{\omega_p^2}{\omega^2}$   [7]


where the plasma frequency is defined by $\omega_p^2 =
(\rho e^2)/(\varepsilon_0 m_e)$, $\rho$ is the number density of electrons,
$e$ is the electronic charge, and $m_e$ is the electron
mass. The permittivity of the plasma is negative at
frequencies below the plasma frequency.

A plasma-like behavior characterizes the electron 
gas in the noble and alkali metals, with a plasma 
frequency typically at ultraviolet frequencies. 
Because of the presence of dissipation, at lower 
frequencies resistive effects dominate and the plas- 
mons cannot be excited. To obtain materials with 
negative dielectric permittivity at low frequencies, a 
lower plasma frequency is required corresponding to 
more massive particles and a lower particle density 
p. A structure consisting of a three-dimensional 
lattice of very thin wires simulates a low-density 
plasma of very heavy charged particles and is shown 
in Figure 1 (Pendry et al. 1998). A simple model
allows us to describe the desired reduction in $\omega_p$ in
such a structure.

First consider a displacement of the electrons in
the wires along one of the cubic axes. Only the wires
directed along that axis are active and thus provide a
lowered effective density of electrons, $\rho_{\rm eff}$, given by
the fraction of the area occupied by the active wires. Thus,


$\rho_{\rm eff} = \rho\, \frac{\pi a^2}{d^2}$   [8]

Figure 1 A periodic structure composed of infinite conducting
wires arranged in a simple cubic lattice. Provided the factor a/d
is small enough, the structure responds to incident electromagnetic
waves as a plasma of very heavy charged particles.


An even more profound effect of constraining the 
electrons to run along thin wires is a result of the 
induced magnetic field which wraps the wires as 
the electrons are in motion. Suppose a current I 
flows in the wires. The magnetic field is 


$H(R) = \frac{I}{2\pi R} = \frac{\rho a^2 v e}{2R}$   [9]


where $R$ is the distance from the wire center, $v$ is the
electron drift velocity, and $\rho e$ is the charge density in
the wire. In terms of the magnetic vector potential,
the magnetic field is


$H(R) = \mu_0^{-1}\, \nabla \times A(R)$   [10]

where

$A(R) = \frac{\mu_0 a^2 \rho v e}{2} \ln(d/R)$   [11]

and d is the lattice spacing. The importance of the 
divergence of the magnetic field with the wire radius 
as seen in eqn [9] is the contribution to the canonical 
electronic momentum given by eA. If we neglect the 
variation of the fields with distance from the wire 
center, we can view this contribution as defining a 
new effective mass for the electrons given by 


$m_{\rm eff} = \frac{\mu_0 a^2 e^2 \rho}{2} \ln(d/a)$   [12]


Now the effective plasma frequency for the system


$\omega_p^2 = \frac{\rho_{\rm eff}\, e^2}{\varepsilon_0 m_{\rm eff}} = \frac{2\pi c_0^2}{d^2 \ln(d/a)}$   [13]


is seen to be much reduced. As an example, the
plasma frequency of 1 μm aluminum wires spaced
by 10 mm is about 2 GHz, and the corresponding
electronic effective mass is almost 15 times that of
a proton! The factors of effective mass and charge
density cancel leaving an expression comprising 
only the macroscopic system parameters. This is to 
be expected as a circuit analysis in terms of a 
capacitance and inductance can also be used to 
formulate the problem. However, such an 
approach can obscure the true nature of the 
problem which is encapsulated as a low-frequency 
plasma oscillation. Inclusion of the finite resistivity 
of the metal yields a finite lifetime for the plasmon 
excitation. Experiments have shown that a reduction
in the plasma frequency of six orders of
magnitude from the ultraviolet to the microwave
region can be achieved in these thin-wire composites
(Pendry et al. 1998).
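
The thin-wire estimates can be reproduced with a few lines of arithmetic. The sketch below assumes the forms $\rho_{\rm eff} = \rho\pi a^2/d^2$, $m_{\rm eff} = \frac{1}{2}\mu_0 a^2 e^2 \rho \ln(d/a)$ and $\omega_p^2 = 2\pi c_0^2/[d^2 \ln(d/a)]$ for eqns [8], [12] and [13]. The conduction-electron density of aluminum used here is an assumed round value, and reading the quoted 1 μm as the wire radius is also an assumption, so the outputs should be compared with the figures in the text only at the order-of-magnitude level.

```python
import math

MU0 = 4e-7 * math.pi        # vacuum permeability, H/m
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
C0 = 299792458.0            # speed of light in vacuum, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_PROTON = 1.67262192e-27   # proton mass, kg

def effective_density(rho, a, d):
    """Eqn [8]: electron density diluted by the wire filling fraction."""
    return rho * math.pi * a ** 2 / d ** 2

def effective_mass(rho, a, d):
    """Eqn [12]: effective mass arising from the wire self-inductance."""
    return 0.5 * MU0 * a ** 2 * E_CHARGE ** 2 * rho * math.log(d / a)

def plasma_frequency(a, d):
    """Eqn [13]: angular plasma frequency (rad/s); depends on geometry only."""
    return math.sqrt(2 * math.pi * C0 ** 2 / (d ** 2 * math.log(d / a)))

rho_al = 1.8e29      # assumed electron density of aluminum, m^-3
a, d = 1e-6, 1e-2    # assumed wire radius and lattice spacing, m

omega_p = plasma_frequency(a, d)
print("plasma frequency (GHz):", omega_p / (2 * math.pi) / 1e9)
print("m_eff / m_proton:", effective_mass(rho_al, a, d) / M_PROTON)
```

Evaluating the plasma frequency from the effective density and mass gives the same number as the geometric formula, which is the cancellation of mass and charge density noted above.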


Artificial Magnetism 


Although the Maxwell equations [3]–[4] are symmetric
in the electric and magnetic fields, we are yet
to discover a free magnetic pole. The magnetism we
find in natural materials is limited to spin systems
and restricts the values of $\mu_{\rm eff}$. Up to microwave
frequencies, magnetic activity is common and
certain insulating ferromagnets and antiferromagnetic
compounds such as MgF2 and FeF2 can
even exhibit a negative permeability at some
frequencies. However, large losses can accompany
the magnetic activity in these materials.

Recently, it has become clear that a wide variety
of composite structures comprising resonant inclusions
can display magnetic activity in the effective
medium limit (Pendry et al. 1999). Efficient screening
of AC magnetic fields can be achieved using a
thin cylindrical shell of metal or superconductor. In
order to obtain a large magnetic response such that
the modulus of the magnetic susceptibility $|\chi_m| > 1$,
what we require is a resonant over-screening
material response. A collection of subwavelength-sized
structures that exhibits such an over-screening
response can constitute a negative $\mu_{\rm eff}$ material.
One such resonant subwavelength structure is the
so-called split-ring resonator (SRR), which can be
scaled to form magnetic meta-materials from
microwave to optical frequencies (Pendry et al.
1999, O'Brien and Pendry 2002b). An SRR
structure which has been demonstrated experimentally
to have a resonant magnetic response at
microwave and THz frequencies is depicted in
Figure 2a (Smith et al. 2000). It comprises two planar
rings of metal on an insulating backing. The rings
couple inductively to the magnetic field normal to
the plane of the rings. Because of the large
capacitance between the rings, the structure resonates
at some frequency. Driven by the back
electromotive force (emf), a large response is
expected in the vicinity of the resonance frequency
which is also antiphased in a small frequency range
above the resonant frequency. If the SRRs are much
smaller than the free-space wavelength, a collection
of such SRRs would behave as a negative $\mu_{\rm eff}$
material at these frequencies.

Figure 2 (a) The split-ring resonator structure. The structure is
planar with an internal radius $R$. The metal rings are of width $w$
and are separated by a spacing $g$. (b) Generic dispersion
relationship, $\omega$ vs. $k$ (wave vector), for a resonant structure with an isotropic
effective permeability as in eqn [14].

Theoretical calculations (Pendry et al. 1999) 
assuming a nondispersive metal show that a periodic 
lattice of such structures is characterized by a 
magnetic permeability given by 


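$\mu_{\rm eff} = 1 - \frac{f\omega^2}{\omega^2 - \omega_0^2 + i\Gamma\omega}$   [14]

where $f = \pi R^2/d^2$ is the filling factor,

$\omega_0^2 = \frac{3 l c_0^2}{\pi R^3 \ln(2w/g)}$   [15]


gives the resonant frequency $\omega_0$, and the damping of the
resonance is determined by the factor

$\Gamma = \frac{2l}{\sigma R \mu_0}$   [16]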


Here $d$ is the lattice spacing, $R$ is the inner radius of
the ring, $w$ is the width of the rings, $l$ is the distance
between adjacent planes of SRRs, and $\sigma$ is the
conductance per unit length of the rings measured
along the circumference. Orientation of planar SRRs
along all three Cartesian axes allows for the creation
of an isotropic material. Figure 3 shows the generic
dispersion of the $\mu(\omega)$ given by eqn [14]. A higher
resistivity for the material of the SRR would
broaden the resonance and the frequency region
with ${\rm Re}(\mu) < 0$ might vanish altogether for large
resistivity.

Figure 3 The generic magnetic response of the SRR
structure. ${\rm Re}(\mu) < 0$ in a frequency band above the resonance
frequency.

For isotropic homogeneous materials with a
resonant effective permeability as in eqn [14] we
can illustrate a generic dispersion relationship, $\omega$ vs.
$k$, shown in Figure 2b. The solid lines represent
twofold degenerate transverse modes and the
dispersionless longitudinal magnetic plasmon
mode at the magnetic plasmon frequency ($\omega_{mp}$).
The dashed lines are a band of propagating states
with a linear dispersion determined by the
polarizability of the SRRs and a flat band of
resonant states at the magnetic resonance frequency
$\omega_0$. The gap in the dispersion can be
regarded as arising from the hybridization and
avoided crossing of these bands. The important
points to note are:


1. Wherever $\mu_{\rm eff}$ is negative there is a gap in the
dispersion relationship. This is the case for $\omega_0 <
\omega < \omega_{mp}$, where $\omega_{mp}$ is the frequency at which $\mu_{\rm eff} = 0$. Only
evanescent modes with imaginary wave vector
exist in this region.

2. A longitudinal magnetic plasma mode, which
shows no dispersion, appears at $\omega = \omega_{mp}$.
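
The two points above can be checked on a model permeability. The sketch below assumes the resonant form $\mu_{\rm eff} = 1 - f\omega^2/(\omega^2 - \omega_0^2 + i\Gamma\omega)$ for eqn [14]; with $\Gamma = 0$ its zero lies at $\omega_{mp} = \omega_0/\sqrt{1 - f}$. The parameter values are arbitrary illustrative choices in units where $\omega_0 = 1$.

```python
import math

def mu_eff(omega, f, omega0, gamma):
    """Assumed form of eqn [14]: 1 - f w^2 / (w^2 - w0^2 + i Gamma w)."""
    return 1.0 - f * omega ** 2 / (omega ** 2 - omega0 ** 2 + 1j * gamma * omega)

def magnetic_plasma_frequency(f, omega0):
    """Zero of mu_eff in the lossless limit: w_mp = w0 / sqrt(1 - f)."""
    return omega0 / math.sqrt(1.0 - f)

f, omega0, gamma = 0.3, 1.0, 0.01   # illustrative values, not from the text
omega_mp = magnetic_plasma_frequency(f, omega0)
print("omega_mp / omega0 =", omega_mp)

# Re(mu) is positive below the resonance, negative in the gap, positive above.
for omega in (0.9, 1.1, 1.3):
    print(omega, mu_eff(omega, f, omega0, gamma))
```

Increasing the damping in this model broadens the resonance and shrinks the band where the real part is negative, as described for high-resistivity rings.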


An alternative approach to obtaining a nonzero 
magnetic susceptibility in composite media is pro- 
vided by the zeroth-order transverse electric (TE) 
Mie resonance in dielectric particles. Ferroelectric 
and phonon polaritonic materials are promising 
candidates for providing the necessary large dielec- 
tric constants up to infrared frequencies (O’Brien 
and Pendry 2002a). 


The high-frequency scaling properties of the SRR
offer an interesting insight. The plasma-like dielectric
permittivity of noble metals


$\varepsilon(\omega) = \varepsilon_1 + i\varepsilon_2 = \varepsilon_\infty - \frac{\omega_p^2}{\omega(\omega + i\gamma)}$   [17]

is essentially a large negative real number for $\omega_p >
\omega \gg \gamma$. For a 2D array of simplified SRRs consisting
of a single conducting ring with symmetrically
placed small capacitive gaps, the quasistatic effective
magnetic permeability for a magnetic field applied
normal to the plane of the SRR is (O'Brien and
Pendry 2002b)


$\mu_{\rm eff} = 1 - \frac{f'\omega^2}{\omega^2 - \omega_0^2 + i\Gamma\omega}$   [18]


where $f' = L_g f/(L_g + L_i)$, $\Gamma = L_i\gamma/(L_g + L_i)$, and
$\omega_0^2 = [(L_g + L_i)C]^{-1}$. In the above expressions,
$L_g = \mu_0 \pi R^2$ is the geometrical inductance per unit
length of the structure and $C = \varepsilon_0 \varepsilon\, \tau/(n_c d_c)$ is the
capacitance per unit length of the structure for series
connection. Here it has been assumed that the
thickness of the SRR ($\tau$) is small compared to the
skin depth $\delta \sim c_0/\omega_p$.

An additional inductive impedance in the structure,
the kinetic or inertial inductance, $L_i =
2\pi R/(\varepsilon_0 \omega_p^2 \tau) = 2\mu_0 \pi R \delta^2/\tau$, determines the effective
filling fraction and damping of the resonance through
the ratio of the two contributions to the total
inductance. This contribution to the inductance arises
from the finite electron mass and implies that simply
decreasing the size of the resonators indefinitely will
not result in our being able to realize a strong
magnetic response at near-infrared or optical frequencies.
As the dimensions of the structure are
reduced, the fraction of the energy of the displacement
current associated with the inertial mass of the
electrons increases. A finite $\gamma$ then means that
dissipative losses increase. Thus, strong damping of
the resonance will be avoided if the quantity $R\tau/2\delta^2$
is large. We note here that with $\delta$ equal to the
London penetration depth, this ratio also determines
the screening efficiency of low-frequency magnetic
fields by a thin layer of superconductor. This result
points to a broader similarity between the low-frequency
electromagnetic properties of the superconducting
condensate and those of a perfect plasma.

Other nanocomposites in addition to the SRR
have been proposed which may lead to a magnetic
response at optical frequencies. These include pairs
of nanometer-sized metallic sticks where simultaneous
electric and magnetic dipole resonances lead
to a strongly dispersive effective permittivity and
permeability.


Negative Refractive Index Media 


Interleaving the structures for a negative cog and jug 
can create a composite with eg < 0 and peg < 0 at a 
common frequency (w) (Smith et al., Shelby et al. 
2001), which as predicted by Veselago (1968) should 
give rise to a material with negative refractive index. 
Although this appears intuitively correct, it is actually 
nontrivial that the electromagnetic fields of the two 
composites do not interfere with each other's function 
(Pokrovsky and Efros 2002) and this could depend 
crucially on the relative placement of the two 
structures (Marques and Smith 2004). However, 
there is now overwhelming experimental and numer- 
ical evidence that such composite structures possess 
negative refractive index (see Ramakrishna (2005, 
section 6)). Now consider a medium with predominantly
real $\varepsilon$ and $\mu$. For $\varepsilon > 0$ and $\mu > 0$, we have
our usual optical materials. If only one of $\varepsilon$ or $\mu$ is less
than zero while the other is positive, the
medium cannot support any propagating
modes. This is a consequence of Maxwell's equations:


$$\mathbf{k}\cdot\mathbf{k} = \varepsilon(\omega)\mu(\omega)\,\frac{\omega^2}{c_0^2} < 0 \qquad [19]$$


which implies that only evanescently decaying waves
with an imaginary component of $\mathbf{k}$ are possible.
Common examples are ordinary metals with $\varepsilon < 0$
and $\mu > 0$. Now consider a medium with both $\varepsilon < 0$
and $\mu < 0$, or a negative refractive index medium.
Maxwell's equations for a time-harmonic plane
wave $\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]$ are:


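$$\mathbf{k}\times\mathbf{E} = \omega\mu_0\,\mu(\omega)\,\mathbf{H} \qquad [20]$$


$$\mathbf{k}\times\mathbf{H} = -\omega\varepsilon_0\,\varepsilon(\omega)\,\mathbf{E} \qquad [21]$$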


The “left-handedness” of the triad $(\mathbf{E}, \mathbf{H}, \mathbf{k})$ is clear
from these equations for $\varepsilon(\omega), \mu(\omega) < 0$. A real
refractive index means that waves propagate with the
direction of energy flow given by the Poynting vector,


$$\mathbf{S} = \mathbf{E}\times\mathbf{H} \qquad [22]$$


opposite to the direction of the wave vector. Since 
the group velocity is in the direction of the energy 
flow, we conclude that in these left-handed materials 
(LHMs) the group velocity and the phase velocity 
are oppositely directed. The phase accumulated in 
propagating a distance $x$ is $\Delta\phi = -\sqrt{\varepsilon\mu}\,\omega x/c_0$. Thus,
the refractive index can be taken to be $n = -\sqrt{\varepsilon\mu}$,
that is, a negative quantity. Mathematically, it is
more reasonable to ask which sign of the square root
determines the wave vector given by eqn [19].
It can be shown by arguments of analytic continuity 
in the complex plane that the negative sign has to be 
Figure 4 Illustration of Snell's law at an interface between 
two media with (a) positive refractive index (VAC/RHM) and 
(b) negative refractive index (VAC/LHM). The arrows indicate the 
wave vectors and the energy flow is opposite to the wave vector 
in the negative index medium. 


chosen for propagating waves when $\mathrm{Re}(\varepsilon) < 0$ and
$\mathrm{Re}(\mu) < 0$ (Ramakrishna 2005).

The negative refractive index has real effects on 
the behavior of radiation even in basic processes 
such as refraction. Consider an interface between 
vacuum and a negative refractive index medium 
with $n < 0$, shown in Figure 4. Continuity conditions
on the electromagnetic fields at the interface require,
for a plane wave incident from the vacuum side at
an oblique angle, that the parallel wave vector $k_x$ is
conserved for the transmitted and reflected waves.
This is the origin of Snell's law:


$$\sin(\theta_i) = \sin(\theta_r) = n_-\sin(\theta_t) \qquad [23]$$


where $\theta_i$, $\theta_r$, and $\theta_t$ are the angles of incidence,
reflection, and transmission, respectively. The flow
of energy across the interface determines the direc- 
tion of the group velocity in the material medium as 
being away from the interface. Therefore, the 
component of the phase velocity vector normal to 
the interface must change sign as we pass from 
vacuum into the material medium. We are then 
forced to conclude that the ray is bent toward the 
same side of the surface normal as the incident 
wave. This picture is consistent with Snell’s law with 
the interpretation that $n_- < 0 \Rightarrow \theta_t < 0$. Figure 4 illustrates
this point, which has been experimentally
verified by several groups (Shelby et al. 2001, 
Parazzoli et al. 2003, Eleftheriades et al. 2002). 
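
As a small numerical illustration of eqn [23], the following Python sketch evaluates the transmission angle for vacuum incidence. The function name `refraction_angle` and the sample index values are assumptions made for this example only; a negative index simply returns a negative angle, that is, a ray on the same side of the surface normal as the incident ray.

```python
import math


def refraction_angle(theta_i: float, n_minus: float) -> float:
    """Transmission angle (radians) from sin(theta_i) = n_minus * sin(theta_t).

    Assumes incidence from vacuum and a real, nonzero index n_minus.
    """
    s = math.sin(theta_i) / n_minus
    if abs(s) > 1.0:
        raise ValueError("no propagating transmitted wave for this angle")
    return math.asin(s)


if __name__ == "__main__":
    theta_i = math.radians(30.0)
    for n in (1.5, -1.0, -1.5):  # illustrative values, not taken from the text
        theta_t = refraction_angle(theta_i, n)
        print(f"n = {n:+.1f}: theta_t = {math.degrees(theta_t):+.2f} deg")
```

For an index of exactly -1 the transmitted angle is the mirror image of the incident one, which is the geometry behind the flat-slab lens.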

As a direct consequence of this, it is seen that a 
flat slab of negative refractive medium can act as a 
lens as shown in Figure 5. Provided that the slab is 
of sufficient thickness, the refracted rays from a 
point source come to a focus inside the slab and 
upon exiting the slab the rays are redirected again 
such that they come to a focus on the opposite side 
of the slab (Veselago 1968). Veselago also predicted 
a negative Doppler shift in such media and an 
obtuse angle cone for Cerenkov radiation. 
Figure 5 Steady-state passage of rays (representing the energy
flow) of light from vacuum through a slab made of a LHM with
$n = -1$. The slab acts as a lens mapping a point on the object plane
to a point on the image plane.


Perfect Lens: Subwavelength Imaging 


A wave analysis of the Veselago lens revealed an 
extremely novel aspect: it did not suffer from the 
diffraction limit and the image resolution could be 
infinite (Pendry 2000), if the negative index 
material were perfectly nondispersive and nonab- 
sorbing. Before we analyze this, let us first briefly 
review the problem of imaging and the diffraction 
limit. 

Any object is visible because it emits or scatters 
light. The problem of imaging is then concerned 
with reproducing the electromagnetic field distribution
on a 2D object plane in the 2D image plane. If
$\mathbf{E}(x, y, 0)$ is the electric field on the object ($z = 0$)
plane, the fields in free space can be decomposed
into the Fourier components $k_x$ and $k_y$, and
polarization defined by $\sigma$:


$$\mathbf{E}(x, y, z; t) = \sum_{\sigma, k_x, k_y} \mathbf{E}_\sigma(k_x, k_y)\,\exp[i(k_x x + k_y y + k_z z - \omega t)] \qquad [24]$$


where


$$\mathbf{E}_\sigma(k_x, k_y) = \int_{x,y} \mathbf{E}_\sigma(x, y, 0)\, e^{-i(k_x x + k_y y)}\, dx\, dy \qquad [25]$$

In the above expression, the source is assumed to be
monochromatic with frequency $\omega$, $k_x^2 + k_y^2 + k_z^2 =
\omega^2/c_0^2$, $c_0$ is the speed of light in free space, and
z is the optical axis. A conventional lens acts by 
applying a phase correction to each of the propaga- 
ting components so that they reassemble to a focus 
at a point beyond the lens. For these components $k_z$
is real, thus a phase change is all that is required to
form an image containing these components. The
higher spatial details in an object, however, are
described by the nonpropagating near-field components
with an imaginary $k_z$, where $k_x^2 + k_y^2 > \omega^2/c_0^2$.
A conventional lens cannot restore these 


components in the image plane as they decay 
exponentially in amplitude as one moves away 
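from the source. Hence the resolution, $\Delta$, provided
by a conventional lens is limited to those components
with


$$k_x^2 + k_y^2 < \frac{\omega^2}{c_0^2} \quad\Rightarrow\quad \Delta \simeq \frac{2\pi c_0}{\omega} = \lambda \qquad [26]$$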
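
The distinction between propagating and evanescent Fourier components can be made concrete with a short Python sketch. It assumes the free-space dispersion relation quoted above; the helper names `kz_component` and `amplitude_after`, the 500 nm wavelength, and the 100 nm feature period are illustrative assumptions, not values from the text.

```python
import cmath
import math


def kz_component(kx: float, ky: float, k0: float) -> complex:
    """k_z from kx^2 + ky^2 + kz^2 = k0^2, with k0 = omega / c0.

    For kx^2 + ky^2 > k0^2 the argument is a negative real number and
    cmath.sqrt returns a positive-imaginary root, i.e. a wave that
    decays with increasing z.
    """
    return cmath.sqrt(k0**2 - kx**2 - ky**2)


def amplitude_after(kx: float, ky: float, k0: float, z: float) -> float:
    """Magnitude of exp(i * kz * z) after propagating a distance z."""
    return abs(cmath.exp(1j * kz_component(kx, ky, k0) * z))


if __name__ == "__main__":
    wavelength = 500e-9                 # assumed free-space wavelength (m)
    k0 = 2 * math.pi / wavelength
    coarse = 0.5 * k0                   # propagating component
    fine = 2 * math.pi / 100e-9         # 100 nm period: evanescent component
    for name, kx in (("coarse", coarse), ("fine", fine)):
        print(name, amplitude_after(kx, 0.0, k0, wavelength))
```

The coarse component keeps unit amplitude, while the fine one is suppressed by many orders of magnitude within a single wavelength, which is why a phase-correcting lens cannot recover it.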
Now consider a slab of medium with $\varepsilon = -1$
and $\mu = -1$ and of thickness $d$. It can be shown


(Pendry 2000) that the transmission and reflection 
coefficients are 


$$\lim_{\varepsilon\to-1,\ \mu\to-1} t = \exp(-ik_z d) \qquad [27]$$


$$\lim_{\varepsilon\to-1,\ \mu\to-1} r = 0 \qquad [28]$$


respectively, where $k_z$ is the component of the wave
vector normal to the interface. Thus, the slab
reverses the phase advance for the propagating
waves as revealed by the ray picture. Analytic
continuation to imaginary wave vectors $k_z = i\kappa_z$
implies that the transmittance $t = \exp(+\kappa_z d)$, that
is, the slab also increases the amplitude of the 
evanescent waves in transmission at exactly the 
same rate as the rate of the decay in free space 
outside. Thus, each wave, propagating or evanes- 
cent, arrives at the image plane with its phase or 
amplitude restored exactly to the values at the object 
plane so as to perfectly reconstruct the image. The 
lens is also perfectly impedance matched and has 
zero reflection. These incredible properties have led 
the phenomenon to be called “perfect lensing.” 
Note that there is no energy flux associated with 
purely evanescent waves, and hence the amplifica- 
tion obtained in the steady state corresponds to local 
field enhancements which would imply the presence 
of localized resonances. In fact, the entire mechan- 
ism of the focusing of the near-field components is 
due to surface modes that reside on the surfaces of 
these negative index materials (Ramakrishna 2005). 
$\varepsilon = -1$ and $\mu = -1$ are precisely the conditions for
these surface modes of electric and magnetic nature,
respectively. These are the surface plasmon resonances,
which are excited resonantly by the evanescent
modes, and the secret to the perfect lens is that all
the surface modes are completely degenerate.
Although the conditions for realizing a perfect lens 
are easy to specify, in practice these are very difficult to 
meet. The requirement of negative values for $\varepsilon$ and $\mu$
implies that these quantities must disperse necessarily 
with frequency and be dissipative. Thus, the perfect- 
lens condition can only be met approximately at a 
single frequency. Any deviation from the ideal 


conditions can then result in the excitation of slab 
polariton resonances which can swamp the image. The 
effects of absorption, which are always present, can 
also seriously degrade the lens performance by damp- 
ing out the surface plasmon resonances (Ramakrishna 
2005). Consider the transmission for the P-polarized 
radiation through a negative index slab: 


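$$t_p(k_x) = \frac{4\,(k_{z+}/\varepsilon_+)(k_{z-}/\varepsilon_-)\,e^{ik_{z-}d}}{D} \qquad [29]$$

where

$$D = \left(\frac{k_{z+}}{\varepsilon_+} + \frac{k_{z-}}{\varepsilon_-}\right)^2 - \left(\frac{k_{z+}}{\varepsilon_+} - \frac{k_{z-}}{\varepsilon_-}\right)^2 e^{2ik_{z-}d}$$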


Under the perfect-lens conditions, the first term in 
the denominator goes to zero for evanescent waves 
and the exponential in the second term decays faster 
than the exponential in the numerator. However, if
there were a mismatch in the conditions ($\varepsilon_+ = 1$ and
$\varepsilon_- = -1 + \delta$, say), then the first term in the denominator
no longer vanishes. In the large wave vector
limit ($k_x \gg \omega/c_0$), the two terms in the denominator
become approximately equal when


[30] 


thus yielding a criterion for the largest wave vector 
for which there is effective amplification. The 
dependence through the logarithm on the deviations 
(whether real or imaginary) from the resonant 
conditions underlines the fact that the perfect lens 
effect is indeed very sensitive. In practice, the
periodicity, $d$, of the structure of the metamaterials
comprising the negative index slab itself
imposes an upper wave vector cutoff $k_c = 2\pi/d$. The
material will become spatially dispersive for wave
vectors $k \sim k_c$, and for $k > k_c$ the very description as
a homogeneous material will break down.
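
The sensitivity of the evanescent amplification to a small mismatch can be explored numerically. The sketch below is a hedged illustration, not the article's own calculation: it assumes the standard P-polarized slab transmission formula (a slab of thickness `d` between two vacuum half-spaces), and `cutoff_wavevector` implements this sketch's own large-$k_x$ estimate, $k_x \approx -(1/d)\ln|\delta/2|$, obtained by equating the two denominator terms for $\varepsilon_- = -1 + \delta$. All function and parameter names are hypothetical.

```python
import cmath
import math


def slab_transmission_p(kx, d, eps_minus, mu_minus=-1.0, k0=0.0,
                        eps_plus=1.0, mu_plus=1.0):
    """P-polarized transmission through a slab of thickness d.

    k0 = omega / c0; k0 = 0 corresponds to the quasistatic limit.
    The result does not depend on the sign chosen for kz inside the slab.
    """
    kz_plus = cmath.sqrt(eps_plus * mu_plus * k0**2 - kx**2)
    kz_minus = cmath.sqrt(eps_minus * mu_minus * k0**2 - kx**2)
    a = kz_plus / eps_plus
    b = kz_minus / eps_minus
    phase = cmath.exp(1j * kz_minus * d)
    denom = (a + b) ** 2 - (a - b) ** 2 * phase**2
    return 4 * a * b * phase / denom


def cutoff_wavevector(delta, d):
    """Large-kx estimate of where the two denominator terms become equal."""
    return -math.log(abs(delta) / 2.0) / d


if __name__ == "__main__":
    d, delta = 1.0, 1e-3                # illustrative units and mismatch
    kc = cutoff_wavevector(delta, d)
    for kx in (0.5 * kc, 3.0 * kc):     # well below and well above the cutoff
        t = slab_transmission_p(kx, d, eps_minus=-1.0 + delta)
        print(f"kx = {kx:6.2f}: |t| = {abs(t):.3e}, exp(kx d) = {math.exp(kx * d):.3e}")
```

Well below the estimated cutoff the transmission follows the ideal growth $\exp(k_x d)$, while well above it the transmission collapses. Because the cutoff depends only logarithmically on $\delta$, reducing the mismatch by an order of magnitude extends the usable range of wave vectors by only about $2.3/d$.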

An important simplification of the perfect-lens 
conditions results when we consider a situation in 
which all length scales in the problem are much less 
than the wavelength of the light (the quasistatic 
approximation). Under these conditions, the electric 
and magnetic fields effectively decouple. If we 
consider the case of P-polarized fields, it can be 
shown (Pendry 2000) that in the quasistatic limit 
only the value of the permittivity is important, and 
there are essentially no conditions on the value of 
the permeability. This brings metals such as silver 
into the picture as the permittivity of silver becomes 
equal to —1 in the optical region of the spectrum 
and with relatively small losses (Pendry 2000). To 
overcome the losses, a series of refinements of the 
simple thin-slab picture have been proposed includ- 
ing dividing the lens into a series of layers and using 
optical amplification to act against the deleterious
effects of absorption (Ramakrishna 2005).


The Generalized Perfect-Lens Theorem 


The negative refractive slab can be considered as 
“optical antimatter” in the sense that it cancels out the 
effects on radiation of the traversal through an equal 
amount of positive refractive index medium. This 
cancelation is applicable to the phase changes for the 
propagating modes and the amplitude changes to the 
evanescent modes. In fact, the focussing action can 
happen for more general situations where the require- 
ment of homogeneity of the slab material can be 
relaxed. Now consider the more general situation 
where the dielectric permittivity and the magnetic 
permeability are arbitrary functions of the spatial 
coordinates: 


$$\varepsilon_+ = \varepsilon(x, y), \qquad \mu_+ = \mu(x, y) \qquad [31]$$


$$\varepsilon_- = -\varepsilon(x, y), \qquad \mu_- = -\mu(x, y) \qquad [32]$$


corresponding to Figure 6. We will consider the
imaging axis to be the z-axis. Thus, we see that the 
system is antisymmetric with respect to the z=d 
plane. It turns out (Pendry and Ramakrishna 2003) 
that such a system also transfers the image of a 
source placed at the z=0 to the z=2d plane in the 
same exact sense that it includes both the propagat- 
ing and evanescent components. In general, the rays 
in spatially varying media will not be straight lines 
as shown in Figure 6, but the effect of propagating 
through the positive medium is nullified by the 
negative medium. Thus, to an observer on the right- 
hand side, it would appear as if the region between 
z=0 and z=2d did not exist. We will call such 
media with the same sense of transverse spatial 
variation but with opposite signs as optical com- 
plementary media, and the effect of any such pairs 
of complementary media on radiation is null. 


Figure 6 A pair of complementary optical media nullify the 
effect of each other for the passage of light. Spatially varying 
positive and negative refractive indices are schematically depicted 
by the white or shaded regions. 
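The most general conditions on the permittivity
and permeability tensors for such complementary
behavior are:


$$\varepsilon_+ = \begin{pmatrix} \varepsilon_{xx} & \varepsilon_{xy} & \varepsilon_{xz} \\ \varepsilon_{yx} & \varepsilon_{yy} & \varepsilon_{yz} \\ \varepsilon_{zx} & \varepsilon_{zy} & \varepsilon_{zz} \end{pmatrix}, \qquad
\mu_+ = \begin{pmatrix} \mu_{xx} & \mu_{xy} & \mu_{xz} \\ \mu_{yx} & \mu_{yy} & \mu_{yz} \\ \mu_{zx} & \mu_{zy} & \mu_{zz} \end{pmatrix} \qquad [33]$$

and

$$\varepsilon_- = \begin{pmatrix} -\varepsilon_{xx} & -\varepsilon_{xy} & +\varepsilon_{xz} \\ -\varepsilon_{yx} & -\varepsilon_{yy} & +\varepsilon_{yz} \\ +\varepsilon_{zx} & +\varepsilon_{zy} & -\varepsilon_{zz} \end{pmatrix}, \qquad
\mu_- = \begin{pmatrix} -\mu_{xx} & -\mu_{xy} & +\mu_{xz} \\ -\mu_{yx} & -\mu_{yy} & +\mu_{yz} \\ +\mu_{zx} & +\mu_{zy} & -\mu_{zz} \end{pmatrix} \qquad [34]$$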


and a perfect focus results whenever the two slabs of 
positive and negative media have such a behavior (see 
Pendry and Ramakrishna (2003) and Ramakrishna 
(2005) for the proof). This theorem clearly shows that 
the dependence along the x- and y-directions trans- 
verse to the imaging axis z is completely irrelevant as 
long as the two slabs are optically complementary. As 
an extension, it can be shown that any system of 
optically complementary media will also have a
perfect focus as long as the system has a plane of 
antisymmetry normal to the optical axis. The above 
effects have also been numerically verified for several 
such spatially varying complementary media (Pendry 
and Ramakrishna 2003). 
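
The sign pattern of eqn [34] can be written compactly as $\varepsilon_- = -S\,\varepsilon_+\,S$ with $S = \mathrm{diag}(1, 1, -1)$, a reflection of the imaging axis. The Python sketch below applies that pattern to a 3x3 tensor stored as nested lists; the function name `complementary_tensor` and the sample entries are assumptions made purely for illustration.

```python
def complementary_tensor(tensor):
    """Return the complementary tensor following the sign pattern of eqn [34].

    Components with both indices in (x, y), and the zz component, change
    sign; the mixed xz, yz, zx, zy components keep their sign.
    """
    signs = (1.0, 1.0, -1.0)  # diagonal of S = diag(1, 1, -1)
    return [[-signs[i] * signs[j] * tensor[i][j] for j in range(3)]
            for i in range(3)]


if __name__ == "__main__":
    eps_plus = [[1.0, 2.0, 3.0],   # arbitrary illustrative entries
                [4.0, 5.0, 6.0],
                [7.0, 8.0, 9.0]]
    for row in complementary_tensor(eps_plus):
        print(row)
```

Applying the map twice returns the original tensor, consistent with the two slabs playing symmetric roles about the plane of antisymmetry.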


Perfect Lens in Other Geometries 


The above generalized perfect-lens theorem along with 
a method of coordinate transformations can enable us 
to now generate a variety of superlenses in different 
geometries. In general, if we can find a geometric 
transformation that maps a given configuration into 
the geometry for the generalized slab lens, then we 
would have generated one more arrangement that will 
exhibit the property of transferring images of sources 
in a perfect sense. If we define the new coordinates 
$q_1(x, y, z)$, $q_2(x, y, z)$, and $q_3(x, y, z)$ (assumed orthogonal),
then in the new frame, the material parameters
and fields are given by (Ward and Pendry 1996) 

l 1 


where 


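$$Q_i^2 = \left(\frac{\partial x}{\partial q_i}\right)^2 + \left(\frac{\partial y}{\partial q_i}\right)^2 + \left(\frac{\partial z}{\partial q_i}\right)^2 \qquad [37]$$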


Note that a distortion of space results in a change
of the $\varepsilon$ and $\mu$ tensors in general. Thus, in many cases,
the transformed geometry would involve spatially 
varying (inhomogeneous) and anisotropic medium 
parameters. 

The change in geometry can also make it possible 
for us to realize lenses with curved surfaces. The 
original slab lens maps every point on the object plane 
to another point on the image plane. But the size of 
the image is identical to that of the source. This is due 
to the invariance in the transverse direction, so that the
transverse wave vector $(k_x, k_y)$ is preserved. In
general, to change the size of the images, the 
translational symmetry would have to be broken and 
curved surfaces will necessarily be needed. The 
focussing action for the evanescent waves is crucially 
dependent on the near degeneracy of the surface 
plasmons in the case of the slab, and curved surfaces, 
in general, have a completely different dispersion for 
the surface plasmons. Thus, one should expect that 
inhomogeneous materials will be required for such 
curved lenses of negative refractive index. It can be 
shown (Ramakrishna 2005) that mapping the slab 
lens into cylindrical coordinates 


$$x = r_0 e^{\ell/\ell_0}\cos\phi, \qquad y = r_0 e^{\ell/\ell_0}\sin\phi, \qquad z = Z \qquad [38]$$


where $\ell_0$ is some scale factor ($= 1$), generates a
cylindrical annulus of inner and outer radii $a_1$
and $a_2$, respectively, with the material parameters
given by 


E, = py = =] 
Eo = Ho = —1 [39] 
Eg — Hg — -1/r 


for the annular region. The positive material outside 
the annular region should vary as 


Ey = p, = +1 
E$ = pij = +1 [40] 
Eg = pg = +r 


where $r = r_0\exp(\ell/\ell_0)$. This system transfers images
in and out of the cylindrical annulus, and the image
of a source inside at $r = a_0$ will be formed on the
surface $a_3 = a_0(a_2/a_1)^2$. Thus, there will be a
magnification of the image by the factor


$$M = \left(\frac{a_2}{a_1}\right)^2 \qquad [41]$$


Note that these cylindrical lenses are also short-sighted
in the same manner as the slab lens. They
can focus sources from inside to the outside
only when $a_1^2/a_2 < r < a_1$, and the other way
around, from outside to the inner world, when the
source is located in $a_2 < r < a_2^2/a_1$.

Similarly, the transformation into spherical coordinates
$(r = r_0 e^{\ell/\ell_0}, \theta, \phi)$ can be used to generate a
spherical perfect lens, wherein a spherical shell of
negative refractive material with $\varepsilon(r) \sim -1/r$ and
$\mu(r) \sim -1/r$, with arbitrary dependence along $\theta$ and $\phi$
(which could be constant too!), has the property of
perfectly transferring images of sources in and out of
the shell (Pendry and Ramakrishna 2003). This 
spherical lens also has exactly the same magnifica- 
tion factor given by eqn [41]. In fact, the solutions in 
these two cases of a cylinder and sphere can also be 
obtained by a more conventional electromagnetic 
calculation in terms of the scattering modes 
(Ramakrishna 2005). One can obtain even more 
esoteric configurations such as one or two intersect- 
ing corners of negative refracting materials that 


behave as perfect lenses (Pendry and Ramakrishna 
2003). 


Other Approaches to Negative Refraction 


There is also an approach to negative refractive 
materials based on loaded transmission lines 
(Eleftheriades et al. 2002), which has been imple- 
mented at radio- and microwave frequencies using 
lumped circuit elements. These show all the hall- 
marks of a negative refractive material within an 
effective medium approach. 

Effects which can be interpreted as negative 
refraction have been observed in certain periodic 
photonic crystals (PCs) (Luo et al. 2003). An 
incident propagating plane wave from vacuum 
appears to undergo negative refraction inside the 
PC, and a slab of the PC can even work as a 
Veselago lens. The negative refraction in this case is 
a result of the curvature of the equifrequency surface 
and is present in spite of the right-handed nature of 
the propagation. In these instances, an effective 
permittivity and permeability cannot be easily 
ascribed to the crystal as the long wavelength 
condition is not met. It is difficult to homogenize 
the PC in the sense of meta-materials, and the 
energy transport in these PCs is very sensitive to the 
periodicity and the structural arrangements. Thus, it 
would be an over-simplification to characterize these 
effects in PCs as merely due to an effective refractive
index.
Introduction


Thermohydraulics is based on the hypothesis of a
continuous medium. This hypothesis is easily satisfied
since, for instance, one-thousandth of 1 mm$^3$
of a perfect gas at normal temperature and pressure
conditions (300 K, 1 atm) contains about $2.5 \times 10^{13}$
molecules. Instantaneous balances are made inside a
control volume fixed in the system of axes and
crossed by the flows. The limit where this volume
vanishes leads to the local formulation of the laws
governing the flows. The flow is described by
velocity $\mathbf{v}(\mathbf{r}, t)$, pressure $p(\mathbf{r}, t)$, temperature $T(\mathbf{r}, t)$,
and other fields, $\mathbf{r}$ being the position vector of a
point $M$, and $t$ the time. The material derivative of
$q(\mathbf{r}, t)$ is

$$\frac{Dq}{Dt} = \left(\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla\right) q$$


Let $Q$ ($\mathbf{Q}$) be one of the scalar (vectorial) extensive
quantities whose balance participates in the flow
dynamics. It can be a quantity of matter, heat,
impulse, or something else. Let $\Delta Q$ be the amount of
$Q$ contained in the volume $\Delta V$ localized around $M$,
and $q(\mathbf{r}, t)$ its local representative defined by

$$\rho(\mathbf{r}, t)\,q(\mathbf{r}, t) = \lim_{\Delta V\to 0}\frac{\Delta Q}{\Delta V} = \frac{dQ}{dV} \qquad [1]$$

where $\rho$ is the density, similarly defined considering
the case where $Q$ is taken as the mass $m$:

$$\rho(\mathbf{r}, t) = \frac{dm}{dV} \qquad [2]$$


Table 1 gives examples of $q$ quantities.
The instantaneous local balance of $Q$ reads


$$\frac{\partial}{\partial t}(\rho q) + \nabla\cdot(\mathbf{j}_Q + \rho q\mathbf{v}) = S_Q \qquad [3]$$


where $S_Q$ stands for any possible local source of $Q$,
and $\mathbf{j}_Q$ is the $Q$ conduction flux density. Figure 1


Table 1 Some quantities $q$. $T$ is the absolute temperature, $C_p$
the specific heat at constant pressure, and $C$ the solute mass
fraction

Mass   Impulse   Kinetic energy   Heat    Mass fraction
1      v         v²/2             C_p T   0 ≤ C ≤ 1


Figure 1 Q flux density and Q flux. 


Table 2 Physical dimension of fluxes, flux densities, and
∇·(flux density) for some q quantities

Q                   q               Flux      Flux density   ∇·(flux density)
Volume              undefined       m³ s⁻¹    [velocity]     s⁻¹
Mass                1               kg s⁻¹    kg s⁻¹ m⁻²     kg s⁻¹ m⁻³
Energy, heat        [velocity]²     W         W m⁻²          W m⁻³
Electrical charge   Coulomb kg⁻¹    A         A m⁻²          A m⁻³
Impulse             [velocity]      [force]   [pressure]     [pressure] m⁻¹


illustrates how these quantities allow us to evaluate the
flux $d\phi_Q = \mathbf{j}_Q\cdot d\mathbf{S}$ of $Q$ that instantaneously crosses
a surface $d\mathbf{S}$. Table 2 gathers the physical dimensions
of these notions for various $Q$'s.

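For $\mathbf{Q}$, the flux densities are second-order tensors,
since $d\boldsymbol{\phi}_{\mathbf{Q}} = \bar{\bar{j}}_{\mathbf{Q}}\cdot d\mathbf{S}$ is vectorial (Figure 1). Its
balance reads

$$\frac{\partial}{\partial t}(\rho\mathbf{q}) + \nabla\cdot\left(\bar{\bar{j}}_{\mathbf{Q}} + \rho\,\mathbf{q}\otimes\mathbf{v}\right)^t = \mathbf{S}_{\mathbf{Q}} \qquad [4]$$

where $^t$ indicates the transposition and $\otimes$ a dyadic
product. $\mathbf{j}_C$ and $\bar{\bar{j}}_{\mathbf{v}}$ are given later.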

The governing equations of thermohydraulics are 
like [3] and [4]. They are completed by compatible 
initial and boundary conditions. The most general 
linear expression of the latter ones is of mixed type, 
for a scalar field, 


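$$\alpha q + \beta\,(\nabla q\cdot\hat{\mathbf{n}}) = \gamma \quad \text{on the boundary} \qquad [5]$$


$\alpha$, $\beta$, and $\gamma$ being prescribed data, and $\hat{\mathbf{n}}$ the outward
normal to the boundary. For a vectorial field, $\mathbf{q}$ and
$\boldsymbol{\gamma}$, respectively, replace $q$ and $\gamma$. The simplest cases
are Dirichlet and Neumann boundary conditions
with, respectively, $\beta = 0$ or $\alpha = 0$.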


Governing Equations 


We consider nonisothermal flows of fluids in thermo- 
dynamic conditions far from the critical point where 
acoustic effects are involved. The fluid is possibly a 
binary mixture, the simplest non-pure-fluid case where 
modeling does not raise conceptual difficulties. The 


local composition is described by the solute (say) mass 
fraction, 


$$C(M, t) = \lim_{\Delta V\to 0}\frac{\Delta m_{\rm solute}}{\Delta m} = \frac{\rho_{\rm solute}}{\rho}$$


with $0 \le C \le 1$. Only thermodiffusion is treated,
and the influence the solutal gradient has on the heat 
flux is not considered, being negligible in liquid 
mixtures. The coupling between the heat and species 
molecular transports then comes only in the solutal 
flux density relation 


$$\mathbf{j}_C = -\rho\kappa_C\,[\nabla C + C(1 - C)S_T\nabla T] \qquad [6]$$
with $\kappa_C > 0$, and $S_T(T, C)$, the solute Soret
coefficient, which is positive or negative. The
order of magnitude of the Soret coefficient in
molecular solutions does not exceed a few $10^{-2}$ K$^{-1}$,
while for colloidal solutions (ferrofluids) $|S_T|$ can
be in the range 0.03-0.5 K$^{-1}$. Even if small, the
induced mass fraction separation, $\Delta C \propto S_T\Delta T$,
generates a solutal buoyancy of significant dynamical
influence.


Equation of State for the Density 


One must first describe the sensitivity of the density,
$\rho(p, T, C)$, upon pressure, temperature, and mass
fraction in static conditions. The pressure and
temperature effective ranges, $\Delta p$ and $\Delta T$, are
assumed small enough compared to their respective
mean values, $p_0$ and $T_0$, for the local (at
$\rho_0 = \rho(p_0, T_0, C_0)$) tangent to $\rho(p, T, C)$ to be a
good approximation in most cases,


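$$\frac{\rho - \rho_0}{\rho_0} = \chi(p - p_0) - \alpha_T(T - T_0) - \alpha_C(C - C_0) \qquad [7]$$
where
$$\chi = \frac{1}{\rho_0}\left(\frac{\partial\rho}{\partial p}\right)_0$$
and
$$\alpha_T = -\frac{1}{\rho_0}\left(\frac{\partial\rho}{\partial T}\right)_0, \qquad \alpha_C = -\frac{1}{\rho_0}\left(\frac{\partial\rho}{\partial C}\right)_0$$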
are the compressibility, thermal, and solutal 


expansion positive coefficients, and $C_0$ is the solute
mean mass fraction. Thermodynamic properties of
some fluids are given in Table 3. Equation [7] is
valid if $\chi\Delta p$, $\alpha_T\Delta T$, and $\alpha_C|\Delta C|$ are $\ll 1$. Moreover,
in laboratory experiments and industrial
processes, one generally has $\Delta p/p_0 \ll \Delta T/T_0$. The
pressure term in [7] can thus be neglected in
thermohydraulics.
Table 3 Some values of density, thermal expansion and
compressibility coefficients, specific heat at constant pressure,
and sound speed at $p = 1$ atm and $T = 293$ K, in SI units


Fluid    ρ       α_T T    χ p           C_p    c


Air      1.205   1        1             1005   344
Helium   0.167   1        1             5227   1010
CO₂      1.841   1        1             832    269
Water    1000    0.0607   4.91 × 10⁻⁵   4182   1461


2.2 x 10° 2333 2044 
3.76 x 1075 1391 1409 


Glycerol 1250 
Mercury 13579 


Notice that water density exhibits a maximum
around 4 °C. A quadratic term in $T$ must then be


added to [7]. 


The Boussinesq Approximations 


The parameter $\alpha_T\Delta T \ll 1$ is the primary source of
thermohydraulics. Therefore, the $\mathbf{v}$, $p$, $T$, and $C$ fields
can be expanded in series of terms of increasing
power in $\alpha_T\Delta T$. The leading term of each series
contains an important part of the interesting
dynamics. The forthcoming equations are given in
the corresponding approximation framework. They
contain many simplifications, due to Boussinesq. For
instance, the conductivities and diffusivities are
taken as constant, as well as $C(1 - C)S_T$ in eqn [6].
The next approximation step, the low-Mach model, 
keeps the leading compressibility and expansion 
effects, while discarding the associated acoustic 
waves. This gives access to thermo-soluto-acoustic 
phenomena. Expansion oscillations are indeed able 
to trigger, and sustain, acoustic waves provided 
phase agreements are fulfilled. This second-order 
model is not presented here. 

The compliance with the criteria $\alpha_T\Delta T \ll 1$ and
$\alpha_C|\Delta C| \ll 1$ must be checked case by case. The
section “Steady parallel-flow model” briefly illustrates
this point with an example of thermally driven
flow. Furthermore, the $T$- and $C$-sensitivity of $S_T$ is
an experimental fact that requires a generic
approach of the problem. The $C$-sensitivity of the
physical properties is generally more pronounced,
nonmonotonic, for instance, over $C \in [0, 1]$, than
their T-sensitivity. 


Boussinesq Local Balances 


Mass. It reads $\partial\rho/\partial t + \nabla\cdot(\rho\mathbf{v}) = 0$, or equivalently
$(1/\rho)(D\rho/Dt) = -\nabla\cdot\mathbf{v}$. The fluid particle density
varies along its trajectory by compressibility and
thermo-solutal expansion. At the leading order in
$\alpha_T\Delta T$ and $\alpha_C|\Delta C|$, the latter is negligible, whereas
the former is associated with acoustic effects, also
negligible when the fluid velocity is much smaller
than the sound speed. The mass balance equation
then reduces to


$$\nabla\cdot\mathbf{v} = 0 \qquad [8]$$


Only transverse velocity waves (or shear waves) are
allowed by this equation, $\mathbf{v} \sim e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$ with $\mathbf{k}\cdot\mathbf{v} = 0$,
since acoustic contributions are discarded.


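Impulse. The impulse molecular flux density is
$$\bar{\bar{j}}_{\mathbf{v}} = p\,\bar{\bar{1}} - \mu_{\mathbf{v}}\left[\nabla\otimes\mathbf{v} + (\nabla\otimes\mathbf{v})^t\right]$$


where $\mu_{\mathbf{v}}$ is the impulse conductivity and $\bar{\bar{1}}$ the
Kronecker tensor. A Newtonian fluid is defined as
having $\mu_{\mathbf{v}}$ constant with respect to the rate-of-strain
tensor $\nabla\otimes\mathbf{v}$. The impulse balance then
reads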


In the source term pl, =g for gravity-driven 
buoyant flows. 

With the aforementioned approximations, the 
impulse balance becomes 


Dv legs, PP... A. 
T eV Par oU d 9 
Dt po po” d 
with 
E T uer - A tail) 

PO 

| bo 

po 


the impulse diffusivity, and the pressure P =p — po. p, 
Po,» Satisfying the hydrostatic relation 


Vpo, -5 pog 
In the rotating frame of vector Q(t), 


P — Po 
po 


BA GAF 205i rz 


must be subtracted from the right-hand side of [9] 
and po., redefined by 


Vio.» = po(& - à ^ (^ v) 


On a free surface, a particular velocity boundary 
condition is to be established. Let d$— dS 54 be a 


surface element located around M. The tangential 
component (£ - £2 — 0) of the impulse flux across d$, 


— 


t-df =2- ju dŠ = —ust- IVese(Ves . d$ 


must be continuous. Surface tension o(T, C) inho- 
mogeneities make the free surface a source of 
impulse which diffuses in the fluid core. A flow 
occurs even with I — 0. For the fluid located where 
dS points to, the velocity boundary condition on the 
free surface then reads 


= pe Ies (Veg) -A—(V-i) [10] 


with 
> a Oo l= . Oo : 
(V-t) =a VW 9T +50 t)C 


For most fluids, $\partial\sigma/\partial T < 0$. In the Boussinesq
framework $\partial\sigma/\partial T$ and $\partial\sigma/\partial C$ are constant. Equation
[10] couples the impulse balance with the heat
and composition ones.


Heat. Local thermodynamic equilibrium is
assumed. The molecular heat flux density is
$\mathbf{j}_{\rm heat} = -\mu_T\nabla T$, with $\mu_T$ the thermal conductivity.
The approximate heat balance reads

$$\frac{DT}{Dt} = \kappa_T\nabla^2 T + S_{\rm heat} \qquad [11]$$
where $\kappa_T = \mu_T/(\rho_0 C_p)$ is the heat diffusivity and
$S_{\rm heat}$ a possible local (Joule, radioactive, ...) heat
source. Thermohydraulics can simply be driven by
nonuniform thermal conditions imposed along the
fluid boundary, and in this article we henceforth
take $S_{\rm heat} = 0$.


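Mass fraction. Approximating [6] yields the mass
fraction balance,

$$\frac{DC}{Dt} = \kappa_C\left[\nabla^2 C + C_0(1 - C_0)S_T\nabla^2 T\right] \qquad [12]$$


where $\kappa_C$ and $S_T$ are evaluated at $T_0$ and $C_0$. The
normal flux condition


$$(\nabla C\cdot\hat{\mathbf{n}}) = -C_0(1 - C_0)S_T\,(\nabla T\cdot\hat{\mathbf{n}})$$


is imposed on impervious boundaries.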


The Hydrostatic State 


Knowing whether the fluid can be in static state 
with respect to its presupposed rigid container helps 
for a first understanding of thermohydraulic 
dynamics. This raises two problems: (1) the exis- 
tence of this state and (2) its stability, discussed 


later. Point (1) requires the fulfilment of three 
relations, 


Vp = p(p, T, CT [13] 
Tr = "Td T 
aC 2 44] 
En — &cV C+ Co(1 — Co)STV T 


The curl of [13] yields 
Vo(p, T, C) AT + p(p, T, CIV AT —0 


which has no reason to be generically satisfied since 
p(p,T,C) and D are totally uncorrelated. The 
hydrostatic state cannot exist if I does not derive 
from a scalar potential, as with 


E 


3 w 2». .— da. UR 
D-g-QA^(QAr)—-- AT if gT? 
The Earth’s rotation axis is known to precess with a 
period of about 26000 years. This generates a 
component of 26000 years timescale in the atmo- 
spheric, oceanic, and internal flows. 

Considering now that 


the existence of a hydrostatic state only depends on 
the simultaneous verification of [14] and 


Vo(p, T, C) AV — 0 [15] 


Iso- surfaces must therefore coincide with iso- 
pycnal, isobaric, iso- T, and iso-C surfaces since the 
p, T, and C sensitivities of p are uncorrelated. The 
compatibility of this condition with [14] is the key 
for concluding about the existence of the hydro- 
static state. Considering again our planet as an 
example (forgetting about precession), the iso- 
surfaces are almost ellipsoidal. Such T and C 
distributions cannot satisfy [14]. Thus, the atmo- 
spheric and oceanic dynamics, and thermohydrau- 
lics as well, are due to a nonvanishing thermal 
torque, VT A Vw. 

A free surface in hydrostatic state is isothermal 
and isocompositional, by eqn [10], whatever I. 


Dimensionless Local Balances 


In buoyancy-driven thermohydraulics, we consider 
four velocity scales — three of molecular origin, and 
the fourth is the free-fall velocity in the buoyancy, 
Table 4 Orders of magnitude of the Prandtl number for the
usual fluids. Air and water are in normal conditions


Liquid metals          Gases             Water   Oils
Several 10⁻³ to 10⁻²   ≈1, 0.7 for air   6.7     >10


$$V_1 = \frac{\kappa_T}{L}, \qquad V_2 = \frac{\kappa_C}{L}, \qquad V_3 = \frac{\nu}{L}, \qquad V_4 = \sqrt{\alpha_T\Delta T\,gL}$$


$L$ being a fluid container size scale. Thence come the
Rayleigh, Prandtl and Lewis numbers,


$$Ra = \frac{V_4^2}{V_1 V_3} = \alpha_T\Delta T\,\frac{gL^3}{\kappa_T\nu}, \qquad Pr = \frac{V_3}{V_1} = \frac{\nu}{\kappa_T}, \qquad Le = \frac{V_2}{V_1} = \frac{\kappa_C}{\kappa_T}$$

$Ra$ being the experimental control parameter, and
$Le \ll 1$. Table 4 gives $Pr$ orders of magnitude for
usual fluids. Let $V$ be the fluid velocity amplitude.
The importance of the thermal, solutal, and impulse
convections with respect to the corresponding
diffusions is, respectively, estimated by the thermal,
compositional Péclet and Reynolds numbers,


$$Pe_T = \frac{V}{V_1} = \frac{VL}{\kappa_T}, \qquad Pe_C = \frac{V}{V_2} = \frac{VL}{\kappa_C}, \qquad Re = \frac{V}{V_3} = \frac{VL}{\nu}$$

with

$$Pr = \frac{Pe_T}{Re}, \qquad Le = \frac{Pe_T}{Pe_C}, \qquad Ra = (Pe_T\,Re)_{V = V_4}$$
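
The relations between the four velocity scales and the dimensionless numbers can be checked with a short Python sketch. The function name `dimensionless_numbers` is hypothetical, and the water-like property values in the usage section are illustrative assumptions chosen for this example, not data from Table 3 or Table 4.

```python
import math


def dimensionless_numbers(alpha_T, delta_T, g, L, kappa_T, kappa_C, nu):
    """Velocity scales V1..V4 and the resulting Ra, Pr and Le (SI units)."""
    v1 = kappa_T / L                            # thermal diffusion velocity
    v2 = kappa_C / L                            # solutal diffusion velocity
    v3 = nu / L                                 # impulse diffusion velocity
    v4 = math.sqrt(alpha_T * delta_T * g * L)   # free-fall velocity
    return {
        "V": (v1, v2, v3, v4),
        "Ra": v4**2 / (v1 * v3),
        "Pr": v3 / v1,
        "Le": v2 / v1,
    }


if __name__ == "__main__":
    # Assumed, water-like values near room temperature (illustrative only).
    numbers = dimensionless_numbers(
        alpha_T=2.07e-4,   # K^-1
        delta_T=1.0,       # K
        g=9.81,            # m s^-2
        L=0.01,            # m
        kappa_T=1.43e-7,   # m^2 s^-1
        kappa_C=1.0e-9,    # m^2 s^-1
        nu=1.0e-6,         # m^2 s^-1
    )
    for key in ("Ra", "Pr", "Le"):
        print(key, numbers[key])
```

Because $Ra$ scales as $L^3$, doubling the container size at fixed $\Delta T$ multiplies the control parameter by eight, while $Pr$ and $Le$ depend only on the fluid.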


Capillary thermohydraulics introduces one velocity
scale and the Marangoni number,

$$V_5 = \frac{|\Delta\sigma|}{\rho_0\,\nu}, \qquad Ma = \frac{V_5}{V_1}$$

with $\Delta\sigma = (\partial\sigma/\partial T)\,\Delta T$ in pure fluid. A small
capillary number, $Ca = |\Delta\sigma|/\sigma$, indicates a weak
influence of the dynamics upon the free-surface
curvature.

Let $V_1$, $\Pi = \rho_0 V_1^2$, $\tau = L/V_1$, $\Delta T$ and

$$\Delta C = -C_0\,(1 - C_0)\,S_T\,\Delta T$$

be the velocity, pressure, time, temperature, and
mass fraction scales, with

$$\Theta = \frac{T - T_0}{\Delta T} \quad \text{and} \quad C = \frac{C - C_0}{\Delta C}$$

the reduced temperature and mass fraction, respectively.
The other quantities, coordinates included,
are similarly reduced and noted identically.
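Equation [8] does not change and [9], [11] and [12]
become, respectively,

$$\frac{Dv}{Dt} = -\nabla P + Pr\left[Ra\,(\Theta + \Psi_B C)\,e_z + \nabla^2 v\right] \qquad [16]$$

$$\frac{D\Theta}{Dt} = \nabla^2\Theta \qquad [17]$$

$$\frac{DC}{Dt} = Le\,\nabla^2 (C - \Theta) \qquad [18]$$

where

$$\Psi_B = \frac{\alpha_C\,\Delta C}{\alpha_T\,\Delta T}$$

is the buoyancy separation ratio and $e_z = -g/|g|$.
A $\Psi_B < 0$ ($> 0$) corresponds to opposite (cooperative)
thermal and solutal buoyancies. The reduced
mass fraction boundary condition on impervious
walls is

$$(\nabla C\cdot\hat{n}) = (\nabla\Theta\cdot\hat{n}) \qquad [19]$$

In rotating frame, scaling $\Omega(t)$ by $\Omega_0$, $\hat{\Omega}(t) = \Omega(t)/\Omega_0$,

$$Ra\,Fr\,(\Theta + \Psi_B C)\,\hat{\Omega} \wedge (\hat{\Omega} \wedge r)$$

must be added inside the square-bracket term of
[16]. The Froude and Ekman numbers appear as

$$Fr = \frac{\Omega_0^2 L}{g}, \qquad Ek = \frac{\nu}{\Omega_0 L^2}$$

The dimensionless capillarity stress condition [10]
reads

$$\hat{n}\cdot\left[\nabla v + (\nabla v)^T\right]\cdot\hat{t} = -Ma\left((\nabla\cdot\hat{t})\,\Theta + \Psi_C\,(\nabla\cdot\hat{t})\,C\right) \qquad [20]$$

with

$$\Psi_C = \frac{(\partial\sigma/\partial C)\,\Delta C}{(\partial\sigma/\partial T)\,\Delta T}$$

the capillarity separation ratio, and

$$Ma = \left|\frac{\partial\sigma}{\partial T}\right| \frac{\Delta T}{\rho_0\,\nu\,V_1}$$

These equations show that, in the Boussinesq
framework, the flow physics does not depend on
$\rho_0$, $T_0$, and $C_0$, except through the material properties
which enter the numbers.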


Linear Stability 


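Given a base state $S = (v, \Theta, C)$, a solution of [8],
[16]–[18], how does it behave in presence of an
infinitesimal disturbance $(\delta v, \delta\Theta, \delta C)$? Applying [8],
[16]–[18] to $(v + \delta v, \Theta + \delta\Theta, C + \delta C)$ and discarding
the quadratic terms in perturbation provide the
disturbance temporal evolution,

$$\nabla\cdot(\delta v) = 0 \qquad [21]$$

$$\frac{\partial}{\partial t}\begin{pmatrix}\delta v\\ \delta\Theta\\ \delta C\end{pmatrix} = F - (\delta v\cdot\nabla)\begin{pmatrix} v\\ \Theta\\ C\end{pmatrix} + A\begin{pmatrix}\delta v\\ \delta\Theta\\ \delta C\end{pmatrix} \qquad [22]$$

where $F = (-\nabla(\delta P), 0, 0)^T$, and

$$A = \begin{pmatrix} B_{Pr} & Ra\,Pr\,e_z & Ra\,Pr\,\Psi_B\,e_z\\ 0 & B_1 & 0\\ 0 & -Le\,\nabla^2 & B_{Le}\end{pmatrix} \qquad [23]$$

with $B_a = -(v\cdot\nabla) + a\nabla^2$. The perturbations $(\delta v, \delta\Theta, \delta C)$
have the $(v, \Theta, C)$ boundary conditions, but
homogeneous. On a free surface, the perturbation
capillary stress condition is

$$\hat{n}\cdot\left[\nabla\delta v + (\nabla\delta v)^T\right]\cdot\hat{t} = -Ma\left((\nabla\cdot\hat{t})\,\delta\Theta + \Psi_C\,(\nabla\cdot\hat{t})\,\delta C\right) \qquad [24]$$

Recasting [21]–[23] provides

$$\frac{\partial}{\partial t}\begin{pmatrix}\delta v\\ \delta\Theta\\ \delta C\end{pmatrix} = \mathcal{L}(S)\begin{pmatrix}\delta v\\ \delta\Theta\\ \delta C\end{pmatrix} \qquad [25]$$

whose solution is

$$\begin{pmatrix}\delta v(t)\\ \delta\Theta(t)\\ \delta C(t)\end{pmatrix} = e^{\mathcal{L}(S)\,t}\begin{pmatrix}\delta v(t = 0)\\ \delta\Theta(t = 0)\\ \delta C(t = 0)\end{pmatrix} \qquad [26]$$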


Direct System 


$\mathcal{L}(S)$ is made of $\nabla$ acting on the initial perturbation.
Conclusions about $S$ stability depend on the sign of
$\lambda_{max}$, the real part of the leading eigenvalue of $\mathcal{L}$
found with all the possible perturbations. There is
stability if $\lambda_{max} < 0$. At $\lambda_{max} = 0$, the marginal
stability, the bifurcation threshold is located at
$Ra(Pr, Le, \Psi_B, \Psi_C, X) = Ra_c$, $Ra_c$ being the critical
value of the control parameter, $X$ containing all
the other parameters of the problem (container
aspect ratios, etc.). The nonlinear-stability analysis
in the vicinity of $Ra_c$ supplies $\varepsilon$ in $\lambda_{max} \propto (Ra - Ra_c)^{\varepsilon}$,
which is characteristic of the
bifurcation.
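The stability criterion above (stability when the leading eigenvalue has a negative real part) can be illustrated on a toy problem. The sketch below is not the operator of [25]: it uses a hypothetical 2×2 truncation whose growth rate varies linearly with $Ra$ around an arbitrary threshold, only to show how the sign test is applied.

```python
import cmath

def leading_growth_rate(a, b, c, d):
    """Largest real part among the eigenvalues of the matrix [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    eigenvalues = ((trace + disc) / 2.0, (trace - disc) / 2.0)
    return max(e.real for e in eigenvalues)

def toy_operator(Ra, Ra_c=2000.0):
    """Hypothetical two-mode operator: growth (Ra - Ra_c)/Ra_c, unit angular frequency."""
    mu = (Ra - Ra_c) / Ra_c
    return (mu, -1.0, 1.0, mu)   # eigenvalues are mu + i and mu - i

for Ra in (1500.0, 2000.0, 2500.0):
    rate = leading_growth_rate(*toy_operator(Ra))
    print(Ra, "stable" if rate < 0 else "marginal or unstable", rate)
```

Because the toy eigenvalues are complex, the marginal state here oscillates, which is the situation the article later calls a Hopf bifurcation.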


Figure 2 Leading axisymmetric thermal adjoint eigenvector 
(Courtesy of O Bouizi and C Delcarte). 


Adjoint System 


The leading left eigenmode complex conjugate 
supplies the response field of the base state to the 
most destabilizing punctual disturbances. 

The S state and £ eigenspace analytical determi- 
nations are often impossible. One must resort to 
specifically designed numerical tools. A numerical 
adjoint eigenvector is presented in Figure 2 for a 
(Ma — 106, Pr — 10?) side-heated cylindrical liquid 
bridge, with a free surface on the right and the axis 
on the left. 


Nonlinear Stability 


When $\lambda_{max} > 0$, the associated disturbance grows
exponentially with time, until nonlinearities become
essential. The flow progressively evolves from $S$
towards a new state, $S'$, which is a solution of [8],
[16]–[18]. How can one proceed analytically to
know how the nonlinearities control the bifurcation?
A large number of $S \to S'$ bifurcations exist,
with $S$ and $S'$ either both steady, or both unsteady but with
different flow structures, or with one steady and the
other not. Bifurcations can also be reversible or
hysteretic, with respect to $Ra$. The symmetries of $S$
play an important role and non-Boussinesq effects
change the thresholds and the nature of bifurcation.

Landau’s works have opened up the way to the 
theory of nonlinear hydrodynamic stability. The 
ruling equations are reduced, using an appropriate 
expansion method, to a set of ordinary differential 
equations describing the temporal evolution of
amplitudes, $A_i$, $i = 1, 2, \ldots, I$, characterizing the
perturbation eigenmodes,

$$\frac{dA_i}{dt} = \lambda_i A_i + N_i(A_j) \quad \text{for } i, j = 1, 2, \ldots, I \qquad [27]$$

where $N_i$ accounts for the nonlinear action of the $I$
modes on $A_i$, and the $\lambda_i$'s are the temporal growth
rates coming from the linear theory. The stability of
the steady solutions, $dA_i/dt = 0$, is determined by
local analysis. With one destabilizing mode, the
simplest model is $dA/dt = \lambda A - \alpha A|A|$, with $\alpha > 0$,
constant, specific of the bifurcation. Symmetry considerations
(some of them directly originate from the
Boussinesq framework) may impose $\alpha = 0$, whereby
the simplest model becomes $dA/dt = \lambda A + \beta A^3$, with
$\beta$ another constant.

When the flow is weakly confined in one or two
space directions, boundary effects can play a subtle
dynamical role, allowing, for instance, the existence
of multiple solutions, each one made of many
interacting modes. A large variety of flow regimes
is then observed, as steady/traveling, extended/
localized wave packets, particularly in binary mixtures.
Spacetime models, close to [27], such as the
Ginzburg-Landau equation,

$$\frac{\partial A}{\partial t} = \lambda A + \xi^2\,\frac{\partial^2 A}{\partial x^2} + \beta\,|A|^2 A$$

are derived for describing the dynamics of the wave
packet envelope (of complex amplitude $A$).
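The single-mode cubic model mentioned in the Landau paragraph, $dA/dt = \lambda A + \beta A^3$, can be integrated numerically to see the exponential growth followed by nonlinear saturation. This is a minimal sketch assuming a real amplitude and $\beta < 0$ (the saturating, supercritical case); the function names, step size and parameter values are illustrative choices.

```python
import math

def integrate_landau(lam, beta, A0, dt=1e-3, steps=20000):
    """Integrate dA/dt = lam*A + beta*A**3 with a classical fourth-order Runge-Kutta scheme."""
    def rhs(A):
        return lam * A + beta * A**3

    A = A0
    for _ in range(steps):
        k1 = rhs(A)
        k2 = rhs(A + 0.5 * dt * k1)
        k3 = rhs(A + 0.5 * dt * k2)
        k4 = rhs(A + dt * k3)
        A += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return A

def saturated_amplitude(lam, beta):
    """Nonzero steady root of lam*A + beta*A**3 = 0, valid when lam > 0 > beta."""
    return math.sqrt(-lam / beta)

grown = integrate_landau(lam=1.0, beta=-1.0, A0=1e-3)     # unstable rest state
decayed = integrate_landau(lam=-1.0, beta=-1.0, A0=1e-3)  # stable rest state
```

With $\lambda > 0$ the small initial amplitude grows and settles on $\sqrt{-\lambda/\beta}$; with $\lambda < 0$ it decays back to the rest state.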


Hydrostatic State Stability 


The static-state stability is analytically tractable in
unbounded volume. Transverse wave (by [21])
solutions are the potentially destabilizing perturbations,
with wave vector $k$ and complex frequency $\omega$.
The system [22]–[23] gets simplified, and $\mathcal{L}$ becomes
algebraic upon substituting $(ik, i\omega)$ for $(\nabla, \partial/\partial t)$.
Intuitively, the quiescent state loses its stability when
$\nabla\rho(p, T, C)\cdot\nabla w$ exceeds a threshold value (positive,
by the dissipative effects). This analysis supplies it,
together with the data of the oscillatory motions
emerging at onset from the rest-state instability.

In reality, the fluid is confined in three dimensions,
possibly with free surfaces, and wave solutions
are no longer usable. The first approach
consists in defining a simplified model confined in
one dimension. The perturbations must satisfy
homogeneous boundary conditions, and/or [24],
and they are waves in both other space directions.
The resulting problem may be analytically tractable.
The stability of many quiescent-state configurations
was studied, for fluid layers of infinite or very large
extension, of pure fluid or mixtures, with or without free
surface. Nonetheless, many other configurations are
not yet analyzed. Two- and three-dimensional cases
must be numerically treated.


Gravitational Buoyancy Convection 


Among the countless thermal situations to analyze,
research mainly favored the case where the
fluid is confined in simple geometries and submitted
to two distinct heating directions, $\nabla T$ being either
aligned or normal to $g$, that is vertical or horizontal
in the gravity field. Each case leads to specific
thermohydraulics. The rest-state stability is the first
analysis step of the former case, the first to be
experimentally studied by Bénard in 1900, with a
horizontal liquid layer. The latter is of more recent
interest, with Batchelor's theoretical work on the
parallel convective regimes of a pure fluid confined in a
tall slot. Since then, a large amount of work has
been published on those cases, tackling various 
confinement geometries, and involving high Ra 
values. This problem became the paradigm of the 
rich spatiotemporal behaviors arising in nonlinear 
systems driven away from equilibrium. In binary 
mixtures the complexity of the dynamics increases 
considerably. The literature is so far practically 
devoid of any three-dimensional results in mixtures. 
Ternary mixtures have so far been only scarcely 
considered. 


Steady Parallel-Flow Model 


This analytical approach comes from an interesting
remark of Batchelor, made about the vorticity but
here applied to the velocity of a confined flow. “A
number of flow fields are characterized by values of
the magnitude of the” velocity “in the neighborhood
of a certain line in the fluid which are much larger
than those elsewhere,” and (by $\nabla\cdot v = 0$) “this line
of necessity” is parallel to $v$ and to the container
walls.

Buoyant forces may contradict this assertion,
particularly in Rayleigh-Bénard configuration with
imposed temperatures. There, no parallel solution
exists. Nevertheless, steady parallel flows do exist in
containers. The thermally active walls (whichever
they are, the largest or the smallest) are either
maintained at constant temperatures, or subjected
to a constant heat flux. Figure 3 sketches a cross
section (hereafter referred to as the vertical midplane)
of such a configuration, with active (uniform
heating $q$) vertical walls. The other sides are
adiabatic. No rest state is allowed here. Although
intrinsically three dimensional, the steady regime in
this cavity can be fairly well approximated as
two dimensional (in the vertical midplane), and
moreover mainly parallel to the active walls, in an
$Ra$ range which increases with the aspect ratio, $H/L$.
The influence of the horizontal sides is of limited
range compared to the flow extension, $H$. The
parallel flow is then the one-dimensional approximation
of what occurs in the major part of the
cavity. This configuration is taken with a binary
mixture for illustrating an approach applicable with
minor variations in other situations.

Figure 3 Sketch of the cross section of a slender vertical
container ($L \ll H$).

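The problem becomes linear. Indeed, $v = w(x)\,e_z$
by $\nabla\cdot v = 0$. Taking $\Delta T = qL/pr$ as temperature
scale, [16]–[18] imply

$$\Theta(x, z) = G_T z + \Theta(x), \qquad C(x, z) = G_C z + C(x)$$

with $G_T$, $G_C$ as constants. The impulse balance is

$$\frac{d^2 w}{dx^2} = -Ra\left(\Theta(x) + \Psi_B C(x)\right) \qquad [28]$$

and the ruling equations

$$\frac{d^4 w}{dx^4} = -Ra\left[G_T + \frac{\Psi_B}{Le}\,(Le\,G_T + G_C)\right] w, \qquad w\,G_T = \frac{d^2\Theta}{dx^2}, \qquad w\,(Le\,G_T + G_C) = Le\,\frac{d^2 C}{dx^2} \qquad [29]$$

An internal length scale is predicted, of thickness

$$\left[Ra\left(G_T + \frac{\Psi_B}{Le}\,(Le\,G_T + G_C)\right)\right]^{-1/4}$$

By [28] and [19], the thermal flux condition yields

$$\frac{d^3 w}{dx^3} = -Ra\,(1 + \Psi_B) \quad \text{at } x = \pm 1/2$$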
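The step from the impulse balance [28] to the fourth-order equation in [29] can be spelled out. The derivation below is a sketch under two stated assumptions: the flow is steady and parallel, $v = w(x)\,e_z$, and [18] is read as $DC/Dt = Le\,\nabla^2(C - \Theta)$, which is the form suggested by the impervious-wall condition [19].

```latex
% Assumptions: steady parallel flow v = w(x) e_z, fields linear in z as in the text,
% and [18] taken in the form DC/Dt = Le \nabla^2 (C - \Theta).
\begin{align*}
\text{[17]:}\quad & w\,\partial_z\Theta = \nabla^2\Theta
  \;\Rightarrow\; w\,G_T = \frac{d^2\Theta}{dx^2},\\
\text{[18]:}\quad & w\,G_C = Le\left(\frac{d^2 C}{dx^2} - \frac{d^2\Theta}{dx^2}\right)
  \;\Rightarrow\; w\,(Le\,G_T + G_C) = Le\,\frac{d^2 C}{dx^2},\\
\text{[28]:}\quad & \frac{d^4 w}{dx^4}
  = -Ra\left(\frac{d^2\Theta}{dx^2} + \Psi_B\,\frac{d^2 C}{dx^2}\right)
  = -Ra\left[G_T + \frac{\Psi_B}{Le}\,(Le\,G_T + G_C)\right] w .
\end{align*}
% Writing k^4 = Ra [G_T + (\Psi_B/Le)(Le G_T + G_C)], the equation w'''' = -k^4 w
% shows that w varies on the length k^{-1}, the internal length scale.
```

For $G_T = G_C = 1$ the bracket reduces to $1 + \Psi_B(1 + Le^{-1})$, the combination that enters the critical value $r_c$ of the Rayleigh-Bénard-Soret illustration.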


A last operation determines $G_T$ and $G_C$.
The overall heat and mass fraction balances are
performed in the cavity part ($V$), which is bounded
by a horizontal plane located within the parallel-flow
region. Since the walls are impervious, the
solute is transported only across the lower boundary 
of (V), through which the net vertical convective 
supply must be balanced, in steady regime, by 
vertical diffusion. The heat balance works similarly, 
since the walls are adiabatic or submitted to equal 
fluxes. Whence the relations, 


1/2 . 
/ w(x)C(x) dx = Gr + Gc 
J-aj 


The steady parallel flow is determined. Its stability 
can be analyzed as indicated in the section “Linear 
stability.” 

Some caution must be taken for the Boussinesq
approximations to be valid here, with the temperature
and mass fraction increasing constantly (by
$G_T$, $G_C$) along the direction of largest cavity extension.
These gradients are at the origin of the
separation power of the “thermogravitational column,” a
device designed for isotope separation. Extremely
long columns can provide almost complete
separations, with $\alpha_C|\Delta C|$ no longer $\ll 1$, and then
the non-Boussinesq effects occur.

As an illustration of the aforementioned notions, let
us consider the ($Pr = 1$, $Le = 0.1$) Rayleigh-Bénard-Soret
(RBS) problem where horizontal solid plates of
infinite extension are uniformly heated from above
($Ra < 0$) or below ($Ra > 0$). This configuration is
simply obtained by rotating the cavity in Figure 3 by
$\pm\pi/2$ with respect to $g$ and to $(e_x, e_z)$. The steady
parallel-flow model can lead to the right-hand side
of an equation like [27] governing the time evolution
of $A$, the parallel-flow amplitude,


dA 23 i 1-r\ ,5 
Ale A Zu i2 )^ 
taf -z) [30] 
Fc 
where 
315 Ra y 
ME g’ r= 320 re = [1 + Yg(1 ++ Le !)| 


Here $r_c$ is the critical value of $r$ where the rest state
loses its stability towards a steady parallel flow. The
roots of $dA/dt = 0$ are $A = A_0 = 0$, $A = A_1(r, Le, \Psi_B)$,
for the quiescent, convective states.

Figure 4 Bifurcation diagram of $A_0(r)$ and $A_1(r)$ for various
separation ratios $\Psi_B(Le)$.

Figure 4 shows $A_0 = 0$ and the curves $A_1(r)$ for several
$\Psi_B(Le)$, $\Psi_B = -(1 + Le^{-1})^{-1}$ being the $r_c$ pole. The
solid (dotted) parts correspond to the stable
(unstable) steady states, emerging from direct (backward)
pitchfork bifurcations of the rest state at $r_c$.
Saddle-node bifurcations from unstable to stable
steady states are also predicted, on the dashed curve
of the equation


QUEEN. r — (1 + Le) 


Fully Nonlinear Problem 


Numerical tools are required for solving the system 
[8], [16]-[18] and analyzing the stability of the 
flows obtained. 


The RBS Case Let us illustrate how the rest-state
loss of stability occurs in the two-dimensional RBS
case, with a ($Pr = 1$, $Le = 0.1$, $\Psi_B = -0.2$) mixture.
The flow lies in the meridian plane of an axisymmetric
container with the radius/height ratio equal
to 2. No-slip conditions are imposed on impervious
walls; the temperature on the bottom plate is higher
than on top, and the peripheral wall is adiabatic. At
$t = 0$, the quiescent state is given a small random
perturbation. The system evolves (Figure 5) towards
a stable periodic solution via a transient regime of
exponentially amplified amplitude (eqn [26]). One
speaks of a Hopf bifurcation for a steady (here
quiescent) state destabilization by oscillatory
disturbances.

The “instantaneous” frequency (from the time
running between two successive identical passes of
the signal) evolves with time (Figure 6) from its
threshold value to its nonlinearly saturated one.
Accurate determination of the thresholds and identification
of the associated bifurcation is possible by
fitting the exponent $\varepsilon$ of $\lambda_{max}(Ra)$ from the
exponential growth of Figure 5, in the $Ra_c$ vicinity.
Figure 7 shows (solid dots) $\lambda(Ra)$ measurements,
and the solid line (in Figure 8 also) is the linear law
given by the two points closest to the vanishing
growth rate. The local law announced in the
subsection “Direct system” is confirmed, with
an exponent $\varepsilon = 1$ for the Hopf bifurcation, and
$\varepsilon = 1/2$ for saddle-node (Figure 8) and pitchfork
bifurcations.

Figure 5 Time evolution of a radial velocity nodal value for Ra = 2600. Reproduced from Millour, Labrosse, and Tric (2003) Physics
of Fluids 15(10): 2791–2802, with permission from American Institute of Physics.

Figure 6 Instantaneous angular frequency $\omega$, corresponding to Figure 5. Reproduced from Millour, Labrosse, and Tric (2003)
Physics of Fluids 15(10): 2791–2802, with permission from American Institute of Physics.
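The threshold estimate described above, a straight line through the two measurements closest to vanishing growth rate, amounts to a one-line extrapolation. The sketch below assumes growth rates that vary linearly with $Ra$ near threshold ($\varepsilon = 1$); for the $\varepsilon = 1/2$ case the same function is applied to squared growth rates. The data points are synthetic and purely illustrative.

```python
def critical_rayleigh(point_a, point_b):
    """Extrapolate to zero growth rate along the line through two (Ra, rate) points."""
    (ra_a, rate_a), (ra_b, rate_b) = point_a, point_b
    slope = (rate_b - rate_a) / (ra_b - ra_a)
    return ra_a - rate_a / slope

# Synthetic data obeying rate = 0.002 * (Ra - 2500): exponent 1, Hopf-like law.
hopf_threshold = critical_rayleigh((2550.0, 0.1), (2600.0, 0.2))

# Synthetic data obeying rate**2 = 0.004 * (Ra - 1800): exponent 1/2, saddle-node-like law.
rates = {1810.0: 0.2, 1820.0: 0.08 ** 0.5}
squared = [(ra, rate ** 2) for ra, rate in sorted(rates.items())]
saddle_node_threshold = critical_rayleigh(squared[0], squared[1])
```

Plotting the squared growth rate against $Ra$, as Figure 8 does, is what turns the square-root law into a straight line that can be extrapolated in this way.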


The Thermally Driven Cubic Cavity All flows are
obviously three dimensional. When do they possess
a two-dimensional approximation? How can it be
qualified? Clearly, the flow that develops in the container of
Figure 3 might enjoy (in a given parameter domain,
$D$) the mirror-reflection symmetry property about
the vertical midplane. Is there a two-dimensional
approximation of the flow in this midplane? Is it
able to give a correct estimate of the two-dimensional
flow stability within $D$, and to predict the $D$
frontiers, where the mirror-reflection symmetry
property ceases to be valid? Only partial answers
are available so far, coming from the thermally
driven cubic cavity (Figure 9).

Figure 7 Temporal growth rate, $\lambda$, of infinitesimal perturbations,
in the vicinity of the Hopf bifurcation of the quiescent state.
Reproduced from Millour, Labrosse, and Tric (2003) Physics of
Fluids 15(10): 2791–2802, with permission from American
Institute of Physics.

Figure 8 Squared temporal growth rate, $\lambda^2$, of transient
relaxation towards the stationary state close to the saddle-node
bifurcation. Reproduced from Millour, Labrosse, and Tric
(2003) Physics of Fluids 15(10): 2791–2802, with permission
from American Institute of Physics.

Filled with a pure fluid, its left and right vertical
plates have fixed temperatures, $T_0$ ($\Theta = 0$ at $x = 0$)
and $T_0 + \Delta T$ ($\Theta = 1$ at $x = 1$), while the others are
adiabatic. Any $\Delta T \neq 0$ generates a flow, possibly
mirror-symmetric about the vertical (hatched)
midplane, and also centrosymmetric about $e_y$. The
two-dimensional approximation was extensively analyzed,
numerically, with air as a fluid. A steady flow
is obtained for $Ra < Ra_{2D,c} = (1.82 \pm 0.01) \times 10^8$,
where an oscillatory regime appears. The numerical
three-dimensional flow is steady until $Ra_{3D,c} = 3.2 \times 10^7$,
where it hysteretically bifurcates towards
an oscillatory regime breaking the mirror symmetry
about the midplane. Let us assess the validity of the
two-dimensional approximate solutions. We define
dimensionless heat fluxes (Nusselt numbers) which
penetrate in one of the active walls,

$$Nu(y) = \int_0^1 \left.\frac{\partial\Theta_{3D}}{\partial x}\right|_{x=0} dz$$

Three fluxes are interesting to compare: (1) in the
midplane, $Nu_{mp} = Nu(y = 1/2)$, (2) globally, $Nu_{3D,W} = \int_0^1 Nu(y)\,dy$,
and (3) the two-dimensional
approximation

$$Nu_{2D,W} = \int_0^1 \left.\frac{\partial\Theta_{2D}}{\partial x}\right|_{x=0} dz$$

Figure 10 shows how they compare,
as a function of $Ra$. Quantitatively, the two-dimensional
approximation is not too bad, but qualitatively
it is not satisfactory, the discrepancies evolving
nonmonotonically. These latter become quite negligible
when the three-dimensional flow gets unsteady and
paradoxically loses the symmetry property on
which its two-dimensional approximation is
founded.

Figure 9 Sketch of the thermally driven cubic cavity.

Figure 10 Relative 2D-3D Nusselt numbers. Reproduced with
permission from Tric E, Labrosse G, and Betrouni M (2000) A
first incursion into the 3D structure of natural convection of air in
a differentially heated cubic cavity from accurate numerical
solutions. International Journal of Heat and Mass Transfer 43:
4043-4056. © Elsevier Ltd.
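The midplane and wall-averaged Nusselt numbers compared above are plain one- and two-dimensional quadratures of the wall temperature gradient. The following is a minimal sketch using the composite trapezoidal rule on a uniform grid; the grid layout, the function names and the synthetic flux field are assumptions made for illustration, not results from the cited computations.

```python
def trapezoid(samples, h):
    """Composite trapezoidal rule for uniformly spaced samples with spacing h."""
    return h * (0.5 * samples[0] + sum(samples[1:-1]) + 0.5 * samples[-1])

def nusselt_numbers(flux, n):
    """flux[j][k] holds dTheta/dx at x = 0, y = j/n, z = k/n (n even).

    Returns (Nu_mp, Nu_3D_W): the midplane value Nu(y = 1/2) and the wall average.
    """
    h = 1.0 / n
    nu_of_y = [trapezoid(row, h) for row in flux]   # integrate over z for each y
    return nu_of_y[n // 2], trapezoid(nu_of_y, h)   # then sample or integrate over y

n = 8
# Synthetic, purely illustrative wall flux (not a computed flow field).
flux = [[1.0 + 2.0 * (k / n) + 4.0 * (j / n) * (1.0 - j / n) for k in range(n + 1)]
        for j in range(n + 1)]
nu_mp, nu_3d = nusselt_numbers(flux, n)
relative_gap_percent = 100.0 * (nu_mp - nu_3d) / nu_3d
```

The last line forms the kind of relative difference, in percent, that Figure 10 reports between the different Nusselt numbers.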


Thermocapillary Convection 


Two immiscible liquids, or a liquid and a gas, are 
separated by a free surface, a region of small 
thickness (some ten molecular sizes). From a 
macroscopic viewpoint, it is considered as a singular 
entity. Its location and geometry are part of the 
solutions of the governing equations, themselves 
supposed to satisfy [20] on the free surface. As a
first iteration, the free-surface shape can be imposed:
fixed, and often flat.

Numerous industrial processes involve thermoca- 
pillarity wherein thermohydraulics involves complex 
phenomena, such as phase-change kinetics. A rele- 
vant modeling of these situations is a research 
subject by itself. For thermohydraulics, some aca- 
demic configurations (Figure 11) have retained the 
attention of the scientific community. 

Any thermohydraulic flow transfers heat
between hot and cold solid boundaries wherein
heat penetrates by conduction. Consequently, the
term $(\nabla\cdot\hat{t})\,\Theta$ of [20] never cancels at the solid
boundary/free surface junction, as in Figure 12.

Figure 11 Open boat ((a) straight and (b) circular) and liquid
bridge (c) configurations.

Figure 12 Thermocapillary origin of vorticity singularity (cold
wall configuration).

A nonzero vorticity is thus generated by thermocapillarity
on the free surface up to the wall, while
flow adherence on the wall gives vorticity values of
opposite sign. The problem therefore presents a
vorticity singularity at the triple point. This is a deep
physical and modeling problem.


See also: Bifurcations in Fluid Dynamics; Capillary 
Surfaces; Compressible Flows: Mathematical Theory; 
Dynamical Systems and Thermodynamics; Dynamical 
Systems in Mathematical Physics: An Illustration from 
Water Waves; Fluid Mechanics: Numerical Methods; 
Magnetohydrodynamics; Non-Newtonian Fluids; Partial 
Differential Equations: Some Examples; Stability of 
Flows; Vortex Dynamics. 
Introduction


The general theory of relativity (GRT) unifies special 
relativity theory (SRT) and Newton's theory of 
gravitation (NGT). SRT and NGT describe success- 
fully large domains of physical phenomena; there- 
fore, one would like to understand how they survive 
as approximations in GRT. 

In GRT, spacetime is idealized as a four-dimen- 
sional Lorentz manifold whose curvature is related 
to the distribution of energy and momentum. In 
such a spacetime, the existence of the exponential 
map implies that the metric near any event (space- 
time point) x deviates from a flat metric only by 
terms given by the curvature there. Thus, if the 
gravitational tidal field, represented by the curvature 
tensor, is small near x, one may approximate the GR 
metric there by a flat Minkowski metric. This 
explains that SRT is a general local approximation 
to GRT. Apart from a remark at the end of the
subsection “Local laws”, the relation GRT → SRT
will not be discussed further.

In its traditional formulation, Newton's theory 
differs drastically from Einstein’s theory both in its 
spacetime structure and in its description of gravita- 
tion. The main purpose of this report is to show 
how NGT can nevertheless be understood as a kind 
of “limit” of GRT. More precisely, the structure of
NGT can be viewed as a degenerate version of that 
of GRT, in parallel to the fact that the Galilei group 
can be obtained by contracting the Lorentz group. 

In the next section we state the laws of GRT. 
We then reformulate these laws with slightly 
different field variables such that, besides the 
gravitational constant $k$, the speed of light appears
via $\lambda = c^{-2}$. The resulting laws remain meaningful
if $\lambda$ and/or $k$ are replaced by zero. They turn out
to give a common basis for GRT, SRT, and 
NGT. The possibility of such a framework was 
indicated independently by Cartan (1923, 1924) and 
Friedrichs (1927) and extended by several authors; 
the complete formulation reviewed here was given 
by Ehlers (1981). 

The section “Newton's theory in spacetime form”
shows that the laws of NGT and SRT are obtained,
with some additional restrictions, from the rescaled
laws of GRT by putting, respectively, $\lambda = 0$ or $k = 0$.
It is emphasized that Newton's theory proper is a
theory only of isolated systems. Its intrinsic, four-dimensional
formulation explains how the distinction
between a vectorial gravitational field and
inertial forces, as well as the existence of inertial
frames, emerge as consequences of asymptotic
flatness. These structures are lost in the so-called
“Newtonian” cosmology whose dynamics is due to
symmetry assumptions, whereas GR cosmology is a
proper part of GRT.

The penultimate section is concerned with rela- 
tions between solutions of GRT and NGT, and in 
the final section some results related to solutions are 
reported. They illustrate that the limit relation
GRT → NGT may sometimes be inverted to get
exact or approximate GR results from NGT.
Approximations are related to uniform convergence
in $\lambda$, as is indicated at the end of the final section.

The limit relations described here may be con- 
sidered as a model for other theory relations in 
physics such as quantization or dequantization. 

Notation Indices will be considered in general as 
“abstract” ones, characterizing the kind of objects 
independent of coordinate systems. Greek indices 
refer to spacetime, Latin ones to 3-space. Fields on 
spacetime will generally be taken to be smooth. 


Basic Concepts and Laws of GRT 


According to GRT, spacetime is a four-dimensional
manifold $M$ endowed with a Lorentzian metric $g_{\alpha\beta}$,
here taken to have signature (+ + + −). Any kind
of matter including nongravitational fields is supposed
to determine an energy tensor $T^{\alpha\beta}$. Metric
and matter are interrelated by Einstein's gravitational
field equation

$$R_{\alpha\beta} = \frac{8\pi k}{c^4}\left(T_{\alpha\beta} - \frac{1}{2}\,g_{\alpha\beta}\,T\right) \qquad [1]$$

In this equation, $T := T^{\alpha}{}_{\alpha}$ denotes the trace of the
energy tensor, $k$ and $c$ stand for Newton's constant
of gravity and the speed of light, respectively, and
the Ricci tensor $R_{\alpha\beta}$ is obtained from Riemann's
curvature tensor by contraction

$$R_{\alpha\beta} = R^{\gamma}{}_{\alpha\gamma\beta}$$

The curvature tensor is constructed from the
symmetric, linear connection $\Gamma^{\alpha}_{\beta\gamma}$ determined by
the metric.

Equation [1] implies the vanishing of the covariant
divergence of the energy tensor

$$T^{\alpha\beta}{}_{;\beta} = 0 \qquad [2]$$

the GRT analog of the laws of local conservation of
energy and momentum.

The energy tensor depends on the kind of matter
to be taken into account. In this article, only
vacuum fields ($T^{\alpha\beta} = 0$) and perfect fluids will be
considered. For such a fluid,

$$T^{\alpha\beta} = (\rho + c^{-2} p)\,U^{\alpha} U^{\beta} + p\,g^{\alpha\beta} \qquad [3a]$$

$\rho$ and $p$ denote the mass density and the pressure,
respectively, and the 4-velocity $U^{\alpha}$ is a timelike
vector obeying

$$g_{\alpha\beta}\,U^{\alpha} U^{\beta} = -c^2 \qquad [3b]$$

If thermodynamical relations are added to specify
the kind of fluid (the simplest cases are barotropic
equations $p = f(\rho)$), then eqns [1]–[3] admit a
well-posed initial value problem for the fields
$g_{\alpha\beta}$, $U^{\alpha}$, $\rho$.

Different matter models which could be treated in 
the context of this report are elastic bodies and ideal 
gases, but not point particles. Point particles fit into 
GRT even less than into electrodynamics. 


The Cartan-Friedrichs Formalism 


To obtain a spacetime formulation of NGT and a
limit relation GRT → NGT, we recall that the
metric structure of Newton's spacetime consists of a
scalar $t$, absolute time, which foliates $M$ into
instantaneous 3-spaces $S_t$, and Euclidean metrics
$\gamma_{ab}(t)$ on these spaces. If the inverses $\gamma^{ab}(t)$ are
pushed forward onto $M$ via the embeddings $S_t \to M$,
a field $s^{\alpha\beta}$ on $M$ results which is assumed to be
smooth. By construction,

$$s^{\alpha\beta}\,t_{,\beta} = 0 \qquad [4]$$

The pair $(t, s^{\alpha\beta})$ defines the “metric,” that is, times
and distances, in NGT.

Such a structure can arise from a Lorentzian
metric, for example, the Minkowski metric $\eta_{\alpha\beta}$, by
taking, component-wise, the limits

$$-c^{-2}\,\eta_{\alpha\beta}\,dx^{\alpha} dx^{\beta} = dt^2 - c^{-2}\,\delta_{ab}\,dx^a dx^b \to dt^2, \qquad \eta^{\alpha\beta} \to s^{\alpha\beta} \qquad [5]$$

which can be interpreted geometrically as “opening
up the light cones” until they degenerate into
doubly covered, spacelike hyperplanes, the Newtonian
$S_t$'s.

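The relations [5] suggest writing the GRT laws in
terms of the rescaled temporal metric ($\lambda = c^{-2}$)

$$t_{\alpha\beta} = -\lambda\,g_{\alpha\beta} \qquad [6]$$

and writing, presently only as a change of
notation, $s^{\alpha\beta}$ instead of $g^{\alpha\beta}$. Then the fields
$t_{\alpha\beta}$, $s^{\alpha\beta}$, $\Gamma^{\alpha}_{\beta\gamma}$, $T^{\alpha\beta}$, $\rho$, $p$, $U^{\alpha}$, called the basic fields
below, and constants $k > 0$, $\lambda > 0$ satisfy the
following laws:

$$t_{\alpha\beta}\,s^{\beta\gamma} = -\lambda\,\delta_{\alpha}^{\gamma} \qquad [7a]$$

$$t_{\alpha\beta;\gamma} = 0, \qquad s^{\alpha\beta}{}_{;\gamma} = 0 \qquad [7b]$$

$$R^{\alpha}{}_{\beta}{}^{\gamma}{}_{\delta} = R^{\gamma}{}_{\delta}{}^{\alpha}{}_{\beta} \qquad [7c]$$

$$R_{\alpha\beta} = 8\pi k\left(t_{\alpha\gamma}\,t_{\beta\delta} - \frac{1}{2}\,t_{\alpha\beta}\,t_{\gamma\delta}\right) T^{\gamma\delta} \qquad [7d]$$

$$T^{\alpha\beta}{}_{;\beta} = 0 \qquad [7e]$$

$$T^{\alpha\beta} = (\rho + \lambda p)\,U^{\alpha} U^{\beta} + p\,s^{\alpha\beta} \qquad [7f]$$

$$t_{\alpha\beta}\,U^{\alpha} U^{\beta} = 1 \qquad [7g]$$

The Lorentz signature of $g_{\alpha\beta}$ can be reexpressed
thus: at each event (= spacetime point), there exists
a “timelike” vector $V^{\alpha}$, that is,

$$t_{\alpha\beta}\,V^{\alpha} V^{\beta} > 0 \qquad [7h]$$

and $V^{\alpha} X_{\alpha} = 0$ for $X_{\alpha} \neq 0$ implies $s^{\alpha\beta} X_{\alpha} X_{\beta} > 0$.

The indices in eqn [7c] are raised, here and later,
by $s^{\alpha\beta}$.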
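The rescaling [6] and the relation [7a] can be checked by hand on the Minkowski metric, which also makes the degenerate limit $\lambda \to 0$ explicit. The sketch below assumes coordinates $(x, y, z, t)$ in which $\eta_{\alpha\beta} = \mathrm{diag}(1, 1, 1, -c^2)$ with $\lambda = c^{-2}$; only diagonal components are stored, and the function names are illustrative.

```python
def minkowski_cff_fields(lam):
    """Diagonal components of t_{ab} = -lam * eta_{ab} and s^{ab} = eta^{ab}.

    Assumes coordinates (x, y, z, t) with eta = diag(1, 1, 1, -1/lam); the
    products are written out so that lam = 0 is allowed.
    """
    t_lower = [-lam, -lam, -lam, 1.0]   # rescaled temporal metric, eqn [6]
    s_upper = [1.0, 1.0, 1.0, -lam]     # inverse metric, renamed s
    return t_lower, s_upper

def satisfies_7a(lam, tol=1e-15):
    """Check t_{ab} s^{bc} = -lam * delta_a^c for the diagonal fields above."""
    t_lower, s_upper = minkowski_cff_fields(lam)
    return all(abs(t * s + lam) <= tol for t, s in zip(t_lower, s_upper))

def is_unit_timelike(lam, U):
    """Check the normalisation [7g], t_{ab} U^a U^b = 1, for a diagonal t."""
    t_lower, _ = minkowski_cff_fields(lam)
    return abs(sum(t * u * u for t, u in zip(t_lower, U)) - 1.0) < 1e-12

galilei_t, galilei_s = minkowski_cff_fields(0.0)   # degenerate "Galilei metric"
```

At $\lambda = 0$ the two fields become $\mathrm{diag}(0, 0, 0, 1)$ and $\mathrm{diag}(1, 1, 1, 0)$: each is degenerate, their product vanishes, and [7a] still holds.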

Given a set of basic fields on $M$ as listed below
eqn [6], the laws [7] remain meaningful for all $\lambda \geq 0$
and $k \geq 0$. If $\lambda = 0$, the “metrics” $t_{\alpha\beta}$ and $s^{\alpha\beta}$
degenerate (and the pair $(t_{\alpha\beta}, s^{\alpha\beta})$ is then called a
Galilei metric). Nevertheless, the definition of “timelike”
will also be used in that case. Also, $X^{\alpha}$ will be
said to be “spacelike” if and only if it can be written
$X^{\alpha} = s^{\alpha\beta}\xi_{\beta}$ with $s^{\alpha\beta}\xi_{\alpha}\xi_{\beta} > 0$. While for $\lambda > 0$, some
of the relations [7] are redundant, this is not so for
$\lambda = 0$. For example, if $\lambda = 0$, the two eqns [7b] are
independent and do not determine the connection
$\Gamma^{\alpha}_{\beta\gamma}$ uniquely, in contrast to the case $\lambda > 0$. The
connection will always be assumed to be symmetric.

As will be discussed below, these formulas define 
a framework which serves to relate GRT to NGT 
and special relativity (SRT). First steps to formulate 
such a framework have been taken independently by 
E Cartan and KO Friedrichs. Therefore we call the 
structure defined by [7] the Cartan—Friedrichs 
formalism (CFF). We call it a “formalism” and not 
a "theory" since it is of interest solely as a tool to 
study relations between theories. 

Equations [7] remain unchanged if the basic fields 
and constants are rescaled according to a change of 
units for time, length, and mass. Here, two sets of 
basic fields related by such a rescaling will be 
considered as physically equivalent; they provide the 


same relations between observables. Thus, $\lambda$ and $k$
have no physical meanings, but only their signs:


$\lambda > 0$, $k > 0$: GRT
$\lambda = 0$, $k > 0$: NGT
$\lambda > 0$, $k = 0$: SRT


(The last two lines are not sufficient to specify the 
theories within CFF; in connection with eqn [9] and 
in Theorem 2 they will be completed.) For discuss- 
ing limit relations between theories, it is nevertheless 
useful to represent physical models in different 
scales. 

The physical interpretation of $t_{\alpha\beta}$, $s^{\alpha\beta}$ in terms of
time and distance and that of $\Gamma^{\alpha}_{\beta\gamma}$, through its
geodesics as world lines of freely falling test
particles, respectively, is the same in the three
theories and can be stated in terms of the common
framework CFF.

For an obvious reason, $\lambda$ may be called causality
constant. Note that $\lambda$ and $k$ each occur in only one of
the general laws of the theory, apart from the $\lambda$ in [7f].

The laws [7] are invariant under diffeomorphisms 
of the spacetime manifold. Those diffeomorphisms 
which map the basic fields of a solution into 
themselves form the symmetry group of that solution. 


Newton’s Theory in Spacetime Form 
Local Laws 


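Remarkably, for $\lambda = 0$ and $k > 0$ the formulas [7]
reproduce almost all the laws on which Newton's
theory of spacetime coupled to Euler's fluid theory is
based. This is summarized in the following:


Theorem 1 Let eqn [7] hold on $M$ with $\lambda = 0$.
Then there exists, for any event of $M$, a neighborhood
$U$ with coordinates $(x^a, t)$ such that, on $U$, $t$
coincides with the absolute time, $t_{\alpha\beta} = t_{,\alpha}\,t_{,\beta}$, and on
the local slices $U \cap S_t$, $s^{\alpha\beta}$ defines Euclidean metrics
$\gamma_{ab}$ with orthonormal coordinates $x^a$, $\gamma_{ab} = \delta_{ab}$.
Vectors are spacelike iff they are tangent to $S_t$;
otherwise they are timelike. Moreover, the slices
are locally geodesic with respect to the connection
$\Gamma^{\alpha}_{\beta\gamma}$, and the induced connection on the slices is the
flat connection associated naturally to $\gamma_{ab}$. In
addition, in the coordinate chart given by $(x^a, t)$,
the connection components vanish except $\Gamma^{a}_{00}$ and
$\Gamma^{a}_{0b}$ $(= -\Gamma^{b}_{0a})$. Therefore, $t$ is an affine parameter
on timelike geodesics. Further, $U^0 = 1$, and $U^a = v^a$
is the 3-velocity of the fluid. If one writes

$$-\Gamma^{a}_{00} =: g^a, \qquad \Gamma^{a}_{0b} =: \omega^{a}_{b} \qquad [8a]$$

and uses 3-vector notation with $(g^a) = \mathbf{g}$,
$(\omega_{23}, \omega_{31}, \omega_{12}) = \boldsymbol{\omega}$, the timelike geodesics of $\Gamma^{\alpha}_{\beta\gamma}$
are given by

$$\ddot{\mathbf{x}} = \mathbf{g} + 2\,\dot{\mathbf{x}} \times \boldsymbol{\omega} \qquad [8b]$$

$\mathbf{g}$ and $\boldsymbol{\omega}$ satisfy

$$\nabla\cdot\boldsymbol{\omega} = 0, \qquad \nabla\times\mathbf{g} + 2\,\dot{\boldsymbol{\omega}} = 0 \qquad [8c]$$

$$\nabla\times\boldsymbol{\omega} = 0, \qquad \nabla\cdot\mathbf{g} - 2\,\omega^2 = -4\pi k \rho \qquad [8d]$$

and the fluid's equations of motion are

$$\dot{\rho} + \nabla\cdot(\rho\,\mathbf{v}) = 0 \qquad [8e]$$

$$\rho\,(\dot{\mathbf{v}} + \mathbf{v}\cdot\nabla\mathbf{v} - \mathbf{g} - 2\,\mathbf{v}\times\boldsymbol{\omega}) + \nabla p = 0 \qquad [8f]$$

A solution $(\mathbf{g}, \boldsymbol{\omega}, \rho, p, \mathbf{v})$ of eqns [8] on a local
chart $(x^a, t)$ with $t_{\alpha\beta} = \mathrm{diag}(0, 0, 0, 1)$ and $s^{\alpha\beta} = \mathrm{diag}(1, 1, 1, 0)$
provides, via eqn [8a], the general
local solution to eqns [7] for $\lambda = 0$.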


The proof consists of many, mostly elementary 
steps which can be gathered from Künzle (1972) and 
Ehlers (1981). 

Given a solution to eqns [7] with $\lambda = 0$ and $k > 0$,
the coordinates $x^\alpha = (x^a, t)$ referred to in the theorem
are determined by the basic fields up to time-dependent
Euclidean motions, time translations, and
time reflections. Such a coordinate system corresponds
to a rigid reference frame. As the equation of motion
for freely falling particles, eqn [8b], shows, $\mathbf{g}$ and $\boldsymbol{\omega}$
are to be interpreted as the acceleration and rotation
fields which determine, relative to a rigid frame, the
combined influence of inertia and gravity on particles
encoded in the spacetime connection $\Gamma^\alpha_{\beta\gamma}$. (This role
of a connection in NGT was recognized by É. Cartan.)
This interpretation is supported by the (generalized)
Euler equation [8f].
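
The reading of $\mathbf{g}$ and $\boldsymbol{\omega}$ as acceleration and rotation fields can be checked on the simplest case. The sketch below assumes empty space ($\rho = 0$) described in a frame rotating rigidly with constant angular velocity $\boldsymbol{\omega}$, so that $\mathbf{g}$ is the centrifugal field $-\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{x})$. The numerical values and helper names are illustrative assumptions, not part of the article. It verifies by central differences that $\nabla \cdot \mathbf{g} - 2\boldsymbol{\omega}^2 = 0$ and $\nabla \times \mathbf{g} = 0$, as eqns [8c] and [8d] require in this case.

```python
def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def g_field(x, omega):
    """Centrifugal acceleration g = -omega x (omega x x); assumes rho = 0."""
    return tuple(-c for c in cross(omega, cross(omega, x)))

def partial(field, x, i, j, h=1e-3):
    """Central difference approximation of d(field_i)/d(x_j)."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return (field(tuple(xp))[i] - field(tuple(xm))[i]) / (2 * h)

def divergence(field, x):
    return sum(partial(field, x, i, i) for i in range(3))

def curl(field, x):
    return (partial(field, x, 2, 1) - partial(field, x, 1, 2),
            partial(field, x, 0, 2) - partial(field, x, 2, 0),
            partial(field, x, 1, 0) - partial(field, x, 0, 1))

omega = (0.3, -0.2, 0.5)     # illustrative constant angular velocity
point = (1.0, 2.0, -0.7)     # illustrative evaluation point
field = lambda x: g_field(x, omega)

omega_sq = sum(w * w for w in omega)
div_g = divergence(field, point)
curl_g = curl(field, point)

print("div g - 2 omega^2 =", div_g - 2 * omega_sq)   # residual of [8d], rho = 0
print("curl g            =", curl_g)                 # residual of [8c], omega constant
```

Both residuals vanish up to rounding because the field is linear in $\mathbf{x}$. A time-dependent $\boldsymbol{\omega}$ would add the term $-\dot{\boldsymbol{\omega}} \times \mathbf{x}$ to $\mathbf{g}$, which is what the $2\dot{\boldsymbol{\omega}}$ term in [8c] accounts for.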

As claimed above already, eqns [7] almost
reproduce the local laws of the Newton-Euler
theory. Indeed, eqns [8] are those of the Newton-Euler
theory, provided $\boldsymbol{\omega}$ depends on time only.
Then and only then can the coordinate freedom be
used to get nonrotating rigid coordinates with
respect to which $\boldsymbol{\omega} = 0$. The existence of such
coordinates is indispensable for NGT since only
with respect to them is $-\mathbf{g}$ the gradient of a
potential U which obeys Poisson’s equation, as
shown by eqns [8c] and [8d].

The preceding argument shows that the CFF,
specialized to $\lambda = 0$, has to be restricted by a
condition which implies $\boldsymbol{\omega} = \boldsymbol{\omega}(t)$ in order to give
the local laws of NGT. One such condition is


$R^{\alpha\beta}{}_{\gamma\delta} = 0$ [9]

as can be verified by computing the curvature tensor
via eqn [8a].

Equation [9] for $\lambda = 0$ expresses that parallel
transport of spacelike vectors along arbitrary spacetime
curves is integrable, which corresponds to the behavior
of free gyroscopes in NGT (in contrast to GRT).

Of course, eqn [9] cannot be added to the CFF since it
is incompatible with GRT. If, however, the CFF with
$\lambda > 0$, $k = 0$ is restricted by the condition [9], the
spacetime and hydrodynamics of special relativity result.


Global Laws for Isolated Systems 


The laws [8] and [9] do not determine the time
evolution of the basic fields. Using nonrotating
coordinates we put $\mathbf{g} = -\nabla U$ and replace eqns [8c],
[8d] by Poisson's equation


$\Delta U = 4\pi k\rho$ [10]


In Newtonian dynamics, the potential only serves
to compute forces depending instantaneously on the
mass distribution. Traditionally, this is achieved by
assuming $\rho$ to have spatially compact support at
each time and solving eqn [10] by


$\phi(x, t) = -k \int \frac{\rho(x + y, t)}{|y|}\, d^3y$ [11]

which implies the fall-off


$\lim_{|x| \to \infty} \phi(x, t) = 0$ [12]


($\phi$ will always be used for this solution of eqn [10]).
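
As an illustration of Poisson's equation and the fall-off condition, the following sketch uses the textbook potential of a homogeneous ball, which is the integral solution evaluated in closed form for constant density. The units and the values of `K`, `RHO` and `R0` are arbitrary assumptions chosen for the example. It checks numerically that the radial Laplacian equals $4\pi k\rho$ inside the ball, vanishes outside, and that the potential tends to zero far away.

```python
import math

K = 1.0      # gravitational constant k (illustrative units)
RHO = 2.0    # constant mass density inside the ball (assumed value)
R0 = 1.5     # radius of the ball (assumed value)

MASS = 4.0 * math.pi / 3.0 * RHO * R0 ** 3

def phi(r):
    """Potential of a homogeneous ball, vanishing at infinity."""
    if r <= R0:
        return 2.0 * math.pi / 3.0 * K * RHO * (r * r - 3.0 * R0 * R0)
    return -K * MASS / r

def radial_laplacian(f, r, h=1e-4):
    """Laplacian f'' + (2/r) f' of a spherically symmetric function."""
    first = (f(r + h) - f(r - h)) / (2.0 * h)
    second = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return second + 2.0 * first / r

inside = radial_laplacian(phi, 0.5 * R0)     # expected: 4 pi k rho
outside = radial_laplacian(phi, 3.0 * R0)    # expected: 0 (vacuum)
far_value = phi(1.0e6 * R0)                  # expected: close to 0

print(inside, 4.0 * math.pi * K * RHO)
print(outside, far_value)
```

The interior and exterior expressions agree at the surface $r = r_0$, so the potential is continuous there.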

To relate the foregoing isolation assumptions 
to corresponding assumptions in GRT as far as 
presently possible, it seems necessary to go back 
to the laws [7] restricted to $\lambda = 0$ or the equivalent
(3 + 1) version [8] without the restriction [9]. 

If some global assumptions are added to eqns [8], 
eqns [10]-[12] can be deduced from the four- 
dimensional formulation. One first introduces the 
following two assumptions: 


(1) The hypersurfaces $S_t$ of M (which, for $\lambda = 0$, are
the only spacelike hypersurfaces) are simply
connected, complete Euclidean spaces.

(2) On each $S_t$, the support of $\rho$ is compact.


Using coordinates $(x^a, t)$ as in the last subsection,
with $x^a$ now ranging over $\mathbb{R}^3$, eqns [8a] imply


R* gR a. = 2 N (wab) ts [13] 
a,b 


Hence the sum is a 4-scalar, and since $t_{\alpha\beta}$ is
covariantly constant, it is possible to require


R” gs RP. 7,—0 at spatial infinity [14] 


which expresses covariantly that $\omega_{ab,c} \to 0$. Since $\boldsymbol{\omega}$ is
harmonic on $S_t$ (by eqns [8c], [8d]), this in turn
implies $\omega_{ab,c} = 0$; thus, $\boldsymbol{\omega}$ depends on t only; the
asymptotic condition [14] and the local laws imply
eqn [9].

We may therefore employ rigid, nonrotating
coordinates, $\boldsymbol{\omega} = 0$. Then, by eqns [8a], [8c], [8d]
the connection coefficients take the form


$\Gamma^\alpha_{\beta\gamma} = t_{,\beta}\, t_{,\gamma}\, s^{\alpha\delta} U_{,\delta}$ [15]


and 


R^, a R" s = tat at La ` P» Ua [16] 


ab ab 


As before, we require 
R^ oua Rye 0 [1 7] 


and conclude $U_{,ab} \to 0$. Since the Newtonian potential
$\phi$ of $\rho$ also has this fall-off and $U - \phi$ is
harmonic on $S_t = \mathbb{R}^3$, the following conclusion can
be obtained:


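Lemma 1 The laws [8] and the global conditions
(1)-(2), [14], [17] imply: in rigid, nonrotating
coordinates, the connection


$\Gamma^\alpha_{\beta\gamma} - t_{,\beta}\, t_{,\gamma}\, s^{\alpha\delta} \phi_{,\delta} =: \mathring{\Gamma}^\alpha_{\beta\gamma}$ [18]


is flat ($\phi$ according to eqn [11] is a scalar, and the
$\phi$-term in eqn [18] is a tensor). In other words,
$\Gamma^\alpha_{\beta\gamma}$ is asymptotically flat since the $\phi$-term falls off
as $|x|^{-2}$.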


Because of this lemma, one can further restrict the
coordinates $(x^a, t)$ by demanding $\mathring{\Gamma}^\alpha_{\beta\gamma} = 0$ for the
flat connection of eqn [18]. In physical terms this means:
by switching to a new, “unaccelerated” frame of reference,
one removes from the equations of motion a spatially
homogeneous gravitational field which, in contrast to the
$\phi$-term in eqn [18], is not due to matter.

The resulting coordinates are defined up to
Galilean transformations,


$t' = \pm t + c^0$
$x'^a = D^a{}_b\, x^b + u^a t + c^a$


where $c^\alpha$, $u^a$ are constants and D is a constant
orthogonal 3 × 3 matrix. These coordinates are
called inertial ones; with respect to them the usual
laws of Newtonian mechanics hold; see [8] with


$\boldsymbol{\omega} = 0$ and $U = \phi[\rho]$.


Theorem 2 (Ehlers 1981). The laws [7] of the
CFF restricted to $\lambda = 0$ and augmented by the global
and asymptotic conditions (1)-(2), [14], [17],
provide a generally covariant, four-dimensional
formulation for the Newtonian theory of space,
time, gravitation, and hydrodynamics.


The possibility to split the connection $\Gamma^\alpha_{\beta\gamma}$ into a flat
part which is independent of matter and a tensorial
part depending on matter and given by the vector
field $g^\alpha = -s^{\alpha\beta}\phi_{,\beta}$ (with $\phi$ from eqn [11]) arises only
from supplementing the local laws [7] by the global,
resp. asymptotic, conditions (1)-(2), [14], [17]
stated above. The introduction of inertial coordinates
is then convenient, but not necessary. In
noninertial, rigid frames of reference, the flat part
$\mathring{\Gamma}^\alpha_{\beta\gamma}$ gives rise to inertial forces.

It should be possible to define spatial asymptotic 
flatness in the CFF, but that has not been done. 


Remarks on Newtonian Cosmology 


In cosmology, the conditions (2) and [17] of the 
last subsection are not appropriate. Instead one 
keeps the laws [7] and adds to them eqn [9], so 
that with respect to nonrotating coordinates the 
laws [8] with $\boldsymbol{\omega} = 0$ and eqn [10] remain valid.
Then, there are no longer inertial coordinate 
systems, and the potential U is not a 4-scalar. 
For a slightly different approach, see Rüede and 
Straumann (1997). 

For the purpose of this article, the term 
“cosmological model" will be applied to those 
solutions of the laws [7] and [9] which satisfy $\rho > 0$
and which have a symmetry group which acts
transitively on the set of world lines representing
the motion of the fluid. This strong symmetry
assumption determines the time-evolution even in
the “Newtonian” case $\lambda = 0$ in spite of the absence
of an evolution equation for the gravitational
field $\mathbf{g}$.


Newtonian Limits of Families of GR Solutions


The discussion in the sections “The Cartan- 
Friedrichs formalism” and “Newton’s theory in 
spacetime form" suggests the following: 


Definition 1 Let a family $F(\lambda) = (t_{\alpha\beta}(\lambda), \ldots)$ of
basic fields parametrized by $\lambda$, obeying the laws [7]
of the CFF, be given for $0 < \lambda < a$. We assume the
underlying manifolds $M(\lambda)$ to be open submanifolds
of a fixed manifold M such that $M(\lambda_1) \supset M(\lambda_2)$ if
$\lambda_1 < \lambda_2$ and $\bigcup_\lambda M(\lambda) = M$. Then we write


$\lim_{\lambda \to 0} F(\lambda) = F(0)$ [19]


if the fields of $F(\lambda)$ and their first derivatives
converge pointwise to those of $F(0)$.
$F(0)$ is then said to be a CF limit of the sequence of
($\lambda$-rescaled) solutions $F(\lambda)$ of GRT. If the fields of a
$\lambda$-family of GR solutions ($\lambda > 0$) and their first
derivatives converge for $\lambda \to 0$ locally uniformly,
then the limit fields satisfy eqns [7]. If $F(0)$ has the
additional property [9], the limit is locally Newtonian.

On the basis of the section “The Cartan-Friedrichs
formalism” one may conjecture that if
eqn [19] holds and the $F(\lambda)$ for $\lambda > 0$ are spatially
asymptotically flat, $F(0)$ will represent an asymptotically
flat Newtonian spacetime. Examples such as
Example 1 below are in agreement with this
conjecture, but a general proof is not known.


Example 1 The interior solution for a static, 
spherically symmetric fluid ball of constant energy 
density (Schwarzschild 1916) is given by 


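$ds^2 = \frac{dr^2}{a^2} + r^2 (d\vartheta^2 + \sin^2\vartheta\, d\varphi^2) - \frac{1}{4}(3a_0 - a)^2 c^2\, dt^2$


$\rho = \text{const.} > 0$, $r \le r_0$


$a(r) = \left(1 - \frac{8\pi}{3} \frac{k\rho r^2}{c^2}\right)^{1/2}$, $a_0 = a(r_0)$


$p = \rho c^2\, \frac{a - a_0}{3a_0 - a}$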

Inserting into these expressions the parameter
$\lambda = c^{-2}$ and treating $\rho$ and $r_0$ as $\lambda$-independent
constants results in a $\lambda$-family with $0 < \lambda <
((8\pi/3) k r_0^2 \rho)^{-1}$. The limit solution represents a
Newtonian fluid ball of constant mass density $\rho$.
The Schwarzschild vacuum fields belonging to these
fluid balls also have the appropriate Newtonian
limits. The resulting complete spacetimes are asymptotically
flat. A dimensionless small parameter
which could be used instead of $\lambda$ to measure the
deviation of the GR solution from its Newtonian
limit is the ratio of Schwarzschild radius and the
geometric radius:


$\frac{2kM}{c^2 r_0} = \frac{8\pi}{3} \frac{k\rho r_0^2}{c^2}$
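
A small numerical sketch may help to see how this family approaches its Newtonian limit. It assumes the standard interior Schwarzschild time-time coefficient $\frac{1}{4}(3a_0 - a)^2$ and compares the potential read off from its first-order expansion in $\lambda$ with the Newtonian potential of a homogeneous ball, $U = (2\pi/3) k\rho (r^2 - 3r_0^2)$. The solar mass and radius are rough reference values used only for scale (the Sun is not a homogeneous ball), and all function names are illustrative.

```python
import math

K = 6.674e-11    # gravitational constant k, SI units
C = 2.998e8      # speed of light, m/s

def compactness_from_mass(mass, r0):
    """Ratio of Schwarzschild radius to geometric radius, 2kM/(c^2 r0)."""
    return 2.0 * K * mass / (C ** 2 * r0)

def compactness_from_density(rho, r0):
    """The same ratio written as (8 pi / 3) k rho r0^2 / c^2."""
    return 8.0 * math.pi / 3.0 * K * rho * r0 ** 2 / C ** 2

def a(r, rho, lam):
    return math.sqrt(1.0 - 8.0 * math.pi / 3.0 * K * rho * r * r * lam)

def potential_from_metric(r, r0, rho, lam):
    """U defined by (1/4)(3 a0 - a)^2 = 1 + 2 lam U + O(lam^2)."""
    lapse_sq = 0.25 * (3.0 * a(r0, rho, lam) - a(r, rho, lam)) ** 2
    return (lapse_sq - 1.0) / (2.0 * lam)

def newtonian_potential(r, r0, rho):
    """Interior potential of a homogeneous ball, vanishing at infinity."""
    return 2.0 * math.pi / 3.0 * K * rho * (r * r - 3.0 * r0 * r0)

# Rough solar values, for scale only.
M_SUN, R_SUN = 1.989e30, 6.957e8
rho_sun = M_SUN / (4.0 * math.pi / 3.0 * R_SUN ** 3)

eps_mass = compactness_from_mass(M_SUN, R_SUN)
eps_density = compactness_from_density(rho_sun, R_SUN)
u_metric = potential_from_metric(0.5 * R_SUN, R_SUN, rho_sun, 1.0 / C ** 2)
u_newton = newtonian_potential(0.5 * R_SUN, R_SUN, rho_sun)

print(eps_mass, eps_density)
print(u_metric, u_newton)
```

The two expressions for the smallness parameter agree, and the relative difference between the two potentials is of the order of that parameter, as expected for a first-order expansion in $\lambda$.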


Example 2 A Friedmann-Lemaitre cosmological 
model of GR containing dust and radiation is given by 


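$ds^2 = R^2(t)\, \frac{\delta_{ab}\, d\xi^a d\xi^b}{\left(1 - \frac{1}{4}(E/c^2)\, \delta_{ab}\, \xi^a \xi^b\right)^2} - c^2\, dt^2$

where R(t) obeys


$\dot{R}^2 = \frac{8\pi k}{3} \left(\frac{M}{R} + \frac{S}{c^2 R^2}\right) + E$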
M is a mass constant, $\rho = M/R^3$ is the mass density
of “dust,” S is an entropy constant, $\epsilon = S/R^4$ the
energy density and $p = (1/3)\epsilon$ the pressure of
radiation; and E is a constant of dimension
(speed)$^2$. The world lines of the fluid elements
are given by $\xi^a = \text{const.}$ (Lagrangian comoving
coordinates).


Taking E, M, S constant and $\lambda = c^{-2}$ as a parameter
provides a $\lambda$-family of GR models with Newtonian
limit. In the limit, t is the Newtonian time, and the
spatial metric $R^2 \delta_{ab}\, d\xi^a d\xi^b$ describes an expanding
Euclidean space $\mathbb{R}^3$ (if E < 0) or an open ball of
radius 2R(t) in it (if E > 0). In the coordinates $(\xi^a, t)$
the connection does not have the “Newtonian”
components [8a]; instead its nonvanishing components
are $\Gamma^a_{0b} = (\dot{R}/R)\, \delta^a_b$. In local inertial coordinates
$x^a = R\xi^a$ centered on the particle with $\xi^a = 0$ (which
could be any particle because of the homogeneity of
the model), the spatial metric is $d\mathbf{x}^2$, and the
connection components are Newtonian, with
$U = (2\pi/3) k\rho\, \mathbf{x}^2$ and $\Delta U = 4\pi k\rho$. In the limit, the
radiation no longer influences the expansion; one gets 
the Newtonian dust models (eqn [9] is satisfied). The 
connection is, of course, not asymptotically flat. The 
curvature tensor R^;j';—(4m/3)kptass"" exhibits 
homogeneity and isotropy. The Gaussian sectional 
curvature of the 3-space at time t is $K = -\lambda E/R^2$. As
a dimensionless smallness parameter one can take
$E/c^2$. In the “open” models, with E > 0, the
coordinates $\xi^a$ cover the whole 3-manifold of fluid
particles, while in the “closed” case, E < 0, one
particle, the antipode of $\xi^a = 0$ on the 3-sphere, is not
covered. That particle is missing in the Newtonian
limit model. In the Newtonian case the expanding
Euclidean space $\mathbb{R}^3$ can be replaced by a torus; in the
GR cases this is possible only for E = 0.
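
The limiting expansion law can be explored numerically. The sketch below assumes the Newtonian dust form of the equation for R(t), $\dot{R}^2 = (8\pi k/3)\, M/R + E$, in which the radiation term has dropped out with $\lambda \to 0$. It integrates the expanding branch with a classical Runge-Kutta scheme and compares the result for E = 0 with the closed-form solution $R = (6\pi k M)^{1/3}\, t^{2/3}$. Units, step count and function names are illustrative assumptions.

```python
import math

K = 1.0   # gravitational constant k (illustrative units)
M = 1.0   # mass constant, rho = M / R**3

def r_dot(R, E=0.0):
    """Expanding branch of R_dot^2 = (8 pi k / 3) M / R + E."""
    return math.sqrt(8.0 * math.pi * K * M / (3.0 * R) + E)

def integrate(R_start, t_start, t_end, steps=20000, E=0.0):
    """Fourth-order Runge-Kutta integration of dR/dt = r_dot(R)."""
    h = (t_end - t_start) / steps
    R = R_start
    for _ in range(steps):
        k1 = r_dot(R, E)
        k2 = r_dot(R + 0.5 * h * k1, E)
        k3 = r_dot(R + 0.5 * h * k2, E)
        k4 = r_dot(R + h * k3, E)
        R += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return R

def exact_flat(t):
    """Closed-form solution for E = 0."""
    return (6.0 * math.pi * K * M) ** (1.0 / 3.0) * t ** (2.0 / 3.0)

R_numeric = integrate(exact_flat(1.0), 1.0, 8.0)
R_exact = exact_flat(8.0)
print(R_numeric, R_exact)
```

Passing a nonzero `E` to `integrate` gives the corresponding curved models; for E < 0 the expanding branch is only valid until the square root reaches zero.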

Many examples of GR families with Newtonian 
limits are known (see, e.g., Ehlers (1997) and 
references therein). An example of a $\lambda$-family
which has an almost Newtonian limit which does
not satisfy eqn [9] is provided by NUT spacetimes
(see Ehlers 1997), interpreted as due to a
gravitomagnetic monopole (Lynden-Bell and
Nouri-Zonoz 1998).


Applications and Problems 


Can one construct, for a given Newtonian solution 
N, a $\lambda$-family of GR solutions which converges to
N? Some answers are known and listed below. 

U Heilig (1995) has shown: given a solution to
the Euler-Poisson equations representing a stationary,
rigidly rotating, self-gravitating fluid body
with its surrounding gravitational field, there exists
a $\lambda$-family of corresponding solutions to the
Einstein-Euler system having the given solution
as its limit.

The proof is based on the fact that one can
reformulate eqns [1], [2] in terms of harmonic
coordinates and new dependent gravitational variables
instead of $g_{\alpha\beta}$ such that the new equations
given in Lottermoser (1992) are analytic in $\lambda$ and
reduce, for $\lambda = 0$, to the Euler-Poisson system. In the
stationary case these equations are elliptic for $\lambda > 0$.
Using appropriate function spaces, Heilig shows, via
the implicit function theorem, that a solution for
$\lambda = 0$ can be extended to small, positive values of $\lambda$.
Since L Lichtenstein has constructed solutions as
assumed in the theorem, the existence of GR
solutions follows.

The gravitational part of the system of equations
referred to above is hyperbolic for $\lambda > 0$, but
becomes elliptic for $\lambda = 0$, whereas the fluid equations
remain hyperbolic. In spite of this difficulty
Rendall (1994) has shown that $\lambda$-families of time-dependent,
asymptotically flat solutions to the
Einstein-Vlasov system representing gravitating
systems of collisionless particles have Poisson-Vlasov
limits, and that any Poisson-Vlasov solution
can be so obtained.

Lottermoser (1992) succeeded in proving the existence
of $\lambda$-families of solutions to the Einstein constraint
equations which have Newtonian initial data as limits.
Nothing seems to be known about solutions evolving
from such data. Lottermoser has given an interesting
discussion concerning possible extensions of his work
which apparently has gone unnoticed.

Rendall (1992) has defined and analyzed post-Newtonian
expansions of Einstein’s equations and
their solvability, assuming $\lambda$-families whose $t_{\alpha\beta}$, $s^{\alpha\beta}$
are a few times differentiable in $\epsilon = \sqrt{\lambda}$ at $\epsilon = 0$. He
asymptotically flat solutions, but that at order eë 
divergences occur for general Newtonian seed 
solutions. Modifications of the method to overcome 
these difficulties have been considered by Rendall 
and others; the problem is open. 

In cosmology, one uses homogeneous back- 
ground models and studies their perturbations. 
The latter are frequently based on Newtonian 
equations. This can perhaps be justified as follows. 
According to Example 2 the fields of Friedmann- 
Lemaitre models differ from their Newtonian limits 
by arbitrarily small amounts uniformly in 
spacetime regions where the terms involving A are 
small, that is, 


$\frac{S}{M c^2} \ll R(t)$, $|x| \ll \frac{c}{\sqrt{|E|}}\, R(t)$

Additional conditions will be needed to ensure that
Newtonian perturbations approximate relativistic 
ones and that gravitational wave perturbations can 
be neglected. 


See also: Cosmology: Mathematical Aspects; Einstein 
Equations: Exact Solutions; General Relativity: Overview; 
Gravitational Lensing; Shock Wave Refinement of the 
Friedman—Robertson—Walker Metric. 
Introduction


The aim of this contribution is to explain how 
Connes derives the standard model of electromag- 
netic, weak, and strong forces from noncommuta- 
tive geometry. The reader is supposed to be aware of 
two other derivations in fundamental physics: the 
derivation of the Balmer-Rydberg formula for the 
spectrum of the hydrogen atom from quantum 
mechanics and Einstein's derivation of gravity from 
Riemannian geometry. 

At the end of the nineteenth century, new physics 
was discovered in atoms, namely their discrete 
spectra. Balmer and Rydberg succeeded in bringing
order to the fast-growing set of experimental
results with the help of a phenomenological ansatz
for the frequencies $\nu$ of the spectral rays of, for
example, the hydrogen atom,


$\nu = g(n_2^q - n_1^q)$, $n_j \in \mathbb{N}$, $q \in \mathbb{Z}$, $g \in \mathbb{R}_+$ [1]


The integer variables $n_1$ and $n_2$ reflect the
discreteness of the spectrum. On the other hand,
the discrete parameter q and the continuous
parameter g were fitted by experiment: $q = -2$
and $g = 3.289 \times 10^{15}$ Hz, the famous Rydberg
constant. Later, quantum mechanics was discovered;
it allowed one to derive the Balmer-Rydberg
ansatz and to constrain its parameters:


$q = -2$ and $g = \frac{m_e e^4}{4\pi\hbar^3 (4\pi\epsilon_0)^2}$ [2]


in beautiful agreement with the earlier experimental
fit.
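
The agreement between the fitted and the derived value of g can be checked directly. The sketch below evaluates the quantum-mechanical expression for g with CODATA values of the constants (inserted here as an assumption; they are not quoted in the article) and then uses the ansatz with q = -2 to compute one spectral line. The function names are illustrative.

```python
import math

M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J s
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def rydberg_frequency():
    """g = m_e e^4 / (4 pi hbar^3 (4 pi eps0)^2), in Hz."""
    return M_E * E_CHARGE ** 4 / (
        4.0 * math.pi * HBAR ** 3 * (4.0 * math.pi * EPS0) ** 2)

def line_frequency(n1, n2, q=-2):
    """Balmer-Rydberg ansatz nu = g (n2^q - n1^q)."""
    return rydberg_frequency() * (n2 ** q - n1 ** q)

g = rydberg_frequency()
h_alpha = line_frequency(3, 2)   # transition between levels 3 and 2

print(g)         # to be compared with the fitted 3.289e15 Hz
print(h_alpha)   # frequency of the first Balmer line, in Hz
```

With these inputs the computed g reproduces the fitted value to the four digits quoted in the text.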


The Standard Model 


We propose to introduce the standard model (see 
Standard Model of Particle Physics) in analogy with 
the Balmer-Rydberg formula (Table 1). 


Table 1 An analogy between atomic and particle physics elements


                    Atomic physics                              Particle physics
New physics         Discrete spectra                            Forces mediated by gauge bosons
Ansatz              $\nu = g(n_2^q - n_1^q)$                    Yang-Mills-Higgs models
Experimental fit    $q = -2$, $g = 3.289 \times 10^{15}$ Hz     Standard model
Underlying theory   Quantum mechanics                           Noncommutative geometry
The Yang-Mills-Higgs Ansatz


The variables of this Lagrangian ansatz are spin-1
particles A, spin-(1/2) particles decomposed into left-
and right-handed components $\psi = (\psi_L, \psi_R)$ and spin-0
particles $\varphi$. There are four discrete parameters: a
compact real Lie group G, the “gauge group,” and
three unitary representations on complex Hilbert
spaces $\mathcal{H}_L$, $\mathcal{H}_R$, and $\mathcal{H}_S$. The spin-1 particles come in a
multiplet living in the complexification of the Lie algebra
of G, $A \in \mathrm{Lie}(G)^{\mathbb{C}}$. The left- and right-handed spinors
come in multiplets living in the Hilbert spaces, $\psi_L \in
\mathcal{H}_L$, $\psi_R \in \mathcal{H}_R$, respectively. The (Higgs) scalar is
another multiplet, $\varphi \in \mathcal{H}_S$. The Yang-Mills-Higgs
Lagrangian, together with its Feynman diagrams, is
spelled out in Table 2.

There are several continuous parameters: the
gauge coupling $g \in \mathbb{R}_+$, the Higgs self-couplings
$\lambda, \mu \in \mathbb{R}_+$, and several Yukawa couplings $g_Y \in \mathbb{C}$.


Table 2 The Yang-Mills-Higgs Lagrangian and its Feynman 
diagrams 


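$\mathcal{L}[A, \psi, \varphi] = \tfrac{1}{2}\, \mathrm{tr}(\partial_\mu A_\nu\, \partial^\mu A^\nu - \partial_\nu A_\mu\, \partial^\mu A^\nu)$


$+\, g\, \mathrm{tr}(\partial_\mu A_\nu\, [A^\mu, A^\nu])$


$+\, g^2\, \mathrm{tr}([A_\mu, A_\nu][A^\mu, A^\nu])$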


ALD d 


+u pu 


Hgu(AL © FRA) v 


Hape 


(Gs lAn) 0 p + p" Ps (A, de} 


*39 (Ps (An) PY Ps(A")e 


toy vie gy ve'v 


Let us choose $G = U(1) \ni e^{i\theta}$. Its irreducible unitary
representations are all one-dimensional, $\mathcal{H} = \mathbb{C} \ni \psi$,
characterized by the charge $q \in \mathbb{Z}$: $\rho(e^{i\theta})\psi = e^{iq\theta}\psi$.
Then with $q_L = q_R$ and $\mathcal{H}_S = \{0\}$, we get Maxwell's
theory with the photon (or gauge boson or 4-potential)
A coupled to the Dirac theory of a massless spinor of
electric charge $q_L$, whose (relativistic) wave function is
$\psi$. The gauge coupling is given by $g = e/\sqrt{\epsilon_0}$. Gauge
invariance of the Yang-Mills-Higgs Lagrangian
implies, via Noether's theorem, electric charge conservation
in this case (see Symmetries and Conservation
Laws).

Yang-Mills models are therefore simply nonabelian 
generalizations of electromagnetism where the abelian 
gauge group U(1) is replaced by any compact real Lie 
group. We insist on a compact group because all 
irreducible unitary representations of compact groups 
are finite dimensional. Finally, the Higgs scalar is 
added to give masses to spinors and gauge bosons via 
spontaneous symmetry breaking (see Symmetry 
Breaking in Field Theory). 

We use compact groups and unitary representations 
as (discrete) parameters. One motivation is Noether’s 
theorem and conserved quantities. The other comes 
from Wigner’s theorem: the irreducible unitary 
representations of the Poincaré group are classified 
by mass and spin. Its orthonormal basis vectors 
are classified by energy-momentum and by the 
z-component of angular momentum. This theorem 
leads to the widely accepted definition of a particle as 
an orthonormal basis vector in a Hilbert space 
H carrying a unitary representation p of a group G. 

A precious property of the Yang-Mills-Higgs 
ansatz is its perturbative renormalizability necessary 
for fine-structure calculations like the anomalous 
magnetic moment of the muon. 


The Experimental Fit 


Physicists have spent some 30 years and some 10? Swiss 
Francs to distill the fit (Particle Data Group 2004): 


$G = SU(2) \times U(1) \times SU(3)/(\mathbb{Z}_2 \times \mathbb{Z}_3)$ [3]


Here $(n_2, y, n_3)$ denotes the tensor product of an
$n_2$-dimensional representation of SU(2), “(weak) isospin,”
an $n_3$-dimensional representation of SU(3),
“color,” and the one-dimensional representation of
U(1) with “hypercharge” y. For historical reasons, the
hypercharge is an integer multiple of 1/6. This is
irrelevant: in the abelian case, only the product of the
hypercharge with its gauge coupling is measurable, and
we do not need multivalued representations, which are
characterized by noninteger, rational hypercharges. In
the direct sum, we recognize the three generations of
fermions: the quarks, “up, down, charm, strange, top,
bottom,” are SU(3) triplets; the leptons, “electron,
$\mu$, $\tau$” and their neutrinos, are color singlets. The basis
of the fermion representation space is


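$\begin{pmatrix} u \\ d \end{pmatrix}_L, \begin{pmatrix} c \\ s \end{pmatrix}_L, \begin{pmatrix} t \\ b \end{pmatrix}_L, \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L, \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}_L, \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix}_L$

$u_R, c_R, t_R, \quad e_R, \mu_R, \tau_R$
$d_R, s_R, b_R$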

The parentheses indicate isospin doublets. 

The eight gauge bosons associated with su(3) are 
called gluons. Warning: the U(1) is not the one of electric 
charge; it is called hypercharge, the electric charge is a 
linear combination of hypercharge and weak isospin. 
This mixing is necessary to give electric charges to the W 
bosons. The $W^+$ and $W^-$ are pure isospin states, while
the $Z^0$ and the photon are (orthogonal) mixtures of the
third isospin generator and hypercharge. 

As the group G contains three simple factors, 
there are three gauge couplings, 


$g_2 = 0.6518 \pm 0.0003$
$g_1 = 0.3574 \pm 0.0001$ [7]
$g_3 = 1.218 \pm 0.01$


The Higgs couplings are usually expressed in terms 
of the W and Higgs masses: 


$m_W = \tfrac{1}{2}\, g_2\, v = 80.419 \pm 0.056$ GeV [8]


$m_H = 2\sqrt{2}\, \sqrt{\lambda}\, v > 98$ GeV [9]


with the vacuum expectation value $v := \tfrac{1}{2}\, \mu/\sqrt{\lambda}$.
Because of the high degree of reducibility of the spin-(1/2)
representations there are 27 complex Yukawa
couplings. They constitute the fermionic mass matrix
which contains the fermion masses and mixings:

$m_e = 0.510998902 \pm 0.000000021$ MeV

$m_u = 3 \pm 2$ MeV, $m_d = 6 \pm 3$ MeV

$m_\mu = 0.105658357 \pm 0.000000005$ GeV

$m_c = 1.25 \pm 0.1$ GeV, $m_s = 0.125 \pm 0.05$ GeV

$m_\tau = 1.77703 \pm 0.00003$ GeV

$m_t = 174.3 \pm 5.1$ GeV, $m_b = 4.2 \pm 0.2$ GeV
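
A few derived numbers follow from the fitted couplings and the W mass. The sketch below inverts the relation $m_W = \frac{1}{2} g_2 v$ for the vacuum expectation value and, as an assumption not spelled out in the text, uses the standard textbook mixing relations $\tan\theta_w = g_1/g_2$ and $e = g_2 \sin\theta_w$ for the combination of hypercharge and isospin that defines the electric charge. Function names are illustrative.

```python
import math

G1 = 0.3574     # hypercharge coupling g1, central value
G2 = 0.6518     # isospin coupling g2, central value
M_W = 80.419    # W mass in GeV, central value

def vacuum_expectation_value():
    """Solve m_W = (1/2) g2 v for v, in GeV."""
    return 2.0 * M_W / G2

def sin2_weak_angle():
    """sin^2 of the mixing angle, assuming tan(theta_w) = g1 / g2."""
    return G1 ** 2 / (G1 ** 2 + G2 ** 2)

def electric_coupling():
    """e = g2 sin(theta_w) = g1 g2 / sqrt(g1^2 + g2^2) (textbook relation)."""
    return G1 * G2 / math.sqrt(G1 ** 2 + G2 ** 2)

v = vacuum_expectation_value()
sin2 = sin2_weak_angle()
inverse_alpha = 4.0 * math.pi / electric_coupling() ** 2

print(v)              # in GeV
print(sin2)
print(inverse_alpha)  # inverse fine-structure constant at the scale of the fit
```

The inverse coupling comes out near 128 rather than 137 because the quoted gauge couplings refer to the electroweak scale, not to zero momentum transfer.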


For simplicity, we have taken massless neutrinos. 
Then mixing only occurs for quarks and is given by 
a unitary matrix, the Cabibbo-Kobayashi-Maskawa 
matrix 


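$C_{KM} := \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}$ [10]


whose matrix elements in terms of absolute values are:


$\begin{pmatrix} 0.9750 \pm 0.0008 & 0.223 \pm 0.004 & 0.004 \pm 0.002 \\ 0.222 \pm 0.003 & 0.9742 \pm 0.0008 & 0.040 \pm 0.003 \\ 0.009 \pm 0.005 & 0.039 \pm 0.004 & 0.9992 \pm 0.0003 \end{pmatrix}$ [11]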


Mathematically, the Cabibbo-Kobayashi-Maskawa 
matrix comes from a polar decomposition of the 
mass matrix. The physical meaning of the quark 
mixings is the following: when a sufficiently
energetic $W^+$ decays into a u quark, this u quark
is produced together with a d quark with probability
$|V_{ud}|^2$, an s quark with probability $|V_{us}|^2$,
and a b quark with probability $|V_{ub}|^2$.
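
Since the mixing matrix is unitary, the probabilities in each row and each column must add up to one. The following sketch checks this for the central values of the absolute values quoted above; the tolerance is an illustrative choice that ignores the quoted experimental errors.

```python
# Central values of |V_ij|, rows (u, c, t), columns (d, s, b).
ABS_CKM = [
    [0.9750, 0.223, 0.004],
    [0.222, 0.9742, 0.040],
    [0.009, 0.039, 0.9992],
]

def squared(matrix):
    """Element-wise squares, i.e. the transition probabilities."""
    return [[value * value for value in row] for row in matrix]

probabilities = squared(ABS_CKM)
row_sums = [sum(row) for row in probabilities]
column_sums = [sum(probabilities[i][j] for i in range(3)) for j in range(3)]

print(row_sums)
print(column_sums)
```

The first row, for instance, gives the probabilities for a u quark to be accompanied by a d, s or b quark.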

The phenomenological success of the standard 
model is phenomenal: with only a handful of 
parameters, it reproduces correctly some millions 
of experimental numbers: cross sections, lifetimes, 
branching ratios. 


Noncommutative Geometry 


Noncommutative geometry is an analytic geometry 
generalizing three other geometries that also had 
important impact on our understanding of forces 
and time. Let us start by briefly recalling the three 
forerunners (Table 3). Euclidean geometry underlies 
Newton's mechanics as a geometry in the space of 
positions. Forces are described by vectors living in 
the same space and the Euclidean scalar product is 
needed to define work and potential energy. Time 
is not part of geometry — it is absolute. This point 
of view is abandoned in special relativity unifying 
space and time into Minkowskian geometry. This 
new point of view allows one to derive the magnetic


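Table 3 Four nested analytic geometries


Geometry         Force                                                          Time
Euclidean        $E = \int F \cdot dx$                                          Absolute
Minkowskian      $E, \epsilon_0 \Rightarrow B, \mu_0 = 1/(\epsilon_0 c^2)$      Universal
Riemannian       Coriolis $\leftrightarrow$ gravity                             Proper, $\tau$
Noncommutative   Gravity $\Rightarrow$ YMH, $\lambda = \tfrac{1}{3} g_2^2$      $\Delta\tau \sim 10^{-40}$ s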
field from the electric field as a pseudoforce
associated with a Lorentz boost. Although time 
becomes relative, one can still imagine a grid of 
synchronized clocks, that is, a universal time. The 
next generalization is “Riemannian geometry = curved
spacetime.” Here gravity can be
viewed as the pseudoforce associated with a
uniformly accelerated coordinate transformation.
At the same time, universal time loses all meaning
and we must content ourselves with proper time.
With today's precision in time measurement, this
complication of life becomes a bare necessity, for
example, in the global positioning system (GPS).
Our last generalization is “noncommutative
geometry = curved space(time) with an uncertainty
principle.” As in quantum mechanics, this uncertainty
principle is introduced via noncommutativity.


Quantum Mechanics 


Consider the classical harmonic oscillator. Its phase
space is $\mathbb{R}^2$ with points labeled by position x and
momentum p. A classical observable is a differentiable
function on phase space such as the total energy
$p^2/(2m) + kx^2$. Observables can be added and multiplied,
and they form the algebra $C^\infty(\mathbb{R}^2)$, which is
associative and commutative. To pass to quantum
mechanics, this algebra is rendered noncommutative
by means of a noncommutation relation for the
generators x and p: $[x, p] = i\hbar 1$. Let us call $\mathcal{A}$ the
resulting algebra “of quantum observables.” It is still
associative, and has an involution $\cdot^*$ (the adjoint or
Hermitian conjugation) and a unit 1.
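
The noncommutation relation can be made concrete with matrices, with one caveat. The sketch below builds x and p from truncated harmonic-oscillator ladder operators, assuming unit mass, unit frequency and $\hbar = 1$ (choices made only for the illustration). The commutator equals $i\hbar$ on all basis states except the last one: a finite-dimensional commutator has vanishing trace, so the relation $[x, p] = i\hbar 1$ can only hold exactly on an infinite-dimensional Hilbert space.

```python
import math

HBAR = 1.0   # units with hbar = 1 (assumption for the illustration)
N = 6        # truncation: keep oscillator levels 0 .. N-1

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Lowering operator a|n> = sqrt(n)|n-1>, truncated to an N x N matrix.
a = [[0j] * N for _ in range(N)]
for n in range(1, N):
    a[n - 1][n] = complex(math.sqrt(n))
a_dag = [[a[j][i].conjugate() for j in range(N)] for i in range(N)]

# Position and momentum built from the ladder operators.
s = math.sqrt(HBAR / 2.0)
x = [[s * (a[i][j] + a_dag[i][j]) for j in range(N)] for i in range(N)]
p = [[1j * s * (a_dag[i][j] - a[i][j]) for j in range(N)] for i in range(N)]

xp, px = matmul(x, p), matmul(p, x)
commutator = [[xp[i][j] - px[i][j] for j in range(N)] for i in range(N)]
diagonal = [commutator[i][i] for i in range(N)]

print(diagonal)   # i*hbar on the first N-1 entries, -i*hbar*(N-1) on the last
```

Increasing `N` pushes the defective entry to ever higher levels, which is the matrix counterpart of passing to the full algebra $\mathcal{A}$.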

Of course, there is no space anymore of which $\mathcal{A}$ is
the algebra of functions. Nevertheless, we talk about
such a “quantum phase space” as a space that has no
points or a space with an uncertainty relation. Indeed,
the noncommutation relation implies Heisenberg’s
uncertainty relation $\Delta x\, \Delta p \ge \hbar/2$ and tells us that
points in phase space lose all meaning; we can only
resolve cells in phase space of volume $\hbar/2$, see Figure 1.
To define the uncertainty $\Delta a$ for an observable $a \in \mathcal{A}$,
we need a faithful representation of the algebra on a
Hilbert space, that is, an injective homomorphism $\rho$
from $\mathcal{A}$ into the algebra of operators on $\mathcal{H}$. For the
harmonic oscillator, this Hilbert space is $\mathcal{H} = L^2(\mathbb{R})$.
Its elements are the wave functions $\psi(x)$, square-integrable
functions on configuration space. Finally,
the dynamics is defined by the Hamiltonian, a self-adjoint
observable $H = H^* \in \mathcal{A}$, via Schrödinger's
equation $(i\hbar\, \partial/\partial t - \rho(H))\psi(t, x) = 0$. Here time is an
external parameter; in particular, time is not an
observable. This is different in the special-relativistic
setting, where Schrödinger's equation is replaced by
Dirac's equation $\partial\!\!\!/\psi = 0$. Now the wave function $\psi$ is


Figure 1 The first example of noncommutative geometry.


the four-component spinor consisting of left- and right- 
handed, particle and antiparticle wave functions. 
Unlike the Hamiltonian, the Dirac operator does not 
lie in $\mathcal{A}$, but it is still an operator on $\mathcal{H}$. In Euclidean
spacetime, the Dirac operator is also self-adjoint,


$\partial\!\!\!/^* = \partial\!\!\!/$.

Spectral Triples 


Noncommutative geometry (Connes 1994, 1995)
does to a compact Riemannian spin manifold M
what quantum mechanics does to phase space. A
noncommutative geometry is defined by the three
purely algebraic items $(\mathcal{A}, \mathcal{H}, \partial\!\!\!/)$, called a spectral
triple. $\mathcal{A}$ is a real, associative, and possibly noncommutative
involution algebra with unit, faithfully
represented on a complex Hilbert space $\mathcal{H}$, and $\partial\!\!\!/$ is
a self-adjoint operator on $\mathcal{H}$. Like the spectral triple
itself, the axioms linking its three items are motivated
by relativistic quantum mechanics.

When $\mathcal{A} = C^\infty(M)$, the functions on a Riemannian
spin manifold M, represented on spinors $\psi$, and $\partial\!\!\!/$ is
the gravitational Dirac operator, one has a spectral
triple. The converse is also true when $\mathcal{A}$ is a
suitable commutative algebra (Connes 1996), but
the axioms make sense even when $\mathcal{A}$ is not
commutative. As for quantum phase space, Connes
defines a noncommutative geometry by a spectral 
triple whose algebra is allowed to be noncommu- 
tative and he shows how important properties like 
dimensions, distances, differentiation, integration, 
general coordinate transformations, and direct 
products generalize to the noncommutative setting. 
As a bonus, the algebraic axioms of a spectral 
triple, commutative or not, include discrete, that is, 
zero-dimensional spaces that now are naturally 
equipped with a differential calculus. These spaces 
have finite-dimensional algebras and Hilbert 
spaces, meaning that their algebras are just matrix 
algebras. 

An *almost commutative geometry" is defined as a 
direct product of a four-dimensional commutative 
geometry, “ordinary spacetime,” by a zero-dimensional 
noncommutative geometry, the “internal space.” If the 
latter is also commutative, for example, the ordinary
two-point space, then the direct product describes a
two-sheeted universe or a Kaluza-Klein space whose
fifth dimension is discrete (Madore 1995). In general,
the axioms of spectral triples imply that the Dirac 
operator of the internal space is precisely the fermionic 
mass matrix. 

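As a generic example, here is the internal spectral
triple underlying the standard model with one
generation of quarks and leptons. The algebra
$\mathcal{A} = \mathbb{H} \oplus \mathbb{C} \oplus M_3(\mathbb{C}) \ni (a, b, c)$ contains quaternions,
that is, 2 × 2 matrices of the form


$a = \begin{pmatrix} x & -\bar{y} \\ y & \bar{x} \end{pmatrix}$, $x, y \in \mathbb{C}$


complex numbers b and complex 3 × 3 matrices c.
The Hilbert space is 30-dimensional, where we
count particles and antiparticles ($\cdot^c$) separately:
$\mathcal{H} = \mathcal{H}_L \oplus \mathcal{H}_R \oplus \mathcal{H}_L^c \oplus \mathcal{H}_R^c = \mathbb{C}^8 \oplus \mathbb{C}^7 \oplus \mathbb{C}^8 \oplus \mathbb{C}^7$. The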
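representation is block-diagonal, with the four
blocks


$\rho_L(a) := \begin{pmatrix} a \otimes 1_3 & 0 \\ 0 & a \end{pmatrix}$, $\rho_R(b) := \begin{pmatrix} b\, 1_3 & 0 & 0 \\ 0 & \bar{b}\, 1_3 & 0 \\ 0 & 0 & \bar{b} \end{pmatrix}$ [12]

$\rho_L^c(b, c) := \begin{pmatrix} 1_2 \otimes c & 0 \\ 0 & \bar{b}\, 1_2 \end{pmatrix}$, $\rho_R^c(b, c) := \begin{pmatrix} c & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & \bar{b} \end{pmatrix}$ [13]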

The internal Dirac operator (= fermionic mass
matrix) contains two quark masses $m_u$, $m_d$ and one
lepton mass $m_e$, and no mixing:


0 M O0 Q 
M 0 0 0 
D » 
0 0 0 M 
0 0 M 0 
[14] 
P 0 ) 23 0 
Mic 0 mg ; 


These matrices look rather ad hoc; they are not. 
They define an irreducible spectral triple and, for a 
given algebra, there is only a finite number of such 
triples. 


The Spectral Action 


Chamseddine and Connes (1997) generalize general 
relativity to noncommutative spacetimes in two 
strokes, kinematics and dynamics. They explicitly 
compute this generalization for almost commutative 
geometries. 

Kinematics In noncommutative geometry, gen- 
eral coordinate transformations are algebra auto- 
morphisms lifted to the Hilbert space of spinors. 
For almost commutative geometries, these transfor- 
mations are precisely general coordinate trans- 
formations of ordinary spacetime and gauge 
transformations. Now remember how Einstein uses 
the equivalence principle to produce “gravity = 
curvature” starting from the flat metric, which in 
Connes’ language is the ordinary flat Dirac opera- 
tor. When applied to an almost commutative 
geometry (Connes 1996), the equivalence principle 
produces again a curved metric via the ordinary 
coordinate transformations on M, while the gauge 
transformations applied to the fermionic mass 
matrix produce a new field, the Higgs scalar $\varphi$. For
the example above, this field is precisely the isospin 
doublet, color singlet with hypercharge —1/2 of eqn 
[6]. Gauge transformations also apply to the 
ordinary Dirac operator, thereby producing the 
gauge fields A. 

Dynamics The group of generalized coordinate 
transformations allowed us to construct the con- 
figuration space. In the almost commutative case it 
consists of Riemannian metrics, gauge fields, and 
Higgs scalars. We now want a dynamics on this 
configuration space. Of course, we want this 
dynamics to be invariant under the group of 
generalized coordinate transformations. Note that 
the spectrum of the Dirac operator is invariant 
under this group and Chamseddine and Connes 
(1997) define the spectral action as a regularized 
partition function of these eigenvalues. 

On almost commutative geometries, the spectral 
action is equal to the Einstein-Hilbert action plus 
the Yang-Mills-Higgs ansatz (Figure 2). In other 
words, almost commutative geometry explains the 
forces mediated by gauge bosons and Higgs scalars 
as pseudoforces accompanying the gravitational 
force in the same way that Minkowskian geometry 
(i.e., special relativity) explains the magnetic force as 
a pseudoforce accompanying the electric force.

[Diagram: noncommutative geometry, almost commutative geometry, and Riemannian geometry on one side; gravity + Yang-Mills-Higgs ansatz + constraints, and gravity, on the other; arrows labeled Connes and Einstein.]
Figure 2 Deriving the Yang-Mills-Higgs ansatz from gravity.
[Diagram labels: Yang-Mills-Higgs, left-right symmetry, standard model.]
Figure 3 Constraints inside the ansatz.



There are constraints on the discrete and continuous
parameters in the Yang-Mills-Higgs ansatz
deriving from the spectral action (Figure 3).

In particular, if we consider only irreducible spectral 
triples and among them only those which produce 
nondegenerate fermion masses compatible with renor- 
malization, then we only get the standard model with 
one generation of quarks and leptons, with a massless 
neutrino and with an arbitrary number of colors, and a
few submodels thereof. More than one generation and
neutrino masses are possible but imply reducible 
triples. However, in at least one generation, the 
neutrino must remain purely left and massless. 

For the standard model with N generations
and $N_c$ colors, we have the constraints
$g_2^2 = g_3^2 = (9/N)\lambda$ on the continuous parameters. If
we put $N = N_c = 3$ and if we believe in the popular
“big desert”, then these constraints yield a “unification
scale” $\Lambda = 10^{17}$ GeV at which the uncertainty
relation in spacetime should become manifest,
$\Delta\tau = \hbar/\Lambda$, and a Higgs mass of $m_H = 171.6 \pm
5$ GeV for $m_t = 174.3 \pm 5.1$ GeV (see Figure 4).

It is clear that almost commutative geometries 
only scratch the surface of a gold mine. May we 
hope that a genuinely noncommutative geometry 
will solve our present problems with quantum field 
theory and quantum gravity? 


Figure 4 Running coupling constants. 


See also: Compact Groups and Their Representations; 
Dirac Fields in Gravitation and Nonabelian Gauge 
Theory; Effective Field Theories; General Relativity: 
Overview; Hopf Algebras and q-Deformation Quantum 
Groups; Positive Maps on C*-Algebras; Quantum Hall 
Effect; Standard Model of Particle Physics; Symmetries 
and Conservation Laws; Symmetry Breaking in Field 
Theory; von Neumann Algebras: Introduction, Modular 
Theory, and Classification Theory.


Noncommutative Geometry from
String Theory 


The first use of noncommutative geometry in string 
theory appears in the work of Witten on open-string 
field theory where the noncommutativity is asso- 
ciated with the product of open-string fields. 
Noncommutative geometry appears in the recent 
development of string theory in the seminal work of 
Connes, Douglas, and Schwarz where they con- 
structed and identified the compactification of 
Matrix theory on a noncommutative torus. 


Matrix Theory Compactification and 
Noncommutative Geometry 


The matrix theory (M-theory) is an 11-dimensional 
quantum theory of gravity which is believed to 
underlie all superstring theories. Banks, Fischler, 
Shenker, and Susskind proposed that the large-N
limit of the supersymmetric matrix quantum
mechanics of N D0-branes should describe
M-theory compactified on a lightlike circle.
Compactification of M-theory on a torus can be
achieved by considering the torus as the quotient
space $\mathbb{R}^d/\mathbb{Z}^d$ with the quotient conditions


$$U_i^{-1} X^j U_i = X^j + \delta_i^j\, 2\pi R_i, \qquad i = 1, \ldots, d$$ [1]


Here $R_i$ are the radii of the torus. The unitary
translation generators $U_i$ generate the torus. They
satisfy $U_i U_j = U_j U_i$. T-dualizing the D0-brane
system, eqn [1] leads to the dual description as a
(d + 1)-dimensional supersymmetric gauge theory
on the dual toroidal D-brane. A noncommutative
torus $T_\theta^d$ is defined by the modified relations


$$U_i U_j = e^{i\theta_{ij}}\, U_j U_i$$ [2]


where $\theta_{ij}$ specify the noncommutativity. Compactification
on a noncommutative torus can be easily
accommodated and leads to noncommutative gauge
theory on the dual D-brane. The parameters $\theta_{ij}$ can
be identified with the components $C_{-ij}$ of the 3-form
potential in M-theory.
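
As a concrete illustration of relation [2], the following minimal sketch checks a finite-dimensional realization by clock and shift matrices. It is an assumption of the sketch, not a statement from the text, that $\theta_{12} = 2\pi/n$ is rational; for irrational $\theta$ no finite matrices satisfy [2]. The function names are hypothetical.

```python
import cmath


def clock_shift(n):
    """Return the n x n clock matrix, shift matrix and omega = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    clock = [[omega ** j if j == k else 0j for k in range(n)] for j in range(n)]
    # shift sends basis vector e_k to e_{k+1 mod n}
    shift = [[1 + 0j if j == (k + 1) % n else 0j for k in range(n)] for j in range(n)]
    return clock, shift, omega


def matmul(a, b):
    """Plain matrix product of two square nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def torus_relation_error(n):
    """Largest entry of U1 U2 - exp(2*pi*i/n) U2 U1 for U1 = clock, U2 = shift."""
    u1, u2, omega = clock_shift(n)
    lhs = matmul(u1, u2)
    rhs = [[omega * x for x in row] for row in matmul(u2, u1)]
    return max(abs(x - y) for ra, rb in zip(lhs, rhs) for x, y in zip(ra, rb))


if __name__ == "__main__":
    print(torus_relation_error(5))
```

The error is zero up to rounding, so the pair (clock, shift) plays the role of $(U_1, U_2)$ with $\theta_{12} = 2\pi/n$.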

Since M-theory compactified on a circle leads to
IIA string theory, the components $C_{-ij}$ correspond to
the Neveu-Schwarz (NS) B-field $B_{ij}$ in IIA string
theory. The physics of the D0-brane system in the
presence of an NS B-field can also be studied from
the viewpoint of IIA string theory. This led Douglas
and Hull to obtain the same result that a noncommutative
field theory lives on the D-brane.
Toroidally compactified IIA string theory has a 
T-duality group SO(d, d; Z). The T-duality symmetry 
gets translated into an equivalence relation between 
gauge theories on the noncommutative torus: a gauge
theory on the noncommutative torus $T_\theta^d$ is equivalent
to that on the noncommutative torus $T_{\theta'}^d$ if their
noncommutativity parameters and metrics are related
by a T-duality transformation. For example,


$$\theta' = (A\theta + B)(C\theta + D)^{-1}, \qquad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in SO(d, d; \mathbb{Z})$$ [3]

It is remarkable that the T-duality acts within the 
field theory level, rather than mixing up the field 
theory modes with the string winding states and 
other stringy excitations. Mathematically, eqn [3] is
precisely the condition for the noncommutative tori
$T_\theta^d$ and $T_{\theta'}^d$ to be Morita equivalent.
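
The action [3] can be tried out on small examples. The sketch below works in d = 2 with exact rational arithmetic. It checks only that the block matrix preserves the split-signature form $J = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ (the determinant and integrality conditions are not tested), and the helper names are hypothetical.

```python
from fractions import Fraction


def mat(rows):
    """Build an exact rational matrix from nested lists."""
    return [[Fraction(x) for x in row] for row in rows]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


IDENTITY = mat([[1, 0], [0, 1]])
ZERO = mat([[0, 0], [0, 0]])


def preserves_odd_metric(A, B, C, D):
    """Check g^T J g = J for g = [[A, B], [C, D]] and J = [[0, 1], [1, 0]]."""
    return (add(mul(transpose(A), C), mul(transpose(C), A)) == ZERO
            and add(mul(transpose(B), D), mul(transpose(D), B)) == ZERO
            and add(mul(transpose(A), D), mul(transpose(C), B)) == IDENTITY)


def t_dual_theta(theta, A, B, C, D):
    """theta' = (A theta + B)(C theta + D)^{-1}, as in eqn [3]."""
    return mul(add(mul(A, theta), B), inv2(add(mul(C, theta), D)))
```

Two simple cases are an integer shift ($A = D = 1$, $C = 0$, $B$ antisymmetric), which gives $\theta' = \theta + B$, and the inversion ($A = D = 0$, $B = C = 1$), which gives $\theta' = \theta^{-1}$. In both cases $\theta'$ stays antisymmetric.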


Open-String in B-Field 


It was soon realized that the D-brane does not 
necessarily need to be toroidal in order to be 
noncommutative. A direct canonical quantization 
of the open-string system shows that a constant 
B-field on a D-brane leads to noncommutative 
geometry on the D-brane world volume. Consider 
an open string moving in a flat space with metric $g_{ij}$
and a constant NS B-field. In the presence of a
Dp-brane, the components of the B-field not along the
brane can be gauged away; thus, the B-field can
have effects only in the longitudinal directions along
the brane. The world-sheet (bosonic) action for this
part is


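$$S = \frac{1}{4\pi\alpha'} \int_\Sigma d^2\sigma \left[ g_{ij}\, \partial_a x^i \partial^a x^j - 2\pi\alpha' B_{ij}\, \epsilon^{ab} \partial_a x^i \partial_b x^j \right]$$ [4]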


where $i, j = 0, 1, \ldots, p$ is along the brane. It is easy
to see that the boundary condition $g_{ij}\partial_\sigma x^j +
2\pi\alpha' B_{ij}\partial_\tau x^j = 0$ at $\sigma = 0, \pi$ is not compatible with
the standard canonical quantization $[x^i(\tau, \sigma),
x^j(\tau, \sigma')] = 0$ at the boundary. Taking the boundary
condition as constraints and performing canonical
quantization, one obtains the commutation
relations
$$[a_n^i, a_m^j] = n\, G^{ij}\, \delta_{n+m,0}, \qquad [x_0^i, p_0^j] = i G^{ij}, \qquad [x_0^i, x_0^j] = i\theta^{ij}$$ [5]


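Here, the open-string mode expansion is

$$x^i(\tau, \sigma) = x_0^i + 2\alpha' \left( p_0^i \tau - 2\pi\alpha' (g^{-1}B)^i{}_j\, p_0^j \sigma \right) + \sqrt{2\alpha'} \sum_{n \neq 0} \frac{e^{-in\tau}}{n} \left( i a_n^i \cos n\sigma - 2\pi\alpha' (g^{-1}B)^i{}_j\, a_n^j \sin n\sigma \right)$$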


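$G^{ij}$ and $\theta^{ij}$ are, respectively, the symmetric part and
$2\pi\alpha'$ times the antisymmetric part of the matrix
$(g + 2\pi\alpha' B)^{-1}$:

$$G^{ij} = \left( \frac{1}{g + 2\pi\alpha' B}\, g\, \frac{1}{g - 2\pi\alpha' B} \right)^{ij}, \qquad \theta^{ij} = -(2\pi\alpha')^2 \left( \frac{1}{g + 2\pi\alpha' B}\, B\, \frac{1}{g - 2\pi\alpha' B} \right)^{ij}$$ [6]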


It follows from [5] that the boundary coordinates
$x^i \equiv x^i(\tau, 0)$ obey the commutation relation


$$[x^i, x^j] = i\theta^{ij}$$ [7]


Relation [7] implies that the D-brane world volume,
where the open-string endpoints live, is a noncommutative
manifold. One may also start with the
closed-string Green function and let its arguments
approach the boundary to obtain the open-string
Green function


$$\langle x^i(\tau)\, x^j(\tau') \rangle = -\alpha' G^{ij} \ln(\tau - \tau')^2 + \frac{i}{2}\theta^{ij} \epsilon(\tau - \tau')$$ [8]


where $\epsilon(\tau)$ is the sign of $\tau$. From [8], one can
again extract the commutator [7]. $G_{ij} = g_{ij} - (2\pi\alpha')^2
(B g^{-1} B)_{ij}$ is called the open-string metric since it controls
the short-distance behavior of open strings. In contrast,
the short-distance behavior for closed strings is controlled
by the closed-string metric $g_{ij}$. One may also treat
the boundary B-term in [4] as a perturbation to the
open-string conformal field theory and extract [8]
from the modified operator product
expansion of the open-string vertex operators.
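
The relations between the closed-string data $(g, B)$ and the open-string data $(G, \theta)$ can be checked numerically. The sketch below works in two dimensions and assumes the normalization in which $\theta$ is $2\pi\alpha'$ times the antisymmetric part of $(g + 2\pi\alpha' B)^{-1}$. All function names are hypothetical.

```python
import math


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def open_string_data(g, B, alpha_prime):
    """Return (G^{ij}, theta^{ij}) from the matrix (g + 2 pi alpha' B)^{-1}."""
    b = scale(2.0 * math.pi * alpha_prime, B)
    m = inv2(add(g, b))
    mt = transpose(m)
    G_upper = scale(0.5, add(m, mt))                                # symmetric part
    theta = scale(math.pi * alpha_prime, add(m, scale(-1.0, mt)))   # 2 pi alpha' x antisymmetric part
    return G_upper, theta


def open_string_metric(g, B, alpha_prime):
    """Return G_{ij} = g - (2 pi alpha')^2 B g^{-1} B."""
    b = scale(2.0 * math.pi * alpha_prime, B)
    return add(g, scale(-1.0, mul(mul(b, inv2(g)), b)))
```

Multiplying the upper-index $G^{ij}$ by the lower-index $G_{ij}$ returns the identity. For a very small closed-string metric at fixed $B$, $\theta$ approaches $B^{-1}$.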

D-branes in the Wess-Zumino-Witten model
provide another example of noncommutative geometry.
In this case, the background is not flat since
there is a nonzero $H = dB \sim k^{-1/2}$, where k is the
level. Examining the vertex operator algebra, one
obtains that D-branes are described by nonassociative
deformations of fuzzy spheres with nonassociativity
controlled by 1/k.


String Amplitudes and Effective Action 


The effect of the B-field on the open-string amplitudes
is simple to determine since only the $x_0^i$
commutation relation is affected nontrivially. For
example, the noncommutative gauge theory can be
obtained readily from the tree-level string amplitudes.
For tree and one loop, the vertex operator
formalism can be used. Generally, the vertex
operator can be inserted at either the $\sigma = 0$ or $\sigma = \pi$
boundary, where the string has zero mode parts $x_0^i$
and $y_0^i = x_0^i - (2\pi\alpha')^2 (g^{-1}B)^i{}_j\, p_0^j$, respectively. The
commutation relations are


$$[x_0^i, x_0^j] = i\theta^{ij}, \qquad [y_0^i, y_0^j] = -i\theta^{ij}, \qquad [x_0^i, y_0^j] = 0$$ [9]

The difference in the commutation relations for $x_0$
and $y_0$ implies that the two boundaries of the open
string have opposite noncommutativity. This fact is not
so important for tree-level calculations since one can
always choose to put all the interactions at, for
example, the $\sigma = 0$ boundary. Collecting all these


zero mode parts of the vertex operators, one obtains 
a phase factor 


$$e^{ip^1 x_0}\, e^{ip^2 x_0} \cdots e^{ip^N x_0} = e^{i\sum_a p^a x_0}\, e^{-(i/2)\sum_{a<b} p^a_i \theta^{ij} p^b_j}$$ [10]


where the external momenta $p^a$ are ordered cyclically
on the circle, and momentum conservation has
been used. The computation of the oscillator part of
the amplitude is the same as in the B = 0 case, except
that the metric G is employed in the contractions.
As a result, the effect of the B-field on the tree-level
string amplitude is simply to multiply the amplitude
at B = 0 by the phase factor and to replace the
metric g by the open-string metric G. A generic term
in the tree-level effective action simply becomes


$$\frac{1}{G_s} \int d^{p+1}x\, \sqrt{-\det G}\; \mathrm{tr}\, \partial^{n_1}\Phi_1 * \cdots * \partial^{n_k}\Phi_k$$ [11]


Here the star product, also called the Moyal 
product, is defined by 


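$$(f * g)(x) = \exp\left( \frac{i}{2}\theta^{ij} \frac{\partial}{\partial x_1^i} \frac{\partial}{\partial x_2^j} \right) f(x_1)\, g(x_2) \Big|_{x_1 = x_2 = x}$$ [12]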


The star product is associative and noncommutative,
and satisfies $\overline{f * g} = \bar g * \bar f$ under complex conjugation.
Also, for functions that vanish rapidly enough at
infinity, there holds


$$\int f * g = \int g * f = \int f\, g$$ [13]
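
For polynomials the exponential in [12] terminates, so the star product can be evaluated exactly. The sketch below is a minimal implementation in two dimensions with $\theta^{12} = \theta$. Polynomials are stored as dictionaries mapping exponent pairs to coefficients. The representation and all function names are choices made for this sketch.

```python
from math import comb, factorial


def deriv(poly, n1, n2):
    """Apply d^n1/dx1^n1 d^n2/dx2^n2 to a polynomial {(a, b): coeff}."""
    out = {}
    for (a, b), c in poly.items():
        if a >= n1 and b >= n2:
            factor = (factorial(a) // factorial(a - n1)) * (factorial(b) // factorial(b - n2))
            key = (a - n1, b - n2)
            out[key] = out.get(key, 0) + c * factor
    return out


def pmul(p, q):
    """Ordinary commutative product of two polynomials."""
    out = {}
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            key = (a + d, b + e)
            out[key] = out.get(key, 0) + c * f
    return out


def padd(p, q, weight=1):
    """Return p + weight * q."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + weight * c
    return out


def degree(p):
    return max((a + b for (a, b) in p), default=-1)


def star(f, g, theta):
    """Moyal product with theta^{12} = -theta^{21} = theta, summed to all nonvanishing orders."""
    result = {}
    for n in range(min(degree(f), degree(g)) + 1):
        prefactor = (0.5j * theta) ** n / factorial(n)
        for k in range(n + 1):
            # n-th power of (d1 x d2 - d2 x d1), expanded binomially
            term = pmul(deriv(f, n - k, k), deriv(g, k, n - k))
            result = padd(result, term, prefactor * comb(n, k) * (-1) ** k)
    return result


def close(p, q, tol=1e-9):
    keys = set(p) | set(q)
    return all(abs(p.get(k, 0) - q.get(k, 0)) < tol for k in keys)
```

With `x1 = {(1, 0): 1}` and `x2 = {(0, 1): 1}`, the star commutator `star(x1, x2, theta) - star(x2, x1, theta)` reduces to the constant $i\theta$, in line with [7]. Associativity can be tested on any triple of polynomials.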


An interesting consequence of the nonlocality
as expressed by the noncommutative geometry [7]
is the existence of a dipole excitation whose extent is
proportional to its momentum, $\Delta x = k\theta$. This relation
is at the heart of the “IR/UV mixing phenomenon”
(see below) of noncommutative field theory.

At one- (and higher-) loop level, the different 
noncommutativities for the opposite boundaries of 
the open string become essential and give rise to new 
effects. In this case nonplanar diagrams require one 
to put vertex operators at the two different 
boundaries $\sigma = 0, \pi$. A more complicated phase
factor, which involves internal as well as external 
momentum, results. This leads to IR/UV mixing in 
the noncommutative quantum field theory. The 
different noncommutativity for the opposite bound- 
aries of the open string [9] is the basic reason for the 
IR/UV mixing in the noncommutative quantum field 
theory. The commutation relations [5] are valid at all 
loops; therefore, one can use them to construct the 
higher-loop string amplitudes from first principles. 
The effect of the B-field on the string interaction can 
easily be implemented into the Reggeon vertex and 
the complete higher loop amplitudes in the presence 
of the B-field have been constructed. 


Low-Energy Limit - The Seiberg-Witten Limit 
and the NCOS Limit 


The full open-string system is still quite complicated. 
One may try to decouple the infinite number of 
massive string modes to obtain a low-energy
field-theoretic description by taking the limit $\alpha' \to 0$.
Since open strings are sensitive to G and $\theta$, one
should take the limit such that G and $\theta$ are fixed.
For the magnetic case $B_{0i} = 0$, Seiberg and Witten
showed that this can be achieved with the following 
double scaling limit: 


$$\alpha' \sim \epsilon^{1/2} \to 0, \qquad g_{ij} \sim \epsilon \to 0$$ [14]

with $B_{ij}$ and everything else kept fixed. Assuming B
is of rank r, then [6] becomes


$$G_{ij} = -(2\pi\alpha')^2 (B g^{-1} B)_{ij}, \qquad \theta^{ij} = (B^{-1})^{ij}, \qquad \text{for } i, j = 1, \ldots, r$$ [15]


Otherwise $G_{ij} = g_{ij}$, $\theta^{ij} = 0$. One may also argue that
the closed string decouples in this limit. As a result,
in the low-energy limit a greatly simplified
noncommutative Yang-Mills action $F * F$ is obtained
(see below for more discussion of this field theory).

For the case of a constant electric field background,
say $B_{01} \neq 0$, there is a critical electric field
beyond which the open string becomes unstable and 
the theory does not make sense. Due to the presence 
of this upper bound of the electric field, one can 
show that there is no decoupling limit where one 
can reduce the string theory to a field theory on a 
noncommutative spacetime. However, one can con- 
sider a different scaling limit where one takes the 
closed-string metric scale to infinity appropriately as 
the electric field approaches the critical value. In this 
limit, all closed-string modes decouple. One obtains 
a novel noncritical string theory living on a 
noncommutative spacetime known as the noncom- 
mutative open string (NCOS). 


Noncommutative Quantum Field Theory 


Field theories on noncommutative spacetime are 
defined by using the star product instead of the 
ordinary product of the fields. To illustrate the 
general ideas, let us consider a single real scalar field 
theory with the action 


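$$S = \int d^Dx \left[ \frac{1}{2}\partial_\mu\phi * \partial^\mu\phi - \frac{m^2}{2}\phi * \phi - V(\phi) \right], \qquad V(\phi) = \frac{g}{4!}\,\phi^{*4}$$ [16]
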
Due to the property [13], free noncommutative field 
theory is the same as an ordinary field theory. 
Treating the interaction term as a perturbation, one 
can perform the usual quantization and obtain the 
Feynman rules: the propagator is unchanged and the 
interaction vertex in the momentum space is given 
by g times the phase factor 


$$\exp\left( -\frac{i}{2} \sum_{a<b} p^a \times p^b \right)$$ [17]


Here $p \times q = p_\mu \theta^{\mu\nu} q_\nu$. The theory is nonlocal due to
the infinite order of derivatives that appear in the
interaction.
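
The symmetry properties of the vertex phase [17] are easy to probe numerically. The following sketch assumes two noncommuting directions with $\theta^{12} = \theta$ and momenta that sum to zero. The function names are hypothetical.

```python
import cmath


def cross(p, q, theta):
    """p x q = p_mu theta^{mu nu} q_nu in two dimensions with theta^{12} = theta."""
    return theta * (p[0] * q[1] - p[1] * q[0])


def vertex_phase(momenta, theta):
    """exp(-(i/2) sum_{a<b} p^a x p^b) for an ordered list of momenta."""
    total = 0.0
    n = len(momenta)
    for a in range(n):
        for b in range(a + 1, n):
            total += cross(momenta[a], momenta[b], theta)
    return cmath.exp(-0.5j * total)


if __name__ == "__main__":
    legs = [(1.0, 0.0), (0.0, 2.0), (3.0, 1.0), (-4.0, -3.0)]  # sums to zero
    print(vertex_phase(legs, 0.5))
    print(vertex_phase(legs[1:] + legs[:1], 0.5))  # cyclic rotation
```

A cyclic rotation of the legs leaves the phase unchanged once momentum conservation is imposed, whereas exchanging two neighbouring legs generally changes it.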
Planar and Nonplanar Diagrams 


The factor [17] is cyclically symmetric but not
permutation symmetric. This is analogous to the
situation of a matrix field theory. Using the same
double-line notation as introduced by 't Hooft, one
can similarly classify the Feynman diagrams of
noncommutative field theory according to their
genus. In particular, the total phase factor of a
planar diagram behaves quite differently from that
of a nonplanar diagram. It is easy to show that a
planar diagram will have the phase factor


$$V_p = \exp\left( -\frac{i}{2} \sum_{1 \le a < b \le n} p^a \times p^b \right)$$ [18]


where $p^1, \ldots, p^n$ are the (cyclically ordered) external
momenta of the graph. Note that the phase factor
[18] is independent of the internal momenta. This is
not the case for a nonplanar diagram. One can easily
show that a nonplanar diagram carries an additional
phase factor


$$V_{np} = V_p \exp\left( -\frac{i}{2} \sum_{1 \le a < b \le n} C_{ab}\, p^a \times p^b \right)$$ [19]


where $C_{ab}$ is the signed intersection matrix of the
graph, whose ab matrix element counts the number
of times the ath (internal or external) line crosses the
bth line. The matrix $C_{ab}$ is not uniquely determined
by the diagram as different ways of drawing the 
graph could lead to different intersections. However, 
the phase factor [19] is unique due to momentum 
conservation. 

The different behaviors of the planar and non- 
planar phase factors have important consequences. 


1. Since the phase factor [18] is independent of the 
internal momenta, the divergences and renorma- 
lizability of the planar diagrams will be (simply) 
the same as in the commutative theory and can 
be handled with standard renormalization tech- 
niques. This is sharply different for the nonplanar 
diagrams. In fact, due to the extra oscillatory 
internal-momenta-dependent phase factor, one 
can expect the nonplanar diagrams to have an 
improved ultraviolet (UV) behavior. It turns out 
that planar and nonplanar diagrams also differ 
sharply in their infrared (IR) behavior due to the 
“IR/UV mixing effect” (see below).

2. Moreover, at high energies one can expect that 
noncommutative field theory will generically 
become planar since the nonplanar diagrams will 
be suppressed due to the oscillatory phase factor. 

3. In the limit $\theta \to \infty$, the nonplanar sector will be
totally suppressed since the rapidly oscillating
phase factor will cause the nonplanar diagram to
vanish upon integrating out the momenta. Thus,
generically the large-$\theta$ limit is analogous to the
large-N limit where only the planar diagrams
contribute. However, these expectations do not
apply for noncommutative gauge theory since
one needs to include “open Wilson lines” (see
below) in the construction of gauge-invariant
observables, and the open Wilson line grows in
extent with energy and $\theta$.


IR/UV Mixing 


Due to the nonlocal nature of noncommutative field 
theory, there is generally a mixing of the UV and 
IR scales. The reason is roughly the following. 
Nonplanar diagrams generally have phase factors
like $\exp(ik\theta p)$ with k a loop momentum, p an
external momentum. Consider a nonplanar diagram
which is UV divergent when $\theta = 0$; one can expect
that for very high loop momenta the phase factor
will oscillate rapidly and render the integral finite.
However, this is only valid for a nonvanishing
external momentum $\theta p$; the infinity will come back
as $\theta p \to 0$, but this time it appears as an IR
singularity. Thus, an IR divergence arises whose 
origin is from the UV region of the momentum 
integration and this is known as the IR/UV mixing 
phenomenon. 

To be more specific, consider the $\phi^4$ scalar theory
in D = 4 dimensions. The one-loop self-energy has a
nonplanar contribution given by


r =£ [aut 
" 6Qz)J k +m 3(4n2)° 
x (Meg + °°*) [20] 


where $\Lambda_{\rm eff}^2 = (1/\Lambda^2 + (\theta p)^2)^{-1}$. One can see clearly
the IR/UV mixing: $\Gamma_{np}$ is UV finite as long as $\theta p \neq 0$;
when $\theta p \to 0$, the quadratic UV divergence is
recovered, $\Gamma_{np} \sim \Lambda^2$. For supersymmetric theory,
one has at most logarithmic IR singularities from
IR/UV mixing.
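
The two regimes can be read off from the effective cutoff alone. The following sketch assumes the form $\Lambda_{\rm eff}^2 = (1/\Lambda^2 + (\theta p)^2)^{-1}$ and treats $\theta p$ as a single positive number. The function name is hypothetical.

```python
def lambda_eff_squared(cutoff, theta_p):
    """Effective cutoff squared, 1 / (1/cutoff^2 + theta_p^2)."""
    return 1.0 / (1.0 / cutoff ** 2 + theta_p ** 2)


if __name__ == "__main__":
    # fixed theta_p, cutoff removed: the result saturates at 1/theta_p^2
    for cutoff in (1e2, 1e4, 1e8):
        print(cutoff, lambda_eff_squared(cutoff, 0.1))
    # fixed cutoff, theta_p -> 0: the result returns to cutoff^2
    for theta_p in (1e-1, 1e-3, 0.0):
        print(theta_p, lambda_eff_squared(10.0, theta_p))
```

At fixed $\theta p \neq 0$ the value stays finite as the cutoff is removed and equals $1/(\theta p)^2$, which grows without bound as $\theta p \to 0$. At $\theta p = 0$ the ordinary quadratic dependence on the cutoff returns.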

IR/UV mixing has a number of interesting 
consequences. 


1. Due to the IR/UV mixing, noncommutative
theory does not appear to have a consistent
Wilsonian description, since such a description
requires that correlation functions computed at
finite $\Lambda$ differ from their limiting values by terms
of order $1/\Lambda$ for all values of momenta. This is not
true for a theory with IR/UV mixing. For example,
the two-point function [20] at finite value of $\Lambda$
differs from its value at $\Lambda \to \infty$ by the amount


ap 5 -T 8 mel 0p)”, for the range of momenta 
$(\theta p)^2 \ll 1/\Lambda^2$. It has been argued that the IR
singularity may be associated with missing light 
degrees of freedom in the theory. With new 
degrees of freedom appropriately added, one may 
recover a conventional Wilsonian description. 
Moreover, it has been suggested to identify 
these degrees of freedom with the closed-string 
modes. However, the precise nature and origin of 
these degrees of freedom is not known. 

2. The renormalization of the planar diagrams is 
straightforward; however, the situation is more 
subtle for the nonplanar diagrams since the IR/ 
UV-mixed IR singularities may mix with other 
divergences at higher loops and render the proof 
of renormalizability much more difficult. IR/UV
mixing renders certain large-N noncommutative
field theories nonrenormalizable. However, for
theories with a fixed set of degrees of freedom 
to start with, it is believed that one can have 
sufficiently good control of the IR divergences 
and prove renormalizability. An example of 
renormalizable noncommutative quantum field 
theory is the noncommutative Wess-Zumino 
model where IR/UV mixing is absent. However, 
a general proof is still lacking. 

3. One can show that IR/UV mixing in timelike
noncommutative theory ($\theta^{0i} \neq 0$) leads to breakdown
of perturbative unitarity. For a theory
without IR/UV mixing, unitarity will be respected
even if the theory has a timelike noncommutativity.
Theory with lightlike noncommutativity is
unitary.


Noncommutative Gauge Theory 


Gauge theory on noncommutative space is defined 
by the action 


$$S = -\frac{1}{4g^2} \int d^Dx\; \mathrm{tr}\left( F_{ij}(x) * F^{ij}(x) \right)$$ [21]


where the gauge fields $A_i$ are N x N Hermitian
matrices, $F_{ij}$ is the noncommutative field strength
$F_{ij} = \partial_i A_j - \partial_j A_i - i[A_i, A_j]_*$, and tr is the ordinary
trace over N x N matrices. The theory is invariant
under the star-gauge transformation


$$A_i \to g * A_i * g^\dagger - i g * \partial_i g^\dagger$$ [22]


where the N x N matrix function g(x) is unitary
with respect to the star product, $g * g^\dagger = g^\dagger * g = I$.
The solution is $g = e_*^{i\lambda}$, where $\lambda$ is Hermitian. In
infinitesimal form, $\delta_\lambda A_i = \partial_i\lambda + i[\lambda, A_i]_*$. The
noncommutative gauge theory has $N^2$ Hermitian gauge
fields. Because of the star product, the U(1) sector of
the theory is not free and does not decouple from 
the SU(N) factor as in the commutative case. Note 
that this way of defining noncommutative gauge 
theory does not work for other Lie groups since the 
star commutator generally involves the commutator 
as well as the anticommutator of the Lie algebra; 
hence, the expressions above generally involve the 
enveloping algebra of the underlying Lie group. 
With the help of the “Seiberg-Witten map” (see
below), one can construct an enveloping-algebra- 
valued gauge theory which has the same number of 
independent gauge fields and gauge parameters as 
the ordinary Lie-algebra-valued gauge theory. How- 
ever, the quantum properties of these theories are 
much less understood. One may also introduce
certain automorphisms in the noncommutative
U(N) theory to restrict the dependence of the field
configurations on the noncommutative space
coordinates and obtain a notion of noncommutative
theory with orthogonal and symplectic star-gauge
group. However, the theory does not reduce
to the standard gauge theory in the commutative
limit $\theta \to 0$.


Open Wilson Line and Gauge-Invariant 
Observables 


One remarkable feature of noncommutative gauge 
theory is the mixing of noncommutative gauge 
transformations and spacetime translations, as can 
be seen from the following identity: 


$$e^{ikx} * f(x) * e^{-ikx} = f(x + k\theta)$$ [23]


for any function f. This is analogous to the situation 
in general relativity where translations are also 
equivalent to gauge transformations (general coor- 
dinate transformations). Thus, as in general relativ- 
ity, there are no local gauge-invariant observables in 
noncommutative gauge theory. The unification of 
spacetime and gauge fields in noncommutative 
gauge theory can also be seen from the fact 
that derivatives can be realized as commutators, 
$\partial_i f = -i[\theta^{-1}_{ij} x^j, f]$, and get absorbed into the vector
potential in the covariant derivative


$$D_i = \partial_i + iA_i \to -i\theta^{-1}_{ij} x^j + iA_i$$ [24]


Equation [24] clearly demonstrates the unification 
of spacetime and gauge fields. Note that the field
strength takes the form $F_{ij} = i[D_i, D_j] + \theta^{-1}_{ij}$.

The Wilson line operator for a path C running
from $x_1$ to $x_2$ is defined by


$$W(C) = P_* \exp\left( i \int_C A \right)$$ [25]

$P_*$ denotes the path ordering with respect to the star
product, with $A(x_2)$ at the right. It transforms as


$$W(C) \to g(x_1) * W(C) * g(x_2)^\dagger$$ [26]


In commutative gauge theory, the Wilson line
operator for a closed loop (or its Fourier transform)
is gauge invariant. In noncommutative gauge
theory, the closed Wilson loops are no longer
gauge invariant. A noncommutative generalization
of the gauge-invariant Wilson loop operator can
be constructed most readily by deforming the
Fourier transform of the Wilson loop operator. It
turns out that the closed loop has to open in a 
specific way to form an open Wilson line in order 
to be gauge invariant. To see this, let us consider a 
path C connecting points x and x + l. Using [23], it 
is easy to see that the operator 


$$W(k) \equiv \int dx\; \mathrm{tr}\, W(C) * e^{ikx}, \qquad \text{with } l^i = k_j \theta^{ji}$$ [27]

is gauge invariant. Just like Wilson loops in ordinary
gauge theory, these operators also constitute an
overcomplete set of gauge-invariant operators
parametrized by the set of curves C. When $\theta = 0$, C
becomes a closed loop and we reobtain the (Fourier
transformed) usual closed Wilson loop in commutative
gauge theory. A noncommutative version of the
loop equation for the closed Wilson loop has been
constructed and involves the open Wilson line. The
of gauge-invariant observables. An important appli- 
cation is in the construction of various couplings of 
the noncommutative D-brane to the bulk super- 
gravity fields. The equivalence of the commutative 
and noncommutative couplings to the RR fields 
leads to the exact expression for the Seiberg-Witten 
map. It is remarkable that the one-loop nonplanar 
effective action for noncommutative scalar theory, 
gauge theory, as well as the two-loop effective 
action for scalar can be written compactly in terms 
of open Wilson line. Based on this result, the 
physical origin of the IR/UV mixing has been 
elucidated. One may identify the open Wilson line
with the dipole excitation generically present in
noncommutative field theory and hence explain the
presence of the IR/UV mixing. IR/UV mixing may 
also be identified with the instability associated with 
the closed-string exchange of the noncommutative 
D-branes. 


The Seiberg-Witten Map 


The open string is coupled to the 1-form $A_i$ living on
the D-brane through the coupling $\int_{\partial\Sigma} A$. For slowly
varying fields, the effective action for this gauge


potential can be determined from the S-matrix and 
is given by the Dirac-Born-Infeld (DBI) action. In 
the presence of a B-field, the discussion above (see 
eqn [11]) leads to the noncommutative DBI 
Lagrangian 


$$L_{\rm NCDBI}(\hat F) = G_s^{-1} \mu_p \sqrt{-\det(G + 2\pi\alpha' \hat F)}$$ [28]

where $\mu_p = (2\pi)^{-p} (\alpha')^{-(p+1)/2}$ is the D-brane tension
and $\hat F$ is the noncommutative field strength.
However, one may also exploit the tensor gauge
invariance on the D-brane (i.e., the string sigma
model is invariant under $A \to A - \Lambda$, $B \to B + d\Lambda$)
and consider the combination F + B as a whole. In
this case, it is like having the open string coupled
to the boundary gauge field strength F + B and
there is no B field. One has the usual DBI
Lagrangian


$$L_{\rm DBI}(F) = g_s^{-1} \mu_p \sqrt{-\det(g + 2\pi\alpha' (F + B))}$$ [29]


In [28] and [29], $G_s$ and $g_s$ are the effective open-string
couplings in the noncommutative and commutative
descriptions. Although they look quite
different, Seiberg and Witten showed that the
commutative and noncommutative DBI actions
are indeed equivalent if the open-string couplings
are related by $g_s = G_s \sqrt{\det(g + 2\pi\alpha' B)/\det G}$ and
there is a field redefinition that relates the
commutative and noncommutative gauge fields.
The map $\hat A = \hat A(A)$ is called the Seiberg-Witten
map. Moreover, the noncommutative gauge symmetry
is equivalent to the ordinary gauge symmetry
in the sense that they have the same set of
orbits under gauge transformation:


$$\hat A(A) + \hat\delta_{\hat\lambda} \hat A(A) = \hat A(A + \delta_\lambda A)$$ [30]

Here $\hat A_i$ and $\hat\lambda$ are, respectively, the noncommutative
gauge field and noncommutative gauge
transformation parameter, and $A_i$ and $\lambda$ are,
respectively, the ordinary gauge field and ordinary
transformation parameter. Equation
[30] can be solved only if the transformation
parameter $\hat\lambda = \hat\lambda(\lambda, A)$ is field dependent. The
Seiberg-Witten map is characterized by the
Seiberg-Witten differential equation


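$$\delta \hat A_i(\theta) = -\frac{1}{4}\, \delta\theta^{kl} \left[ \hat A_k * (\partial_l \hat A_i + \hat F_{li}) + (\partial_l \hat A_i + \hat F_{li}) * \hat A_k \right]$$ [31]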


An exact solution for the Seiberg-Witten map can 
be written down with the help of the open Wilson 


line. For the case of U(1) with constant F, we have
the exact solution $\hat F = (1 + F\theta)^{-1} F$.
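
The constant-field solution can be checked against its own variation with $\theta$. Differentiating the closed form gives $\delta\hat F = -\hat F\, \delta\theta\, \hat F$, which is elementary matrix calculus rather than a result quoted from the text. The sketch below tests this by finite differences for 2 x 2 antisymmetric matrices. The function names are hypothetical.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def sw_field_strength(F, theta):
    """F_hat = (1 + F theta)^{-1} F for constant F."""
    return mul(inv2(add(IDENTITY, mul(F, theta))), F)


def flow_residual(F, theta, dtheta, h=1e-6):
    """Largest entry of the difference between dF_hat/dt along theta + t*dtheta and -F_hat dtheta F_hat."""
    plus = sw_field_strength(F, add(theta, scale(h, dtheta)))
    minus = sw_field_strength(F, add(theta, scale(-h, dtheta)))
    numeric = scale(1.0 / (2.0 * h), add(plus, scale(-1.0, minus)))
    f_hat = sw_field_strength(F, theta)
    expected = scale(-1.0, mul(mul(f_hat, dtheta), f_hat))
    return max(abs(x - y) for rn, re in zip(numeric, expected) for x, y in zip(rn, re))
```

For $F_{12} = 0.4$ and $\theta^{12} = 0.5$ one finds $\hat F_{12} = 0.4/(1 - 0.2) = 0.5$, and the residual of the flow equation vanishes to the accuracy of the finite difference.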

That there is a field redefinition that allows one to 
write the effective action in terms of different fields 
with different gauge symmetries may seem puzzling 
at first sight. However, it has a clear physical origin 
in terms of the string world sheet. In fact, there are 
different possible schemes to regularize the short- 
distance divergence on the world sheet. One can 
show that the Pauli-Villars regularization gives the 
commutative description, while the point-splitting 
regularization gives the noncommutative descrip- 
tion. Since theories defined by different regulariza- 
tion schemes are related by a coupling-constant 
redefinition, this implies that the commutative and 
noncommutative descriptions are related by a field 
redefinition, because the couplings on the world 
sheet are just the spacetime fields. 

Despite this formal equivalence, the physics of the 
noncommutative theories is generally quite different 
from the commutative case. First, it is clear that 
generally the Seiberg-Witten map may take nonsingular
configurations to singular configurations.
Second, the observables one is interested in are also 
generally different. Moreover, the two descriptions 
are generally good for different regimes: the con- 
ventional gauge theory description is simpler for 
small B and the noncommutative description is 
simpler for large B. 


Perturbative Gauge Theory Dynamics 


The noncommutative gauge symmetry [22] can be
fixed as usual by employing the Faddeev-Popov 
procedure, resulting in Feynman rules that are 
similar to the conventional gauge theory. The 
important difference is that now the structure 
constants in the phase factors [18] and [19] should 
be amended. It turns out that the nonplanar U(N) 
diagrams contribute (only) to the U(1) part of the 
theory. As a result, unlike the commutative case, the 
U(1) part of the theory is no longer decoupled and 
free. Noncommutative gauge theory is one-loop 
renormalizable. The $\beta$-function is determined solely
by the planar diagrams and, at one loop, is given by


$$\beta(g) = -\frac{22}{3} \frac{N g^3}{16\pi^2}, \qquad \text{for } N \ge 1$$ [32]


Note that the $\beta$-function is independent of $\theta$; the
noncommutative U(1) is asymptotically free and
does not reduce to the commutative theory when
$\theta \to 0$. Noncommutative theory beyond the tree
level is generally not smooth in the limit $\theta \to 0$.
Discontinuity of this kind was also noted for the
Chern-Simons system.

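
Asymptotic freedom can be made explicit by integrating the one-loop flow. The sketch assumes $\beta(g) = dg/d\ln\mu = -b\, g^3$ with $b = 22N/(48\pi^2)$, which integrates to $1/g^2(\mu) = 1/g^2(\mu_0) + 2b \ln(\mu/\mu_0)$. The function names and the reference scale are hypothetical.

```python
import math


def beta_one_loop(g, n_colors):
    """One-loop beta function -(22/3) N g^3 / (16 pi^2)."""
    return -(22.0 / 3.0) * n_colors * g ** 3 / (16.0 * math.pi ** 2)


def one_loop_coupling(g0, mu0, mu, n_colors):
    """Coupling at scale mu, given g0 at scale mu0, from the integrated one-loop flow."""
    b = 22.0 * n_colors / (3.0 * 16.0 * math.pi ** 2)
    inv_g2 = 1.0 / g0 ** 2 + 2.0 * b * math.log(mu / mu0)
    if inv_g2 <= 0.0:
        raise ValueError("scale lies below the one-loop strong-coupling pole")
    return 1.0 / math.sqrt(inv_g2)


if __name__ == "__main__":
    for mu in (1.0, 10.0, 100.0, 1000.0):
        print(mu, one_loop_coupling(1.0, 1.0, mu, 1))
```

Even for N = 1 the coupling decreases monotonically with the scale, unlike the commutative U(1) theory, which has no self-interaction.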
Gauge anomalies can be similarly discussed and
satisfy the noncommutative generalizations of the
Wess-Zumino consistency conditions. In d = 2n
dimensions, the anomaly involves the combination
$\mathrm{tr}(T^{a_1} T^{a_2} \cdots T^{a_{n+1}})$ rather than the usual
symmetrized trace, since the phase factor is not permutation
symmetric. As a result, the usual cancellation of the 
anomaly does not work and is the main obstacle to 
the construction of noncommutative chiral gauge 
theory. 

There are a number of interesting features to 
mention for the IR/UV mixing in noncommutative 
gauge theory. 


1. IR/UV mixing generically yields pole-like IR 
singularities. Despite the appearance of IR 
poles, gauge invariance of the theory is not 
endangered. 

2. One can show that only the U(1) sector is 
affected by IR/UV mixing. 

3. As a result of IR/UV mixing, noncommutative 
U(1) photons polarized in the noncommutative 
plane will have different dispersion relations 
from those which are not. Strange as it is, this 
is consistent with gauge invariance. 


Noncommutative Solitons, Instantons 
and D-Branes 


Solitons and instantons play important roles in the 
nonperturbative aspects of field theory. The non- 
locality of the star product gives noncommutative 
field theory a stringy nature. It is remarkable that 
this applies to the nonperturbative sector as well. 
Solitons and instantons in the noncommutative 
gauge theory amazingly reproduce the properties of 
D-branes in the string. 


GMS Solitons 


Derrick’s theorem says that commutative scalar field
theories in two or higher dimensions do not admit
any finite-energy classical solution. This follows
from a simple scaling argument, which will fail
when the theory becomes noncommutative since
noncommutativity introduces a fixed length scale
$\sqrt{\theta}$. Noncommutative solitons in pure scalar theory
can be easily constructed in the limit $\theta \to \infty$. For
example, consider a (2 + 1)-dimensional single scalar
theory with a potential V and noncommutativity
parameter $\theta$. In the limit $\theta \to \infty$, the potential term
dominates and the noncommutative solitons are
determined by the equation

$$\frac{\partial V}{\partial \phi} = 0 \qquad [33]$$

Equation [33] can be easily solved in terms of
projectors. Assuming V has no linear term, the
general soliton (up to unitary equivalence) is

$$\phi = \sum_i \lambda_i P_i \qquad [34]$$

where $\lambda_i$ are the roots of $V'(\lambda) = 0$ and $\{P_i\}$ is a set of
orthogonal projectors. For real scalar field theory,
the sum is restricted to real roots only. These
solutions are known as the Gopakumar-Minwalla-
Strominger (GMS) solitons. A simple example of a
projector is given by $P = |0\rangle\langle 0|$, which corresponds
to a Gaussian profile in the $(x^1, x^2)$ plane with width
$\sqrt{\theta}$. The soliton continues to exist until $\theta$ decreases
below a certain critical value.
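
The projector solution [34] can be checked numerically in a small sketch. It assumes the standard operator picture, in which the star product becomes matrix multiplication on a harmonic-oscillator Hilbert space, here truncated to a finite dimension. The cubic form of V', its roots, and all names below are hypothetical choices made only for illustration.

```python
import numpy as np

def v_prime(phi, roots):
    """Evaluate V'(phi) = prod_i (phi - roots[i]) as a matrix polynomial."""
    identity = np.eye(phi.shape[0], dtype=complex)
    out = identity.copy()
    for lam in roots:
        out = out @ (phi - lam * identity)
    return out

def projector(indices, dim):
    """Projector onto the span of the chosen oscillator basis states."""
    p = np.zeros((dim, dim), dtype=complex)
    for n in indices:
        p[n, n] = 1.0
    return p

dim = 6                      # truncation of the Hilbert space (assumption)
roots = [0.0, 1.0, 2.5]      # hypothetical roots of V'; 0 is a root since V has no linear term
p1 = projector([0], dim)     # P = |0><0|
p2 = projector([1, 2], dim)  # rank-2 projector orthogonal to p1

# Random unitary change of basis: solutions are defined up to unitary equivalence.
rng = np.random.default_rng(0)
u, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))

# phi = sum_i lambda_i P_i solves V'(phi) = 0.
phi = u @ (roots[1] * p1 + roots[2] * p2) @ u.conj().T
residual_solution = np.linalg.norm(v_prime(phi, roots))

# A coefficient that is not a root of V' does not give a solution.
phi_bad = u @ (0.7 * p1) @ u.conj().T
residual_bad = np.linalg.norm(v_prime(phi_bad, roots))

print(residual_solution, residual_bad)
```

The first residual vanishes up to rounding, while the second does not, which is the content of the statement that the coefficients must be roots of V'.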

New solutions can be generated from known ones 
using the so-called solution-generating technique. If 
$\phi$ is a solution of [33], then

$$\phi' = T^{\dagger} \phi\, T \qquad [35]$$

is also a solution provided that $TT^{\dagger} = 1$. In an
infinite-dimensional Hilbert space, T is not necessarily
unitary, that is, $T^{\dagger}T \neq 1$ is possible. In this case, T is said
to be a partial isometry. The new solution $\phi'$ is
different from $\phi$ since they are not related by a
global transformation of basis.
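
A finite-dimensional sketch of the solution-generating technique follows. A genuine partial isometry with $TT^{\dagger} = 1$ and $T^{\dagger}T \neq 1$ needs an infinite-dimensional space, so the example below imitates the shift operator by a rectangular matrix mapping an (n+1)-dimensional space onto an n-dimensional one. This truncation, the cubic V', and its roots are assumptions made for illustration only.

```python
import numpy as np

def v_prime(phi, roots):
    """Evaluate V'(phi) = prod_i (phi - roots[i]) as a matrix polynomial."""
    identity = np.eye(phi.shape[0])
    out = identity.copy()
    for lam in roots:
        out = out @ (phi - lam * identity)
    return out

n = 5
# Truncated shift: T|k+1> = |k>, T|0> = 0, as an n x (n+1) matrix.
T = np.hstack([np.zeros((n, 1)), np.eye(n)])

roots = [0.0, 1.0, 2.5]   # hypothetical roots of V' (0 included: V has no linear term)

# Seed solution phi = lambda |0><0| on the n-dimensional space.
phi = np.zeros((n, n))
phi[0, 0] = roots[1]

# New solution phi' = T^dagger phi T on the (n+1)-dimensional space.
phi_new = T.T @ phi @ T

p0 = np.zeros((n + 1, n + 1))
p0[0, 0] = 1.0            # projector onto the state annihilated by T

print(np.allclose(T @ T.T, np.eye(n)))            # T T^dagger = 1
print(np.allclose(T.T @ T, np.eye(n + 1) - p0))   # T^dagger T = 1 - P
print(np.linalg.norm(v_prime(phi_new, roots)))    # phi' still solves V' = 0
```

Because V' has no constant term, $(T^{\dagger}\phi T)^k = T^{\dagger}\phi^k T$ follows from $TT^{\dagger} = 1$ alone, which is why the new configuration is again a solution.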


Tachyon Condensation and D-Branes 


A beautiful application of the noncommutative 
soliton is in the construction of D-branes as solitons 
of the tachyon field in noncommutative open-string 
theory. For the bosonic string theory, one may
consider a space-filling D25 brane. Integrating
out the massive string modes leads to an
effective action for the tachyon and the massless
gauge field $A_\mu$. It should be remarked that, contrary
to the pure scalar case, noncommutative solitons can 
be constructed exactly for finite $\theta$ in a system with
gauge and scalar fields. Although the detailed form 
of the effective action is unknown, one has enough 
confidence to say what the true vacuum configura- 
tion is according to the Sen conjecture. One can then 
apply the solution-generating technique to generate 
new soliton solutions. In this manner, with a B-field 
of rank 2k, one can construct solutions which are 
localized in $\mathbb{R}^{2k}$ and represent a D(25 − 2k) brane.
This is supported by the matching of the tension 
and the spectrum of fluctuations around the 
soliton configuration. Similar ideas can also be 
applied to construct D-branes in type II string 
theory. Again the starting point is an unstable 
brane configuration with tachyon field(s). There 
are two types of unstable D-branes: non-BPS Dp
branes (p odd in IIA theory and p even in IIB
theory) and BPS brane-antibrane Dp-$\overline{\mathrm{D}p}$
systems. A similar analysis allows one to identify
the noncommutative soliton with the lower-
dimensional BPS D-branes which arise from
tachyon condensation.

One main motivation for studying tachyon 
condensation in open-string theory is the hope 
that open-string theory may provide a fundamental 
nonperturbative formulation of string theory. It 
may not be too surprising that D-branes can be 
obtained in terms of open-string fields. However, 
to describe closed strings and NS branes in terms 
of open-string degrees of freedom remains an 
obstacle. 


Noncommutative Instantons and Monopoles


Instantons on noncommutative $\mathbb{R}^4_\theta$ can be readily
constructed using the Atiyah-Drinfeld-Hitchin- 
Manin (ADHM) formalism by modifying the 
ADHM constraints with a constant additive 
term. The result is that the self-dual (resp. anti-self-dual)
instanton moduli space depends only on
the anti-self-dual (resp. self-dual) part of $\theta$. The
construction goes through even in the U(1) case.
Consider a self-dual $\theta$; the ADHM constraints for
the self-dual instanton are the same as in the 
commutative case, and there is no nonsingular 
solution. On the other hand, the ADHM con- 
straints for the anti-self-dual instanton get mod- 
ified and admit nontrivial solutions. This 
noncommutative instanton solution is nonsingular 
with size $\sqrt{\theta}$. The noncommutative instanton
represents a D(p−4) brane within a Dp brane.
The ADHM constraints are just the D-flatness 
condition for the D-brane world-volume gauge 
theory. The additive constant to the ADHM 
constraints also has a simple interpretation as a 
Fayet-Iliopoulos parameter which appears in the
presence of a B-field. Although the ADHM 
method does not give a self-dual instanton, a 
direct construction can be applied to obtain non- 
ADHM self-dual instantons. Recall that the gauge 
field strength can be written as $F_{ij} = i[D_i, D_j] + \theta^{-1}_{ij}$,
where $D_i$ is given by the function on the right-
hand side of [24]. Thus, a simple self-dual solution
can be constructed with

$$D_i = i\,\theta^{-1}_{ij}\, T^{\dagger} x^{j}\, T \qquad [36]$$

where T is a partial isometry which satisfies
$TT^{\dagger} = 1$, but $T^{\dagger}T = 1 - P$ is not necessarily the
identity. It is clear that P is a projector. The field
strength

$$F_{ij} = \theta^{-1}_{ij}\, P \qquad [37]$$

is self-dual and has instanton number n where n is
the rank of the projector.

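On noncommutative $\mathbb{R}^3$ (say $\theta^{12} = \theta$), BPS monopoles
satisfy the Bogomolny equation:

$$\nabla_i \Phi = \pm B_i, \quad i = 1, 2, 3 \qquad [38]$$

and can be obtained by solving the Nahm
equation

$$\partial_z T_i = \epsilon_{ijk}\, T_j T_k + \delta_{i3}\,\theta \qquad [39]$$

$T_i$ are k × k Hermitian matrices depending on an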
auxiliary variable z and k gives the charge of the 
monopole. Noncommutativity modifies the Nahm 
equation with a constant term, which can be 
absorbed by a constant shift of the generators. 
Therefore, unlike the case of instanton, the mono- 
pole moduli space is not modified by noncommuta- 
tive deformation. The Nahm construction has a 
clear physical meaning in string theory. The mono- 
pole (electric charge) can be interpreted as a D-string 
(fundamental string) ending on a D3 brane. One can 
also suspend k D-strings between a collection of N
parallel D3 branes; this would correspond to a
charge k monopole in a Higgsed U(N) gauge theory.
The matrices $X^i$ correspond to the matrix transverse
coordinates of the D-strings which lie within the D3
branes.


Further Topics 


Finally, in the following some further topics of 
interest are discussed briefly. 


1. The noncommutative geometry discussed here is 
of canonical type. Other deformations exist, for 
example, kappa-deformation and fuzzy sphere 
which are of the Lie-algebra type, and quantum 
group deformation which is a quadratic-type 
deformation: $x^i x^j = q^{-1} \hat{R}^{ij}_{kl}\, x^k x^l$, whose consistency
is guaranteed by the Yang-Baxter equation.
It is interesting to see whether these noncommu- 
tative geometries arise from string theory. 
Another natural generalization is to consider 
noncommutative geometry of superspace. A 
simple example is to consider the fermionic 
coordinates to be deformed with the nonvanish- 
ing relation 


$$\{\theta^{\alpha}, \theta^{\beta}\} = C^{\alpha\beta} \qquad [40]$$

where $C^{\alpha\beta}$ are constants. It has been shown that
[40] arises in certain Calabi-Yau compactification
of type IIB string theory in the presence of RR
background. The deformation [40] reduces the
number of supersymmetries by half. Therefore,
it is called $\mathcal{N} = 1/2$ supersymmetry. The
noncommutativity [40] can be implemented on
the superspace $(y^i, \theta^{\alpha}, \bar\theta^{\dot\alpha})$ as a star product for the
$\theta$s. Unlike the bosonic deformation which
involves an infinite number of higher derivatives,
the star product for [40] stops at order $C^2$ due to
the Grassmannian nature of the fermionic coordinates.
Field theory with $\mathcal{N} = 1/2$ supersymmetry
is local and differs from the ordinary $\mathcal{N} = 1$ theory
by only a small number of supersymmetry breaking
terms. The $\mathcal{N} = 1/2$ Wess-Zumino model is
renormalizable if extra $F$ and $F^2$ terms are added
to the original Lagrangian, where F is the auxiliary
field. The $\mathcal{N} = 1/2$ gauge theory is also
renormalizable.


2. Integrability of a theory provides valuable information
beyond the perturbative level. An integrable
field theory is characterized by an infinite
number of conserved charges in involution. It is 
natural to ask whether integrability is preserved 
by noncommutative deformation. Noncommuta- 
tive integrable field theories have been con- 
structed. In the commutative case, Ward has 
conjectured that all (1+ 1)- and (2 + 1)-dimen- 
sional integrable systems can be obtained from 
the four-dimensional self-dual Yang-Mills equa- 
tion by reduction. Validity of the noncommuta- 
tive version of the Ward conjecture has been 
confirmed so far. It will be interesting to see 
whether it is true in general. 


3. Locality and Lorentz symmetry form the cornerstones
of quantum field theory and the standard
model of particle physics. Noncommutative field
theory provides a theoretical framework where 
one can discuss effects of nonlocality and Lorentz 
symmetry violation. Possible phenomenological 
signals have been investigated (mostly at the tree 
level) and a bound has been placed on the extent 
of noncommutativity. A proper understanding 
and better control of the IR/UV mixing remains 
the crux of the problem. Noncommutative 
geometry may also be relevant for cosmology 
and inflation. 


4. Like the standard AdS/CFT correspondence, the
noncommutative gauge theory should also have
a gravity-dual description. The supergravity 
background can be determined by considering 
the decoupling limit of D-branes with an NS 
B-field background. However, since the non- 
commutative gauge theory does not permit any 
conventional local gauge-invariant observable, 
the usual AdS/CFT correspondence that relates 
field theory correlators with bulk interaction 
does not seem to apply. It has been argued that 
generic properties such as the relation between 
length and momentum for open Wilson lines 
can be seen from the gravity side. A more precise
understanding of the duality map is called for. 


See also: Brane Construction of Gauge Theories; 
Deformation Quantization; Gauge Theories from Strings; 
Noncommutative Tori, Yang-Mills, and String Theory; 
Positive Maps on C*-Algebras; Solitons and Other
Extended Field Configurations; String Field Theory; 
Superstring Theories. 
Introduction


Noncommutative tori are historically among the 
oldest and by now the most developed examples 
of noncommutative spaces. Noncommutative 
Yang-Mills theory can be obtained from string 
theory. This connection led to a cross-fertilization 
of research in physics and mathematics on Yang- 
Mills theory on noncommutative tori. One 
important result stemming from that work is the 
link between T-duality in string theory and 
Morita equivalence of associative algebras. In 
this article, we give an overview of the basic 
results in the differential geometry of noncommu- 
tative tori. Yang-Mills theory on noncommuta- 
tive tori, the duality induced by Morita 
equivalence and its link with T-duality are 
discussed. The noncommutative Nahm transform 
for instantons is introduced. 


Noncommutative Tori 
The Algebra of Functions 


The basic notions of noncommutative differential 
geometry were introduced and illustrated on the 
example of a two-dimensional noncommutative 
torus by Connes (1980). To define an algebra of
functions on a d-dimensional noncommutative
torus, consider a set of linear generators $U_n$ labeled
by $n \in \mathbb{Z}^d$, a d-dimensional vector with integral
entries. The multiplication is defined by the
formula

$$U_n U_m = e^{\pi i\, n_j \theta^{jk} m_k}\, U_{n+m} \qquad [1]$$

where $\theta$ is an antisymmetric d × d matrix, and
summation over repeated indices is assumed. We
further extend the multiplication from finite linear
combinations to formal infinite series $\sum_n C(n) U_n$,
where the coefficients $C(n)$ tend to zero faster than
any power of $\|n\|$. The resulting algebra constitutes
the algebra of smooth functions on a noncommutative
torus and will be denoted by $T^d_\theta$. Sometimes for
brevity we will omit the dimension label d in the
notation of the algebra. We introduce an involution
$*$ in $T^d_\theta$ by the rule $U_n^* = U_{-n}$. The elements $U_n$ are
assumed to be unitary with respect to this involution,
that is, $U_n^* U_n = U_{-n} U_n = 1 \equiv U_0$. One can
further introduce a norm and take an appropriate
completion of the involutive algebra $T^d_\theta$ to obtain
the C*-algebra of functions on a noncommutative
torus. For our purposes, the norm structure will not
be important. A canonically normalized trace on $T^d_\theta$
is introduced by specifying

$$\mathrm{tr}\, U_n = \delta_{n,0} \qquad [2]$$
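
For a rational noncommutativity parameter, the multiplication rule [1] can be illustrated with finite matrices. The sketch below assumes d = 2 and $\theta^{12} = p/q$, and uses the familiar clock and shift matrices. The phase convention chosen for $U_n$ and the function names are assumptions made for this example; this finite representation is not faithful and only illustrates the relations.

```python
import numpy as np

def clock_shift(p, q):
    """Clock and shift matrices obeying C S = exp(2*pi*i*p/q) S C."""
    omega = np.exp(2j * np.pi * p / q)
    clock = np.diag(omega ** np.arange(q))
    shift = np.roll(np.eye(q), 1, axis=0)   # shift e_k -> e_{k+1 mod q}
    return clock, shift

def U(n, p, q):
    """Generator U_n = exp(-pi*i*theta*n1*n2) C^{n1} S^{n2} with theta = p/q."""
    clock, shift = clock_shift(p, q)
    theta = p / q
    phase = np.exp(-1j * np.pi * theta * n[0] * n[1])
    return (phase
            * np.linalg.matrix_power(clock, n[0])
            @ np.linalg.matrix_power(shift, n[1]))

def cocycle(n, m, p, q):
    """Phase exp(pi*i*n_j*theta^{jk}*m_k) for theta^{12} = -theta^{21} = p/q."""
    theta = p / q
    return np.exp(1j * np.pi * theta * (n[0] * m[1] - n[1] * m[0]))

p, q = 2, 5
n, m = (1, 3), (-2, 4)
n_plus_m = (n[0] + m[0], n[1] + m[1])

lhs = U(n, p, q) @ U(m, p, q)
rhs = cocycle(n, m, p, q) * U(n_plus_m, p, q)
print(np.allclose(lhs, rhs))                                          # relation [1]
print(np.allclose(U(n, p, q).conj().T, U((-n[0], -n[1]), p, q)))      # U_n^* = U_{-n}
```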


Projective Modules 


According to the general approach to noncommuta- 
tive geometry, finitely generated projective modules 
over the algebra of functions are natural analogs of
vector bundles. Throughout this article, when speaking
of a projective module, we will assume a finitely
generated left projective module.

A free module $(T^d_\theta)^N$ is equipped with a $T^d_\theta$-valued
Hermitian inner product $\langle \cdot, \cdot \rangle_{T_\theta}$, defined by the
formula 


$$\langle (a_1, \ldots, a_N), (b_1, \ldots, b_N) \rangle_{T_\theta} = \sum_{i=1}^{N} a_i^{*} b_i \qquad [3]$$


A projective module E is by definition a direct
summand of a free module. Thus, it inherits the
inner product $\langle \cdot, \cdot \rangle_{T_\theta}$. Consider the endomorphisms
of the module E, that is, linear mappings $E \to E$
commuting with the action of $T^d_\theta$. These endomorphisms
form an associative unital algebra
denoted $\mathrm{End}_{T_\theta} E$. A decomposition $(T^d_\theta)^N = E \oplus E'$
determines an endomorphism $P : (T^d_\theta)^N \to (T^d_\theta)^N$ that
projects $(T^d_\theta)^N$ onto E. The algebra $\mathrm{End}_{T_\theta} E$ can then
be identified with a subalgebra of $\mathrm{Mat}_N(T^d_\theta)$, the
endomorphisms of the free module $(T^d_\theta)^N$. The latter
has a canonical trace that is the composition of the
matrix trace with the trace specified in [2]. By
restriction, it gives rise to a canonical trace tr on
$\mathrm{End}_{T_\theta} E$. The same embedding also provides a
canonical involution on $\mathrm{End}_{T_\theta} E$ by a composition of
the matrix transposition and the involution $*$ on $T^d_\theta$.

A large class of examples of projective modules 
over noncommutative tori are furnished by the 
so-called Heisenberg modules. They are constructed 
as follows. Let G be the direct sum of $\mathbb{R}^p$ and an
abelian finitely generated group, and let $G^*$ be its
dual group. In the most general situation
$G = \mathbb{R}^p \times \mathbb{Z}^q \times F$ where F is a finite group. Then
$G^* \cong \mathbb{R}^p \times T^q \times F^*$.

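Consider the linear space S(G) of functions on G
decreasing at infinity faster than any power. We
define operators $U_{(\gamma,\tilde\gamma)} : S(G) \to S(G)$ labeled by a
pair $(\gamma, \tilde\gamma) \in G \times G^*$ acting as follows:

$$(U_{(\gamma,\tilde\gamma)} f)(x) = \tilde\gamma(x)\, f(x + \gamma) \qquad [4]$$

One can check that the operators $U_{(\gamma,\tilde\gamma)}$ satisfy the
commutation relations

$$U_{(\gamma,\tilde\gamma)}\, U_{(\mu,\tilde\mu)} = \tilde\mu(\gamma)\,\tilde\gamma^{-1}(\mu)\; U_{(\mu,\tilde\mu)}\, U_{(\gamma,\tilde\gamma)} \qquad [5]$$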
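
The commutation relation [5] can be verified directly in the simplest case, where G is a finite cyclic group $\mathbb{Z}_N$ and its characters are $\tilde\gamma_k(x) = e^{2\pi i k x/N}$. This choice of G and the helper names below are assumptions made for the sketch; the general case involves $\mathbb{R}^p$ and $\mathbb{Z}^q$ factors as well.

```python
import numpy as np

def character(k, x, N):
    """Character of Z_N labeled by k, evaluated at x."""
    return np.exp(2j * np.pi * k * x / N)

def heisenberg_op(gamma, k, N):
    """Matrix of (U f)(x) = character_k(x) * f(x + gamma) on functions on Z_N."""
    x = np.arange(N)
    op = np.zeros((N, N), dtype=complex)
    op[x, (x + gamma) % N] = character(k, x, N)
    return op

N = 7
gamma, k_gamma = 2, 3     # the pair (gamma, gamma~)
mu, k_mu = 5, 1           # the pair (mu, mu~)

U_gamma = heisenberg_op(gamma, k_gamma, N)
U_mu = heisenberg_op(mu, k_mu, N)

# Phase mu~(gamma) * gamma~(mu)^{-1} appearing in [5].
phase = character(k_mu, gamma, N) / character(k_gamma, mu, N)

print(np.allclose(U_gamma @ U_mu, phase * U_mu @ U_gamma))
```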


If $(\gamma, \tilde\gamma)$ run over a d-dimensional discrete subgroup
$\Gamma \subset G \times G^*$, $\Gamma \cong \mathbb{Z}^d$, then formula [4] defines a
module over a d-dimensional noncommutative
torus $T^d_\theta$ with
$$\exp(2\pi i\,\theta_{ij}) = \tilde\gamma_j(\gamma_i)\,\tilde\gamma_i^{-1}(\gamma_j) \qquad [6]$$


for a given basis $(\gamma_i, \tilde\gamma_i)$ of the lattice $\Gamma$. This module is
projective if $\Gamma$ is such that $(G \times G^*)/\Gamma$ is compact.
If that is the case, then the projective $T^d_\theta$-module at
hand is called a Heisenberg module and denoted
by $E_\Gamma$.

Heisenberg modules play a special role. If the 
matrix $\theta_{ij}$ is irrational in the sense that at least one
of its entries is irrational, then any projective
module over $T^d_\theta$ can be represented as a direct sum
of Heisenberg modules. In that sense, Heisenberg 
modules can be used as building blocks to construct 
an arbitrary module. 


Connections 


Next we would like to define connections on a
projective module over $T^d_\theta$. To this end, let us first
define a Lie algebra of shifts $L_\theta$ acting on $T^d_\theta$ by
specifying a basis consisting of derivations
$\delta_j : T^d_\theta \to T^d_\theta$, $j = 1, \ldots, d$, satisfying

$$\delta_j(U_n) = 2\pi i\, n_j U_n \qquad [7]$$

These derivations span a d-dimensional abelian Lie
algebra that we denote by $L_\theta$.

A connection on a module E over $T^d_\theta$ is a set of
operators $\nabla_X : E \to E$, $X \in L_\theta$, depending linearly on
X and satisfying

$$[\nabla_X, U_n] = \delta_X(U_n) \qquad [8]$$

where $U_n$ are operators $E \to E$ representing the
corresponding generators of $T^d_\theta$. In the standard
basis [7], this relation reads as

$$[\nabla_j, U_n] = 2\pi i\, n_j U_n \qquad [9]$$

The curvature of the connection $\nabla_X$, defined as the
commutator $F_{XY} = [\nabla_X, \nabla_Y]$, is an exterior 2-form
on the adjoint vector space $L^*_\theta$ with values in
$\mathrm{End}_{T_\theta} E$.


K-Theory: Chern Character 


The K-groups of a noncommutative torus coincide 
with those for commutative tori: 


$$K_0(T^d_\theta) \cong \mathbb{Z}^{2^{d-1}} \cong K_1(T^d_\theta)$$


The Chern character of a projective module E
over a noncommutative torus $T^d_\theta$ can be defined as

$$\mathrm{ch}(E) = \mathrm{tr}\, \exp\!\left(\frac{F}{2\pi i}\right) \in \Lambda^{\mathrm{even}}(L^*_\theta) \qquad [10]$$

where F is the curvature form of a connection on
E, $\Lambda^{\mathrm{even}}(L^*_\theta)$ is the even part of the exterior algebra
of $L^*_\theta$ and tr is the canonical trace on $\mathrm{End}_{T_\theta} E$. This
mapping gives rise to a noncommutative Chern
character

$$\mathrm{ch} : K_0(T^d_\theta) \to \Lambda^{\mathrm{even}}(L^*_\theta) \qquad [11]$$

The component $\mathrm{ch}_0(E) = \mathrm{tr}\, 1 = \dim(E)$ is called the
dimension of the module E.

A distinctive feature of the noncommutative
Chern character [11] is that its image does not
consist of integral elements, that is, there is no
lattice in $L^*_\theta$ that generates the image of the Chern
character. However, there is a different integrality
statement that replaces the commutative one. Consider
a basis in $L^*_\theta$ in which the derivations
corresponding to basis elements satisfy [7]. Denote
the exterior forms corresponding to the basis
elements by $\alpha^1, \ldots, \alpha^d$. Then an arbitrary element
of $\Lambda(L^*_\theta)$ can be represented as a polynomial in the
anticommuting variables $\alpha^i$. Next let us consider the
subset $\Lambda^{\mathrm{even}}(\mathbb{Z}^d) \subset \Lambda^{\mathrm{even}}(L^*_\theta)$ that consists of polynomials
in $\alpha^j$ having integer coefficients. It was
proved by Elliott that the Chern character is
injective and its range on $K_0(T^d_\theta)$ is given by the
image of $\Lambda^{\mathrm{even}}(\mathbb{Z}^d)$ under the action of the operator


This fact implies that the K-group $K_0(T^d_\theta)$ can be
identified with the additive group $\Lambda^{\mathrm{even}}(\mathbb{Z}^d)$.

The K-theory class $\mu(E) \in \Lambda^{\mathrm{even}}(\mathbb{Z}^d)$ of a module E
can be computed from its Chern character by the
formula


Note that the anticommuting variables $\alpha^i$ and the
derivatives $\partial/\partial\alpha^j$ satisfy the anticommutation relation
$\{\alpha^i, \partial/\partial\alpha^j\} = \delta^i_j$.

The coefficients of $\mu(E)$ standing in front of
monomials in $\alpha^i$ are integers to which we will
refer as the topological numbers of the module E.
These numbers can also be interpreted as numbers 
of D-branes of a definite kind although in non- 
commutative geometry it is difficult to talk about 
branes as geometrical objects wrapped on torus 
cycles. 

One can show that for noncommutative tori $T^d_\theta$
with irrational matrix $\theta_{ij}$ the set of elements of
$K_0(T^d_\theta)$ that represent a projective module (i.e., the
positive cone) consists exactly of the elements of
positive dimension. Moreover, if $\theta_{ij}$ is irrational, any
two projective modules which represent the same
element of $K_0(T^d_\theta)$ are isomorphic; that is, the
projective modules are essentially specified in this
case by their topological numbers.

The complex differential geometry of noncommutative
tori and its relation with mirror symmetry are
discussed in Polishchuk and Schwarz (2003).


Yang-Mills Theory on Noncommutative 
Tori 


Let E be a projective module over $T^d_\theta$. We call a
Yang-Mills field on E a connection $\nabla_X$ compatible
with the Hermitian structure, that is, a connection
satisfying

$$\langle \nabla_X \xi, \eta \rangle_{T_\theta} + \langle \xi, \nabla_X \eta \rangle_{T_\theta} = \delta_X\!\left(\langle \xi, \eta \rangle_{T_\theta}\right) \qquad [13]$$

for any two elements $\xi, \eta \in E$. Given a positive-
definite metric on the Lie algebra $L_\theta$, we can define
a Yang-Mills functional


$$S_{YM}(\nabla_i) = \frac{V}{4 g_{YM}^{2}}\, g^{ik} g^{jl}\, \mathrm{tr}(F_{ij} F_{kl}) \qquad [14]$$


Here $g^{ij}$ stands for the metric tensor in the canonical
basis [7], $V = \sqrt{|\det g|}$, $g_{YM}$ is the Yang-Mills
coupling constant, tr stands for the canonical trace
on $\mathrm{End}_{T_\theta} E$ discussed above, and summation over
repeated indices is assumed. Compatibility with the
Hermitian structure [13] can be shown to imply
the positive definiteness of the functional $S_{YM}$. The
extrema of this functional are given by the solutions
to the Yang-Mills equations

$$g^{ki}[\nabla_k, F_{ij}] = 0 \qquad [15]$$


A gauge transformation in the noncommutative
Yang-Mills theory is specified by a unitary endomorphism
$Z \in \mathrm{End}_{T_\theta} E$, that is, an endomorphism
satisfying $ZZ^* = Z^*Z = 1$. The corresponding gauge
transformation acts on a Yang-Mills field as

$$\nabla_j \mapsto Z \nabla_j Z^{*} \qquad [16]$$


The Yang-Mills functional [14] and the Yang- 
Mills equations [15] are invariant under these 
transformations. 

It is easy to see that Yang-Mills fields whose
curvature is a scalar operator, that is, $[\nabla_j, \nabla_k] = \sigma_{jk} \cdot 1$
with $\sigma_{jk}$ a real-number-valued tensor, solve the
Yang-Mills equations [15]. A characterization of 
modules admitting a constant curvature connection 
and a description of the moduli spaces of constant 
curvature connections (i.e., the space of such 
connections modulo gauge transformations) 
is reviewed in Konechny and Schwarz (2002). 
Another interesting class of solutions to the Yang- 
Mills equations is instantons (see below). 

As in the ordinary field theory, one can construct 
various extensions of the noncommutative Yang- 
Mills theory [14] by adding other fields. To obtain a 
supersymmetric extension of [14], one needs to
add a number of endomorphisms $X_I \in \mathrm{End}_{T_\theta} E$
that play the role of bosonic scalar fields in the
adjoint representation of the gauge group and a
number of odd Grassmann parity endomorphisms
$\psi^{\alpha} \in \Pi\,\mathrm{End}_{T_\theta} E$ endowed with an SO(d)-spinor
index $\alpha$. The latter ones are analogs of the usual
fermionic fields.

In string theory, one considers a maximally
supersymmetric extension of the Yang-Mills theory
[14]. In this case, the supersymmetric action depends
on 10 − d bosonic scalars $X_I$, $I = d, \ldots, 9$, and the
fermionic fields can be collected into an SO(9,1)
Majorana-Weyl spinor multiplet $\psi^{\alpha}$, $\alpha = 1, \ldots, 16$.
The maximally supersymmetric Yang-Mills action 
takes the form 


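$$S_{SYM} = \frac{V}{4 g^{2}}\, \mathrm{tr}\Big( F_{\mu\nu} F^{\mu\nu} + [\nabla_\mu, X_I][\nabla^\mu, X^I] + [X_I, X_J][X^I, X^J] - 2\psi^{\alpha} \sigma^{\mu}_{\alpha\beta} [\nabla_\mu, \psi^{\beta}] - 2\psi^{\alpha} \sigma^{I}_{\alpha\beta} [X_I, \psi^{\beta}] \Big) \qquad [17]$$
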
Here the curvature indices $F_{\mu\nu}$, $\mu, \nu = 0, \ldots, d - 1$,
are assumed to be contracted with a Minkowski
signature metric, and $\sigma^{A}_{\alpha\beta}$ are blocks of the ten-
dimensional 32 × 32 gamma-matrices


$$\Gamma_A = \begin{pmatrix} 0 & \sigma_A^{\alpha\beta} \\ (\sigma_A)_{\alpha\beta} & 0 \end{pmatrix}, \qquad A = 0, \ldots, 9$$


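This action is invariant under two kinds of supersymmetry
transformations denoted by $\delta_\epsilon$, $\tilde\delta_\epsilon$ and
defined as

$$\begin{aligned}
\delta_\epsilon \psi &= \tfrac{1}{2}\left( \sigma^{jk} F_{jk}\,\epsilon + \sigma^{jI} [\nabla_j, X_I]\,\epsilon + \sigma^{IJ} [X_I, X_J]\,\epsilon \right) \\
\delta_\epsilon \nabla_j &= \epsilon\,\sigma_j \psi, \qquad \delta_\epsilon X_I = \epsilon\,\sigma_I \psi \\
\tilde\delta_\epsilon \psi &= \epsilon, \qquad \tilde\delta_\epsilon \nabla_j = 0, \qquad \tilde\delta_\epsilon X_I = 0
\end{aligned} \qquad [18]$$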


where € is a constant 16-component Majorana-Weyl 
spinor. Of particular interest for string theory 
applications are solutions to the equations of motion 
corresponding to [17] that are invariant under some 
of the above supersymmetry transformations. 
Further discussion can be found in Konechny and 
Schwarz (2002). 


Morita Equivalence 


The role of Morita equivalence as a duality 
transformation in noncommutative Yang-Mills 
theory was elucidated by Schwarz (1998). We will 
adopt a definition of Morita equivalence for 
noncommutative tori which can be shown to be 
essentially equivalent to the standard definition of 
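strong Morita equivalence. We will say that two
noncommutative tori $T_\theta$ and $T_{\hat\theta}$ are Morita equivalent
if there exists a $(T_\theta, T_{\hat\theta})$-bimodule Q and a
$(T_{\hat\theta}, T_\theta)$-bimodule P such that

$$Q \otimes_{T_{\hat\theta}} P \cong T_\theta, \qquad P \otimes_{T_\theta} Q \cong T_{\hat\theta} \qquad [19]$$

where $T_\theta$ on the right-hand side is considered as a
$(T_\theta, T_\theta)$-bimodule and analogously for $T_{\hat\theta}$. (It is
assumed that the isomorphisms are canonical.)
Given a $T_\theta$-module E one obtains a $T_{\hat\theta}$-module
$\hat E$ as

$$\hat E = P \otimes_{T_\theta} E \qquad [20]$$

One can show that this mapping is functorial.
Moreover, the bimodule Q provides us with an
inverse mapping $Q \otimes_{T_{\hat\theta}} \hat E \cong E$.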

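We further introduce a notion of gauge Morita
equivalence (originally called “complete Morita
equivalence”) that allows one to transport
connections along with the mapping of modules
[20]. Let L be a d-dimensional commutative Lie
algebra. We say that the $(T_{\hat\theta}, T_\theta)$ Morita equivalence
bimodule P establishes a gauge Morita
equivalence if it is endowed with operators
$\nabla^P_X$, $X \in L$, that determine a constant curvature
connection simultaneously with respect to $T_\theta$ and
$T_{\hat\theta}$, that is, satisfy

$$\begin{aligned}
\nabla^P_X(ea) &= (\nabla^P_X e)\,a + e\,(\delta_X a) \\
\nabla^P_X(\hat a e) &= \hat a\,(\nabla^P_X e) + (\hat\delta_X \hat a)\,e \\
[\nabla^P_X, \nabla^P_Y] &= 2\pi i\,\sigma_{XY} \cdot 1
\end{aligned} \qquad [21]$$

Here $\delta_X$ and $\hat\delta_X$ are standard derivations on $T_\theta$ and
$T_{\hat\theta}$, respectively. In other words, we have two Lie
algebra homomorphisms

$$\delta : L \to L_\theta, \qquad \hat\delta : L \to L_{\hat\theta} \qquad [22]$$

If a pair $(P, \nabla^P_X)$ specifies a gauge $(T_{\hat\theta}, T_\theta)$-
equivalence bimodule, then there exists a correspondence
between connections on E and connections on
$\hat E$. The connection $\hat\nabla_X$ on $\hat E$ corresponding to a
given connection $\nabla_X$ on E is defined as

$$\nabla_X \mapsto \hat\nabla_X = 1 \otimes \nabla_X + \nabla^P_X \otimes 1 \qquad [23]$$

More precisely, an operator $1 \otimes \nabla_X + \nabla^P_X \otimes 1$ on
$P \otimes_{\mathbb{C}} E$ descends to a connection $\hat\nabla_X$ on $\hat E = P \otimes_{T_\theta} E$.
It is straightforward to check that under this
mapping gauge equivalent connections go to gauge
equivalent ones,

$$Z^{*} \nabla_X Z \mapsto \hat Z^{*} \hat\nabla_X \hat Z$$

where $\hat Z = 1 \otimes Z$ is the endomorphism of $\hat E = P \otimes_{T_\theta} E$
corresponding to $Z \in \mathrm{End}_{T_\theta} E$.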
The curvatures of $\hat\nabla_X$ and $\nabla_X$ are connected by
the formula

$$F^{\hat\nabla}_{XY} = \hat F^{\nabla}_{XY} + \hat 1\,\sigma_{XY} \qquad [24]$$


which in particular shows that constant curvature 
connections go to constant curvature ones. 

Since noncommutative tori are labeled by an
antisymmetric d × d matrix $\theta$, gauge Morita equivalence
establishes an equivalence relation on the set
of such matrices. To describe this equivalence
relation, consider the action $\theta \mapsto \hat\theta = g\theta$ of
SO(d, d|Z) on the space of antisymmetric d × d
matrices by the formula

$$\hat\theta = (M\theta + N)(R\theta + S)^{-1} \qquad [25]$$

where the d × d matrices M, N, R, S are such that
the matrix

$$g = \begin{pmatrix} M & N \\ R & S \end{pmatrix} \qquad [26]$$

belongs to the group SO(d, d|Z). The above action is
defined whenever the matrix $A = R\theta + S$ is invertible.
One can prove that two noncommutative tori
$T_\theta$ and $T_{\hat\theta}$ are gauge Morita equivalent if and only if
the matrices $\theta$ and $\hat\theta$ belong to the same orbit of the
SO(d, d|Z) action [25].
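
The fractional-linear action [25] is easy to experiment with numerically. The sketch below assumes the common convention that SO(d, d|Z) consists of integer matrices g with $g^{T} J g = J$ and $\det g = 1$, where J is the off-diagonal block matrix; this convention, the sample group elements, and the function names are assumptions rather than statements taken from the text.

```python
import numpy as np

def block_J(d):
    """Off-diagonal form J = [[0, 1], [1, 0]] defining O(d, d)."""
    zero, one = np.zeros((d, d)), np.eye(d)
    return np.block([[zero, one], [one, zero]])

def is_so_dd(g, d):
    """Check g^T J g = J and det g = 1."""
    J = block_J(d)
    return bool(np.allclose(g.T @ J @ g, J) and np.isclose(np.linalg.det(g), 1.0))

def act(g, theta):
    """theta -> (M theta + N)(R theta + S)^{-1} for g = [[M, N], [R, S]]."""
    d = theta.shape[0]
    M, N, R, S = g[:d, :d], g[:d, d:], g[d:, :d], g[d:, d:]
    A = R @ theta + S                      # must be invertible
    return (M @ theta + N) @ np.linalg.inv(A)

d = 2
eps = np.array([[0.0, 1.0], [-1.0, 0.0]])
theta = 0.3 * eps                          # sample noncommutativity matrix

n = 2 * eps                                # integer antisymmetric shift
shift = np.block([[np.eye(d), n], [np.zeros((d, d)), np.eye(d)]])   # theta -> theta + n
inversion = block_J(d)                                               # theta -> theta^{-1}

theta_shifted = act(shift, theta)
theta_inverted = act(inversion, theta)
composite = act(shift @ inversion, theta)

print(is_so_dd(shift, d), is_so_dd(inversion, d))
print(np.allclose(composite, act(shift, act(inversion, theta))))    # group action property
print(np.allclose(composite, -composite.T))                          # antisymmetry is preserved
```

The composition check reflects that [25] defines a group action, so successive transformations can be combined by multiplying the corresponding block matrices.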

The duality group SO(d, d|Z) also acts on the
topological numbers of modules $\mu \in \Lambda^{\mathrm{even}}(\mathbb{Z}^d)$. This
action can be shown to be given by a spinor
representation constructed as follows. First note
that the operators $a^i = \alpha^i$, $b_i = \partial/\partial\alpha^i$ act on $\Lambda(\mathbb{R}^d)$
and give a representation of the Clifford algebra
specified by the metric with signature (d, d). The
group O(d, d|C) can thus be regarded as a group of
automorphisms acting on the Clifford algebra
generated by $a^i$, $b_i$. Denote the latter action by $W_h$
for $h \in$ O(d, d|C). One defines a projective action $V_h$
of O(d, d|C) on $\Lambda(\mathbb{R}^d)$ according to

$$V_h\, a^i\, V_h^{-1} = W_h(a^i), \qquad V_h\, b_i\, V_h^{-1} = W_h(b_i)$$

This projective action can be restricted to yield a
double-valued spinor representation of SO(d, d|C)
on $\Lambda(\mathbb{R}^d)$ by choosing a suitable bilinear form on
$\Lambda(\mathbb{R}^d)$. The restriction of this representation to the
subgroup SO(d, d|Z) acting on $\Lambda^{\mathrm{even}}(\mathbb{Z}^d)$ gives the
action of Morita equivalence on the topological
numbers of modules.

The mapping [23] preserves the Yang-Mills
equations of motion [15]. Moreover, one can define
a modification of the Yang-Mills action functional
[14] in such a way that the values of the functionals
on $\nabla_X$ and $\hat\nabla_X$ coincide up to an appropriate
rescaling of coupling constants. The modified action
functional has the form

$$S_{YM} = \frac{V}{4 g_{YM}^{2}}\, \mathrm{tr}\,(F_{jk} + \Phi_{jk} \cdot 1)(F^{jk} + \Phi^{jk} \cdot 1) \qquad [27]$$

where $\Phi_{jk}$ is a scalar-valued tensor that can be
thought of as some background field. Adding this
term will allow us to compensate for the curvature
shift by adopting the transformation rule

$$\Phi_{XY} \mapsto \Phi_{XY} - \sigma_{XY}$$

Note that the new action functional [27] has the
same equations of motion [15] as the original one.

To show that the functional [27] is invariant
under gauge Morita equivalence, one has to take
into account two more effects. Firstly, the values of
trace change by a factor $c = \dim(\hat E)\,(\dim(E))^{-1}$ as
$\hat{\mathrm{tr}}\,\hat X = c\,\mathrm{tr}\,X$. Secondly, the identification of $L_\theta$ and
$L_{\hat\theta}$ is established by means of some linear transformation
$A^k_j$, the determinant of which will rescale the
volume V. Both effects can be absorbed into an
appropriate rescaling of the coupling constant.

One can show that the curvature tensor, the
metric tensor, the background field $\Phi_{ij}$, and the
volume element V transform according to

$$\begin{aligned}
\hat F_{ij} &= A^k_i F_{kl} A^l_j + \sigma_{ij} \\
\hat g_{ij} &= A^k_i g_{kl} A^l_j \\
\hat\Phi_{ij} &= A^k_i \Phi_{kl} A^l_j - \sigma_{ij} \\
\hat V &= V\,|\det A|
\end{aligned} \qquad [28]$$

where $A = R\theta + S$ and $\sigma = -RA^{t}$. The action functional
[27] is invariant under the gauge Morita
equivalence if the coupling constant transforms
according to

$$\hat g_{YM}^{2} = g_{YM}^{2}\,|\det A|^{1/2} \qquad [29]$$


Supersymmetric extensions of Yang—Mills theory 
on noncommutative tori were shown to arise within 
string theory essentially in two situations. In the first 
case, one considers compactifications of the (BFSS 
or IKKT) matrix model of M-theory (Connes et al. 
1998). A discussion regarding the connection 
between T-duality and Morita equivalence in this 
case can be found in Seiberg and Witten (1999, 
section 7). Noncommutative gauge theories on tori 
can also be obtained by taking the so-called Seiberg- 
Witten zero slope limit in the presence of a Neveu- 
Schwarz B-field background (Seiberg and Witten 
1999). The emergence of noncommutative geometry 
in this limit is discussed in this article. Below we give 
some details on the relation between T-duality and 
Morita equivalence in this approach. Consider a 
number of Dp-branes wrapped on $T^p$ parametrized by
coordinates $x^i \sim x^i + 2\pi r$ with a closed-string metric
$G_{ij}$ and a B-field $B_{ij}$. The SO(p, p|Z) T-duality group
is represented by the matrices

$$T = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad [30]$$

that act on the matrix

$$E = \frac{r^{2}}{\alpha'}\,(G + 2\pi\alpha' B)$$

by a fractional transformation

$$T : E \mapsto E' = (aE + b)(cE + d)^{-1} \qquad [31]$$

The transformed metric and B-field are obtained by
taking, respectively, the symmetric and antisymmetric
parts of $E'$. The string coupling constant is
transformed as

$$T : g_s \mapsto g_s' = \frac{g_s}{(\det(cE + d))^{1/2}} \qquad [32]$$

The zero slope limit of Seiberg and Witten is
obtained by taking

$$\alpha' \sim \epsilon^{1/2} \to 0, \qquad G_{ij} \sim \epsilon \to 0 \qquad [33]$$


Sending the closed-string metric to zero implies that 
the B-field dominates in the open-string boundary 
conditions. In the limit [33], the compactification is 
parametrized in terms of open-string moduli 


I2 =] 
b 1 , 

ov = — (BH 34 
which remain finite. One can demonstrate that $\theta^{ij}$ is
a noncommutativity parameter for the torus and the 
low-energy effective theory living on the Dp-brane is 
a noncommutative maximally supersymmetric gauge 
theory with a coupling constant 


i detg T 


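From the transformation law [31], it is not hard to
derive the transformation rules for the moduli [34]
in the limit [33],

$$\begin{aligned}
T &: g \mapsto g' = (a + b\theta)\, g\, (a + b\theta)^{t} \\
T &: \theta \mapsto \theta' = (c + d\theta)(a + b\theta)^{-1}
\end{aligned} \qquad [36]$$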


Furthermore, the effective gauge theory becomes a 
noncommutative Yang-Mills theory [17] with a 
coupling constant 


; (o!) 8-92 


(gym) = (nc, 


which goes to a finite limit under [33] provided one
simultaneously scales $g_s$ with $\epsilon$ as

$$g_s \sim \epsilon^{(3 - p + k)/4}$$

where k is the rank of $B_{ij}$. The limiting coupling constant
$g_{YM}$ transforms under the T-duality [31], [32] as

$$T : g_{YM} \mapsto g_{YM}' = g_{YM}\,(\det(a + b\theta))^{1/4} \qquad [37]$$


We see that the transformation laws [31] and [37] 
have the same form as the corresponding transfor- 
mations in [25], [28], [29] provided one identifies 
matrix [26] with matrix [30] conjugated by 


$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

The need for conjugation reflects the fact that in the
BFSS M(atrix) model, in the framework of which
the Morita equivalence was originally considered, the
natural degrees of freedom are D0 branes versus the Dp
branes considered in the above discussion of T-duality.

One can further check that the gauge field transfor- 
mations following from gauge Morita equivalence 
match with those induced by the T-duality. It is worth 
stressing that in the absence of a B-field background 
the effective action based on the square of the gauge 
field curvature is not invariant under T-duality. 


Instantons on Noncommutative $T^4_\theta$


Consider a Yang-Mills field $\nabla^E$ on a projective
module E over a noncommutative 4-torus $T^4_\theta$.
Assume that the Lie algebra of shifts $L_\theta$ is equipped
with the standard Euclidean metric such that the
metric tensor in the basis [7] is given by the identity
matrix. The Yang-Mills field $\nabla^E$ is called an instanton
if the self-dual part of the corresponding curvature
tensor is proportional to the identity operator,

$$F^{+}_{jk} = \frac{1}{2}\Big(F_{jk} + \frac{1}{2}\,\epsilon_{jkmn}F^{mn}\Big) = i\,\omega_{jk}\cdot 1 \qquad [38]$$

where $\omega_{jk}$ is a constant matrix with real entries. An
anti-instanton is defined the same way by replacing
the self-dual part with the anti-self-dual one.

One can define a noncommutative analog of the
Nahm transform for instantons (Astashkevich et al.
2000) that has properties very similar to those of the
ordinary (commutative) one. To that end, consider a
triple $(P, \nabla_i, \hat\nabla_i)$ consisting of a (finite projective)
$(T^4_\theta, T^4_{\hat\theta})$-bimodule P, a $T^4_\theta$-connection $\nabla_i$ and a
$T^4_{\hat\theta}$-connection $\hat\nabla_i$ that satisfy the following properties.
The connection $\nabla_i$ commutes with the $T^4_{\hat\theta}$-action on
P and the connection $\hat\nabla_i$ with that of $T^4_\theta$. The
commutators $[\nabla_i, \nabla_j]$, $[\hat\nabla_i, \hat\nabla_j]$, $[\nabla_i, \hat\nabla_j]$ are
proportional to the identity operator

$$[\nabla_i, \nabla_j] = \omega_{ij}, \qquad [\hat\nabla_i, \hat\nabla_j] = \hat\omega_{ij}, \qquad [\nabla_i, \hat\nabla_j] = \sigma_{ij} \qquad [39]$$

The above conditions mean that P is a $T_{\theta\oplus(-\hat\theta)}$-module
and $\nabla_i \oplus \hat\nabla_i$ is a constant curvature connection
on it. In addition, we assume that the tensor $\sigma_{ij}$
is nondegenerate.

For a connection $\nabla^E$ on a right $T^4_\theta$-module E, we
define a Dirac operator $D = \Gamma^i(\nabla^E_i + \nabla_i)$ acting on
the tensor product

$$(E \otimes_{T_\theta} P) \otimes S$$


where S is the SO(4) spinor representation space and
$\Gamma^i$ are four-dimensional Dirac gamma-matrices. The
space S is $\mathbb{Z}_2$-graded: $S = S^+ \oplus S^-$ and D is an odd
operator so that we can consider

$$D^+ : (E \otimes_{T_\theta} P) \otimes S^+ \to (E \otimes_{T_\theta} P) \otimes S^-$$
$$D^- : (E \otimes_{T_\theta} P) \otimes S^- \to (E \otimes_{T_\theta} P) \otimes S^+$$

A connection $\nabla^E$ on a $T^4_\theta$-module E is called
P-irreducible if there exists a bounded inverse to the
Laplacian

$$\Delta = \sum_i (\nabla^E_i + \nabla_i)(\nabla^E_i + \nabla_i)$$

One can show that if $\nabla^E$ is a P-irreducible instanton,
then $\ker D^+ = 0$ and $D^- D^+ = \Delta$. Denote by $\hat E$ the
closure of the kernel of $D^-$. Since $D^-$ commutes with
the $T^4_{\hat\theta}$-action on $(E \otimes_{T_\theta} P) \otimes S^-$, the space $\hat E$ is a right
$T^4_{\hat\theta}$-module. One can prove that this module is finite
projective. Let $P : (E \otimes_{T_\theta} P) \otimes S^- \to \hat E$ be a Hermitian
projector. Denote by $\nabla^{\hat E}$ the composition $P \circ \hat\nabla$. One
can show that $\nabla^{\hat E}$ is a Yang-Mills field on $\hat E$.

The noncommutative Nahm transform of a
P-irreducible instanton connection $\nabla^E$ on E is
defined to be the pair $(\hat E, \nabla^{\hat E})$. One can further
show that $\nabla^{\hat E}$ is an instanton.


See also: Electroweak Theory; Hopf Algebras and 
q-Deformation Quantum Groups; Noncommutative 
Geometry from Strings; Quantum Group Differentials, 
Bundles and Gauge Theory; Quantum Hall Effect; String 
Field Theory; von Neumann Algebras: Introduction, 
Modular Theory, and Classification Theory. 
Nonequilibrium


Systems in stationary nonequilibrium are mechanical
systems subject to nonconservative external forces
and to thermostat forces which forbid indefinite
increase of the energy and allow reaching statistically
stationary states. A system $\Sigma$ is described by
the positions and velocities of its n particles $X, \dot X$,
with the particle positions confined to a finite
volume container $C_0$.

If $X = (x_1, \ldots, x_n)$ are the particle positions in
a Cartesian inertial system of coordinates, the
equations of motion are determined by their masses
$m_i > 0$, $i = 1, \ldots, n$, by the potential energy of
interaction $V(x_1, \ldots, x_n) \equiv V(X)$, by the external
nonconservative forces $F_i(X; \Phi)$, and by the thermostat
forces $-\vartheta_i$ as

$$m_i \ddot x_i = -\partial_{x_i} V(X) + F_i(X; \Phi) - \vartheta_i, \qquad i = 1, \ldots, n \qquad [1]$$


where $\Phi = (\varphi_1, \ldots, \varphi_q)$ are strength parameters on
which the external forces depend. All forces and
potentials will be supposed smooth, that is, analytic,
in their variables aside from possible impulsive elastic
forces describing shocks, and with the property
$F(X; 0) = 0$. The impulsive forces are allowed here to
model possible shocks with the walls of the container
$C_0$ or between hard core particles.

A thermostat is a “reservoir” which may consist 
of one or more infinite systems which are asympto- 
tically in thermal equilibrium and are separated by 
boundary surfaces from each other as well as from 
the system: with the latter, they interact through 
short-range conservative forces, see Figure 1. 

The reservoirs occupy infinite regions of the space
outside $C_0$, for example, sectors $C_a \subset \mathbb{R}^3$, $a = 1, 2, \ldots$,
in space and their particles are in a configuration
which is typical of an equilibrium state at temperature
$T_a$. This means that the empirical probability of
configurations in each $C_a$ is Gibbsian with some
temperature $T_a$. In other words, the frequency with
which a configuration $(\dot Y, Y + r)$ occurs in a region
$\Lambda + r \subset C_a$ while a configuration $(\dot W, W + r)$ occurs
outside $\Lambda + r$ (with $Y \subset \Lambda$, $W \cap \Lambda = \emptyset$), averaged over
the translations $\Lambda + r$ of $\Lambda$ by r (with the restriction
that $\Lambda + r \subset C_a$), is

$$\mathop{\mathrm{average}}_{\Lambda + r \subset C_a}\big(f_{\Lambda + r}(\dot Y, Y + r; \dot W, W + r)\big) = \frac{e^{-\beta_a\left(\frac{1}{2} m_a |\dot Y|^2 + V_a(Y|W)\right)}}{\text{normalization}} \qquad [2]$$

Here $\beta_a = (k_B T_a)^{-1}$, $m_a$ is the mass of the particles in the ath
reservoir and $V_a(Y|W)$ is the energy of the short-range
potential between pairs of particles in $Y \subset C_a$
or with one point in Y and one in W. Since the
configurations in the system and in the thermostats
are not random, [2] should be considered as an
“empirical” probability in the sense that it is the
frequency density of the events $\{(\dot Y, Y + r); \dot W, W + r\}$:
in other words, the configurations $\omega_a$ in the
reservoirs should be “typical” in the sense of
probability theory of distributions which are asymptotically
Gibbsian.

Figure 1 A symbolic drawing of the container $C_0$ for the
system $\Sigma$ and of the surrounding regions containing the particles
acting as thermostats at temperatures $T_1, T_2, \ldots$.

The property of being “thermostats” means that
[2] remains true for all times, if initially satisfied.

Mathematically, there is a problem at this point: 
the latter property is either true or false, but a 
proof of its validity seems out of reach of the 
present techniques except in very simple cases. 
Therefore, here we follow an intuitive approach 
and assume that such thermostats exist and, 
actually, that any configuration which is typical of 
a stationary state of an infinite size system of 
interacting particles in the $C_a$’s, with physically
reasonable microscopic interactions, satisfies the 
property [2]. 

The above thermostats are examples of “determi- 
nistic thermostats” because, together with the 
system, they form a deterministic dynamical system. 
They are called “Hamiltonian thermostats” and are 
often considered as the most appropriate models of 
“physical thermostats.” 

A closely related thermostat model is obtained by
assuming that the particles outside the system are
not in a given configuration but have a
probability distribution whose conditional distributions
satisfy [2] initially. Also in this case, it is
necessary to assume that [2] remains true for all
times, if initially satisfied. Such thermostats are
examples of “stochastic thermostats” because their
action on the system depends on random variables
$\omega_a$ which are the initial configurations of the
particles belonging to the thermostats.

Other kinds of stochastic thermostats are “collision
rules” with the container boundary $\partial C_0$ of $\Sigma$:
every time a particle collides with $\partial C_0$, it is reflected
with a momentum p in $d^3p$ that has a probability
distribution proportional to $e^{-\beta_a p^2/(2m)}\,d^3p$, where
$\beta_a$, $a = 1, 2, \ldots$, depends on the boundary portion
(labeled by $a = 1, 2, \ldots$) on which the collision takes place, and
$T_a = (k_B \beta_a)^{-1}$ is its “temperature” if $k_B$ is Boltzmann's
constant. Which p is actually chosen after
each collision is determined by a random variable
$\omega = (\omega_1, \omega_2, \ldots)$.
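
A minimal sketch of such a collision rule, assuming the reflected momentum is drawn from the isotropic Gaussian weight quoted in the text, so that each Cartesian component has variance $m/\beta_a$. The function name and the numerical values are hypothetical, and no attempt is made to restrict the momentum to point back into the container.

```python
import math
import random


def sample_reflected_momentum(beta, mass, rng):
    """Draw a momentum p with density proportional to exp(-beta * |p|^2 / (2 * mass))."""
    std = math.sqrt(mass / beta)  # standard deviation of each component
    return tuple(rng.gauss(0.0, std) for _ in range(3))


def mean_squared_momentum(beta, mass, samples, seed=0):
    """Empirical average of |p|^2 over many collisions (expected value: 3 * mass / beta)."""
    rng = random.Random(seed)  # plays the role of the random variable omega
    total = 0.0
    for _ in range(samples):
        p = sample_reflected_momentum(beta, mass, rng)
        total += sum(c * c for c in p)
    return total / samples


# Hypothetical wall portion with beta = 2.0 acting on particles of mass 1.5.
estimate = mean_squared_momentum(beta=2.0, mass=1.5, samples=20000)
```

The empirical mean of $|p|^2/(2m)$ then approaches $\frac{3}{2} k_B T_a$, which is the sense in which $(k_B\beta_a)^{-1}$ acts as a temperature.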

The distinction between stochastic and deter- 
ministic thermostats ultimately rests on what we 
call “system.” If reservoirs or the randomness 
generators are included in the system, then the 
system becomes deterministic (possibly infinite); 
and finite deterministic thermostats can also be 
regarded as simplified models for infinite reservoirs, 
see the section “Heat, temperature, and entropy 
production.” 
It is also possible, and convenient, to consider
“finite deterministic thermostats.” In the latter case,
$\vartheta$ is a force depending only upon the configuration
of the n particles of $\Sigma$ in their finite container $C_0$.

Examples of finite deterministic reservoirs are forces
obtained by imposing a nonholonomic constraint via
some ad hoc principle like the Gauss principle. For
instance, if a system of particles driven by a force
$G_i = -\partial_{x_i} V(X) + F_i(X)$ is enclosed in a box $C_0$ and $\vartheta$
is a thermostat enforcing an anholonomic constraint
$\psi(X, \dot X) \equiv 0$ via Gauss’ principle, then

$$\vartheta_i(X, \dot X) = \left[\frac{\sum_j \left(\dot x_j \cdot \partial_{x_j}\psi(X, \dot X) + \frac{1}{m_j}\, G_j \cdot \partial_{\dot x_j}\psi(X, \dot X)\right)}{\sum_j \frac{1}{m_j}\left(\partial_{\dot x_j}\psi(X, \dot X)\right)^2}\right] \partial_{\dot x_i}\psi(X, \dot X) \qquad [3]$$


Gauss' principle says that the force which needs to
be added to the other forces $G_i$ acting on the system
minimizes

$$\sum_i \frac{(G_i - m_i a_i)^2}{m_i}$$

given $X, \dot X$, among all accelerations $a_i$ which are
compatible with the constraint $\psi$.
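
The Gauss-principle force can be sketched numerically as follows. The code assumes the force has the form $\vartheta_j = \alpha\,\partial_{\dot x_j}\psi$, with the multiplier $\alpha$ fixed by requiring that the constraint function $\psi(X, \dot X)$ stays constant along the motion $m_j \ddot x_j = G_j - \vartheta_j$. Function names, masses, and sample data are illustrative assumptions.

```python
def dot(u, v):
    """Euclidean scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def gauss_thermostat_force(velocities, forces, masses, dpsi_dx, dpsi_dv):
    """Return the list of forces theta_j = alpha * d(psi)/d(v_j).

    dpsi_dx[j] and dpsi_dv[j] are the gradients of the constraint function
    with respect to the position and the velocity of particle j.
    """
    n = len(masses)
    numerator = sum(
        dot(velocities[j], dpsi_dx[j]) + dot(forces[j], dpsi_dv[j]) / masses[j]
        for j in range(n)
    )
    denominator = sum(dot(dpsi_dv[j], dpsi_dv[j]) / masses[j] for j in range(n))
    alpha = numerator / denominator
    return [[alpha * c for c in dpsi_dv[j]] for j in range(n)]


def constraint_rate(velocities, forces, masses, dpsi_dx, dpsi_dv):
    """Time derivative of psi along the thermostatted motion (should vanish)."""
    theta = gauss_thermostat_force(velocities, forces, masses, dpsi_dx, dpsi_dv)
    rate = 0.0
    for j in range(len(masses)):
        acceleration = [(g - t) / masses[j] for g, t in zip(forces[j], theta[j])]
        rate += dot(velocities[j], dpsi_dx[j]) + dot(acceleration, dpsi_dv[j])
    return rate


# Isokinetic example: psi = sum_j (1/2) m_j v_j^2 - const, so
# d(psi)/d(x_j) = 0 and d(psi)/d(v_j) = m_j v_j.
masses = [1.0, 2.0]
velocities = [[1.0, 0.5, -0.2], [0.3, -1.1, 0.7]]
forces = [[0.4, -0.3, 1.0], [-0.8, 0.2, 0.5]]
dpsi_dx = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
dpsi_dv = [[m * c for c in v] for m, v in zip(masses, velocities)]

kinetic_energy_rate = constraint_rate(velocities, forces, masses, dpsi_dx, dpsi_dv)
```

In the isokinetic case the multiplier reduces to $\alpha = \sum_j G_j \cdot \dot x_j / \sum_j m_j \dot x_j^2$, the ratio between the power injected by the other forces and twice the kinetic energy.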

It should be kept in mind that the only known 
examples of mathematically treatable thermostats 
modeled by infinite reservoirs are cases in which the 
thermostat particles are either noninteracting parti- 
cles or linear (i.e., noninteracting) oscillators. For 
simplicity stochastic or infinite thermostats will not 
be considered here and we restrict attention to finite 
deterministic systems. 

In general, in order that a force $\vartheta$ can be
considered a deterministic “thermostat force,” a
further property is necessary: namely, that the system
evolves according to [1] towards a stationary state.
This means that for all initial particle configurations
$(X, \dot X)$, except possibly for a set of zero phase-space
volume, any smooth function $f(X, \dot X)$ evolves in time
so that, if $S_t(X, \dot X)$ denotes the configuration into
which the initial data evolve in time t according to
[1], then the limit

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T f(S_t(X, \dot X))\, dt = \int f(z)\, \mu(dz) \qquad [4]$$

exists and is independent of $(X, \dot X)$. The probability
distribution $\mu$ is then called the SRB distribution for
the system. The maps $S_t$ will have the group
property $S_t \cdot S_{t'} = S_{t+t'}$ and the SRB distribution $\mu$
will be invariant under time evolution.
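
The content of [4] can be illustrated on a discrete-time toy example. The sketch below uses the logistic map $x \to 4x(1-x)$, which is not one of the thermostatted systems discussed here and is chosen only because its invariant density $1/(\pi\sqrt{x(1-x)})$ is known in closed form, giving the average 1/2 for the observable $f(x) = x$. The initial datum and the orbit length are arbitrary assumptions.

```python
def logistic_step(x):
    """One iteration of the map S(x) = 4 x (1 - x) on the unit interval."""
    return 4.0 * x * (1.0 - x)


def time_average(observable, x0, steps):
    """Discrete analog of the left-hand side of [4]: (1/T) * sum of f(S^t x0)."""
    x = x0
    total = 0.0
    for _ in range(steps):
        total += observable(x)
        x = logistic_step(x)
    return total / steps


# Two different initial data should give the same average, up to statistical error.
average_a = time_average(lambda x: x, x0=0.1234, steps=200000)
average_b = time_average(lambda x: x, x0=0.5678, steps=200000)
```

Both averages come out close to 1/2, illustrating that the limit does not depend on the initial datum as long as it is chosen outside an exceptional set of zero volume (for instance, the fixed points 0 and 3/4 must be avoided).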

It is important to stress that the requirement that 
the exceptional configurations form just a set of zero 


phase volume (rather than a set of zero probability 
with respect to another distribution, singular with 
respect to the phase volume) is a strong assumption 
and it should be considered an axiom of the theory: 
it corresponds to the assumption that the initial 
configuration is prepared as a typical configuration 
of an equilibrium state, which, by the classical 
equidistribution axiom of equilibrium statistical 
mechanics, is a typical configuration with respect 
to the phase volume. 

For this reason, the SRB distribution is said to
describe a “stationary nonequilibrium state” of
the system. The SRB distribution depends on the
parameters on which the forces acting on the
system depend, for example, $|C_0|$ (volume), $\Phi$
(strength of the forcings), $\{\beta_a^{-1}\}$ (temperatures), etc.
The collection of SRB distributions obtained by
letting the parameters vary defines a “nonequilibrium
ensemble.”

In the stochastic case, the distribution $\mu$ is
required to be invariant in the sense that it can be
regarded as a marginal distribution of an invariant
distribution for the larger (deterministic) system
formed by the thermostats and the system itself.

For more details, the reader is referred to Evans 
and Morriss (1990), Ruelle (1999), and Eckmann 
et al. (1999). 


Nonequilibrium Thermodynamics 


The key problem of nonequilibrium statistical 
mechanics is to derive a macroscopic “nonequili- 
brium thermodynamics" in a way similar to the 
derivation of equilibrium thermodynamics from 
equilibrium statistical mechanics. 

The first difficulty is that nonequilibrium thermo- 
dynamics is not well understood. For instance, there 
is no (agreed upon) definition of entropy of a 
nonequilibrium stationary state, while it should be 
kept in mind that the effort to find the microscopic 
interpretation of equilibrium entropy, as defined by 
Clausius, was a driving factor in the foundations of 
equilibrium statistical mechanics. 

The importance of entropy in classical equilibrium
thermodynamics rests on the implication of universal,
parameter-free relations which follow from its
existence (e.g., $\partial_V(1/T) \equiv \partial_U(p/T)$ if U is the
internal energy, T the absolute temperature, and p
the pressure of a simple homogeneous material).
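
As a short worked illustration, the quoted relation follows from the equality of mixed second derivatives of the entropy, assuming the standard differential form $dS = (1/T)\,dU + (p/T)\,dV$ with S regarded as a function of U and V:

```latex
\begin{aligned}
dS &= \frac{1}{T}\,dU + \frac{p}{T}\,dV
\;\Longrightarrow\;
\partial_U S = \frac{1}{T}, \qquad \partial_V S = \frac{p}{T},\\
\partial_V \partial_U S &= \partial_U \partial_V S
\;\Longrightarrow\;
\partial_V\!\left(\frac{1}{T}\right) = \partial_U\!\left(\frac{p}{T}\right).
\end{aligned}
```

For a monoatomic ideal gas, where $1/T = 3 N k_B/(2U)$ and $p/T = N k_B/V$, both sides vanish identically, so the relation is satisfied without any adjustable parameter.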

Are there universal relations among averages of 
observables with respect to SRB distributions? 

The question has to be posed for systems “really”
out of equilibrium, that is, for $\Phi \ne 0$ (see [1]): in
fact, there is a well-developed theory of the
derivatives with respect to $\Phi$ of averages of
observables evaluated at $\Phi = 0$. The latter theory is
often called, and here we shall do so as well,
“classical nonequilibrium thermodynamics” or
“near-equilibrium thermodynamics” and it has
been quite successfully developed on the basis of
the notions of equilibrium thermodynamics, paying
particular attention to the macroscopic evolution of
systems described by macroscopic continuum equations
of motion.

“Stationary nonequilibrium statistical mechanics” 
will indicate a theory of the relations between 
averages of observables with respect to SRB dis- 
tributions. Systems so large that their volume 
elements can be regarded as being in locally 
stationary nonequilibrium states could also be 
considered. This would extend the familiar “local 
equilibrium states” of classical nonequilibrium ther- 
modynamics: however, they are not considered here. 
This means that we shall not attempt to find the 
macroscopic equations regulating the time evolution 
of continua locally in nonequilibrium stationary 
states but we shall only try to determine the 
properties of their “volume elements” assuming 
that the timescale for the evolution of large 
assemblies of volume elements is slow compared to 
the timescales necessary to reach local stationarity. 

For more details, the reader is referred to 
de Groot and Mazur (1984), Lebowitz (1993), 
Ruelle (1999, 2000), Gallavotti (1998, 2004), and 
Goldstein and Lebowitz (2004). 


Chaotic Hypothesis 


In equilibrium statistical mechanics, the ergodic 
hypothesis plays an important conceptual role as it 
implies that the motions of ergodic systems have an 
SRB statistics and that the latter coincides with the 
Liouville distribution on the energy surface. 

An analogous role has been proposed for the 
“chaotic hypothesis,” which states that the 


motion of a chaotic system, developing on its attracting 
set, can be regarded as an Anosov flow. 


This means that the attracting sets of chaotic 
systems, physically defined as systems with at least 
one positive Lyapunov exponent, can be regarded as 
smooth surfaces on which motion is highly unstable: 


1. Around every point, a curvilinear coordinate
system can be established which has three planes,
varying continuously with x, which are covariant
(i.e., they are coordinate planes at a point x
which are mapped, by the evolution $S_t$, into the
corresponding coordinate planes around $S_t x$).

2. The planes are of three types, “stable,” “unstable,”
and “marginal,” with respective positive dimensions
$d_s$, $d_u$, and 1: infinitesimal lengths on the
stable plane and on the unstable plane of any
point contract at exponential rate as time
proceeds towards the future or towards the past, respectively.
The length along the marginal direction neither
contracts nor expands (i.e., it varies around the
initial value staying bounded away from 0 and
$\infty$): its tangent vector is parallel to the flow. In
cases in which time evolution is discrete, and
determined by a map S, the marginal direction is
missing.

3. The contraction over a time t, positive for lines
on the stable plane and negative for those on the
unstable plane, is exponential, i.e., lengths are
contracted by a factor uniformly bounded by
$C e^{-\kappa |t|}$ with $C, \kappa > 0$.

4. There is a dense trajectory.
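
A standard textbook instance of a discrete-time Anosov system, not taken from this article, is the “cat map” of the 2-torus generated by the integer matrix with rows (2, 1) and (1, 1). The sketch below checks numerically that tangent vectors along its two eigendirections contract or expand at exponential rates, as in properties 2 and 3; all names are illustrative.

```python
import math


def eigen_data():
    """Eigenvalues and eigenvectors of the matrix ((2, 1), (1, 1))."""
    lam_u = (3.0 + math.sqrt(5.0)) / 2.0  # expanding rate, greater than 1
    lam_s = (3.0 - math.sqrt(5.0)) / 2.0  # contracting rate, smaller than 1
    # (1, lam - 2) solves (2 - lam) x + y = 0 for either eigenvalue.
    return (lam_s, (1.0, lam_s - 2.0)), (lam_u, (1.0, lam_u - 2.0))


def apply_tangent_map(vector, steps):
    """Apply the linearized map to a tangent vector `steps` times."""
    x, y = vector
    for _ in range(steps):
        x, y = 2.0 * x + y, x + y
    return x, y


def length(vector):
    return math.hypot(vector[0], vector[1])


(lam_s, v_s), (lam_u, v_u) = eigen_data()
stable_ratio = length(apply_tangent_map(v_s, 10)) / length(v_s)
unstable_ratio = length(apply_tangent_map(v_u, 10)) / length(v_u)
```

Here the stable and unstable planes are the two eigenlines, the same at every point, with $C = 1$ and $\kappa = \log\big((3 + \sqrt 5)/2\big)$; since the time is discrete there is no marginal direction.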


It has to be stressed that the chaotic hypothesis 
concerns physical systems: mathematically, it is 
very easy to find dynamical systems for which it 
does not hold, at least as easy as it is to find 
systems in which the ergodic hypothesis does not 
hold (e.g., harmonic lattices or blackbody radiation).
However, if suitably interpreted, the ergodic
hypothesis leads, even for these systems, to physically
correct results (the specific heats at high
temperature, the Rayleigh-Jeans distribution at low
frequencies). Moreover, the failures of the ergodic
hypothesis in physically important systems have led 
to new scientific paradigms (like quantum 
mechanics from the specific heats at low tempera- 
ture and Planck’s law). 

Since physical systems are almost always not 
Anosov systems, it is very likely that probing 
motions in extreme regimes will make visible the 
features that distinguish Anosov systems from non- 
Anosov systems, much as it happens with the 
ergodic hypothesis. 

The interest of the hypothesis is to provide a
framework in which properties like the existence of
an SRB distribution are a priori guaranteed, together
with an expression for it which can be used to work
with formal expressions of the averages of the
observables: the role of Anosov systems in chaotic
dynamics is similar to the role of harmonic oscillators
in the theory of regular motions. They are the
paradigm of chaotic systems, as the harmonic
oscillators are the paradigm of order. Of course, the
hypothesis is only a beginning and one has to learn
how to extract information from it, as was the case
with the use of the Liouville distribution, once the
ergodic hypothesis guaranteed that it was the
appropriate distribution for the study of the statistics
of motions in equilibrium situations.

For more details, the reader is referred to Ruelle 
(1976), Gallavotti and Cohen (1995), Ruelle (1999), 
Gallavotti (1998), and Gallavotti et al. (2004). 


Heat, Temperature, and Entropy 
Production 


The amount of heat $\dot Q$ that a system produces while in
a stationary state is naturally identified with the work
that the thermostat forces $\vartheta$ perform per unit time

$$\dot Q = \sum_i \vartheta_i \cdot \dot x_i \qquad [5]$$

A system may be in contact with several reservoirs:
in models, this will be reflected by a decomposition

$$\vartheta = \sum_{a=1}^{m} \vartheta^{(a)}(X, \dot X) \qquad [6]$$

where $\vartheta^{(a)}$ is the force due to the ath thermostat and
depends on the coordinates of the particles which
are in a region $\Lambda_a \subset C_0$ of a decomposition
$\cup_{a=1}^{m} \Lambda_a = C_0$ of the container $C_0$ occupied by the
system ($\Lambda_a \cap \Lambda_{a'} = \emptyset$ if $a \ne a'$).

From several studies based on simulations of finite
thermostatted systems of particles arose the proposal
to consider the average of the phase-space contraction
$\sigma^{(a)}(X, \dot X)$ due to the ath thermostat

$$\sigma^{(a)}(X, \dot X) \overset{\mathrm{def}}{=} \sum_j \partial_{\dot x_j} \cdot \vartheta^{(a)}_j(X, \dot X) \qquad [7]$$

and to identify it with the rate of entropy creation in
the ath thermostat.

Another key notion in thermodynamics is the
temperature of a reservoir; in the infinite deterministic
thermostat case of the section “Nonequilibrium,”
it is defined as $(k_B \beta_a)^{-1}$, but in the finite
deterministic thermostats considered here it needs to
be defined. If there are m reservoirs with which the
system is in contact, one sets

$$\sigma^{(a)}_+ \overset{\mathrm{def}}{=} \langle \sigma^{(a)} \rangle_{SRB} \equiv \int \sigma^{(a)}(X, \dot X)\, \mu(dX\, d\dot X), \qquad \dot Q_a \overset{\mathrm{def}}{=} \Big\langle \sum_j \vartheta^{(a)}_j \cdot \dot x_j \Big\rangle_{SRB} \qquad [8]$$

where $\mu$ is the SRB distribution describing the
stationary state. It is natural to define the absolute
temperature of the ath thermostat to be

$$T_a = \frac{\dot Q_a}{k_B\, \sigma^{(a)}_+} \qquad [9]$$

It is not clear that $T_a > 0$: this happens in a rather
general class of models and it would be desirable, for
the interpretation that is proposed here, that it could
be considered a property to be added to the requirements
that the forces $\vartheta^{(a)}$ be thermostat models.

An important class of thermostats for which the
property $T_a > 0$ holds can be described as follows.
Imagine N particles in a container $C_0$ interacting via
a potential $V_0 = \sum_{i<j} \varphi(q_i - q_j) + \sum_j V'(q_j)$ (where
$V'$ models external conservative forces like obstacles,
walls, gravity, ...) and, furthermore, interacting
with M other systems $\Sigma_a$ of $N_a$ particles of mass
$m_a$, in containers $C_a$ contiguous to $C_0$. The latter
will model M parts of the system in contact with
thermostats at temperatures $T_a$, $a = 1, \ldots, M$.

The coordinates of the particles in the ath system
$\Sigma_a$ will be denoted $x^a_j$, $j = 1, \ldots, N_a$, and they will
interact with each other via a potential
$V_a = \sum_{i<j}^{N_a} \varphi_a(x^a_i - x^a_j)$. Furthermore, there will be an
interaction between the particles of each thermostat
and those of the system via potentials
$W_a = \sum_{i=1}^{N} \sum_{j=1}^{N_a} w_a(q_i - x^a_j)$, $a = 1, \ldots, M$.

The potentials will be assumed to be either hard
core or nonsingular potentials and the external $V'$ is
supposed to be at least such that it forbids the
existence of obvious constants of motion.

The temperature of each $\Sigma_a$ will be defined by
the total kinetic energy of its particles, that
is, by $K_a = \sum_{j=1}^{N_a} \frac{1}{2} m_a (\dot x^a_j)^2 \overset{\mathrm{def}}{=} \frac{3}{2} N_a k_B T_a$: the
particles of the ath thermostat will be kept at
constant temperature by further forces $\vartheta^a_j$. The latter
are defined by imposing via a Gaussian constraint
that $K_a$ is a constant of motion (see [3] with $\psi \equiv K_a$).
This means that the equations of motion are

$$m \ddot q_j = -\partial_{q_j}\Big(V_0(Q) + \sum_{a=1}^{M} W_a(Q, x^a)\Big)$$
$$m_a \ddot x^a_j = -\partial_{x^a_j}\big(V_a(x^a) + W_a(Q, x^a)\big) - \vartheta^a_j \qquad [10]$$

and an application of Gauss’ principle yields

$$\vartheta^a_j = \frac{L_a - \dot V_a}{3 N_a k_B T_a}\, \dot x^a_j \overset{\mathrm{def}}{=} \alpha^a\, \dot x^a_j$$

where $L_a$ is the work per unit time done by the
particles in $C_0$ on the particles of $\Sigma_a$ and $V_a$ is their
potential energy.

In this case, the partial divergence $\sigma^a = (3N_a - 1)\alpha^a$
is, up to a constant factor $(1 - 1/(3N_a))$,

$$\frac{L_a}{k_B T_a} - \frac{\dot V_a}{k_B T_a}$$

and it will make [9] identically satisfied with $T_a > 0$
because $L_a$ can be naturally interpreted as heat $\dot Q_a$
ceded, per unit time, by the particles in $C_0$ to the
subsystem $\Sigma_a$ (hence to the ath thermostat because
the temperature of $\Sigma_a$ is constant), while the
derivative of $V_a$ will not contribute to the value of
$\sigma^a_+$. The phase-space contraction rate is, neglecting
the total derivative terms (and $O(N_a^{-1})$),

$$\sigma_{true}(X, \dot X) = \sum_{a=1}^{M} \frac{\dot Q_a}{k_B T_a} \qquad [11]$$

where the subscript “true” is to remind that an
additive total derivative term distinguishes it from
the complete phase-space contraction.
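
The factor $3N_a - 1$ in the partial divergence can be checked numerically. The sketch below makes simplifying assumptions that are not in the text: unit masses, a single thermostat with d = 6 velocity components, and a force G that does not depend on the velocities, so that $\alpha(v) = G \cdot v / (v \cdot v)$. It estimates the divergence of the velocity field $G - \alpha(v)\,v$ by central finite differences.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def alpha(velocity, force):
    """Gaussian multiplier keeping the kinetic energy (1/2) v.v constant."""
    return dot(force, velocity) / dot(velocity, velocity)


def thermostatted_acceleration(velocity, force):
    """Right-hand side of dv/dt = G - alpha(v) v for unit masses."""
    a = alpha(velocity, force)
    return [g - a * v for g, v in zip(force, velocity)]


def contraction_rate(velocity, force, h=1e-5):
    """Minus the velocity-space divergence, by central finite differences."""
    divergence = 0.0
    for k in range(len(velocity)):
        plus = list(velocity)
        minus = list(velocity)
        plus[k] += h
        minus[k] -= h
        f_plus = thermostatted_acceleration(plus, force)[k]
        f_minus = thermostatted_acceleration(minus, force)[k]
        divergence += (f_plus - f_minus) / (2.0 * h)
    return -divergence


# Hypothetical data for two particles in three dimensions (d = 6).
velocity = [1.0, 0.5, -0.2, 0.3, -1.1, 0.7]
force = [1.0, 0.5, 0.3, -0.2, -0.4, 0.9]
sigma_numeric = contraction_rate(velocity, force)
sigma_expected = (len(velocity) - 1) * alpha(velocity, force)
```

The two numbers agree to the accuracy of the finite differences: the term $d\,\alpha$ comes from differentiating v, and the correction $-\alpha$ from the dependence of $\alpha$ itself on the velocities.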


Remarks 


(i) The above formula provides the motivation for the
name “entropy creation rate” attributed to the
phase-space contraction $\sigma$. Note that in this way
the definition of entropy creation is “reduced” to
the equilibrium notion because what is being
defined is the entropy increase of the thermostats,
which have to be considered in equilibrium. No
attempt is made here to define either the entropy
of the stationary state or the notion of temperature
of the nonequilibrium system in $C_0$ (the $T_a$
are temperatures of the $\Sigma_a$, not of the particles in
$C_0$). This is an important point as it leaves open
the possibility of envisaging the notion of “local
equilibrium,” which becomes necessary in the
approximation (not considered here) in which
the system is regarded as a continuum.

(ii) In the above model, another viewpoint is
possible: that is, to consider the system to
consist of only the N particles in $C_0$ and the M
systems $\Sigma_a$ to be thermostats. From this point of
view, it can be considered a model of a system
subject to thermostats. The Gibbs distribution
characterizing the infinite thermostats of the
section “Nonequilibrium” becomes in this case
the constraint that the kinetic energies $K_a$ are
constants, enforced by the Gaussian forces. In
the new viewpoint, the appropriate definition of
the entropy creation rate should be simply the
right-hand side (RHS) of [11], i.e., the work per
unit time done by the forces of the system on the
thermostats divided by the temperature of the
thermostats. This suggests a different and general
definition of entropy creation rate, applying also
to thermostats that are often considered “more
physical,” and that needs to be further investigated.
In the example [10] the new definition differs from the
phase-space contraction rate by a total time
derivative, i.e., rather trivially for the purposes of
the following.


For more details, the reader is referred to Evans 
and Morriss (1990), Gallavotti and Cohen (1995), 
Ruelle (1996, 1997), and Gallavotti (2004). 


Thermodynamic Fluxes and Forces 


Nonequilibrium stationary states depend upon
external parameters $\varphi_i$ like the temperatures $T_a$ of
the thermostats or the size of the force parameters
$\Phi = (\varphi_1, \ldots, \varphi_q)$, see [1]. Nonequilibrium thermodynamics
is well developed at “low forcing”: strictly
speaking, this means that it is widely believed that
we understand the properties of the derivatives of
the averages of observables with respect to the
external parameters if evaluated at $\varphi_i = 0$. Important
notions are the notions of thermodynamic fluxes $J_i$
and of thermodynamic forces $\varphi_i$; hence, it seems
important to extend such notions to nonequilibrium
systems (i.e., $\Phi \ne 0$).

A possible extension could be to define the
thermodynamic flux $J_i$ associated with a force $\varphi_i$ as
$J_i = \langle \partial_{\varphi_i} \sigma \rangle_{SRB}$ where $\sigma(X, \dot X; \Phi)$ is the volume
contraction per unit time. This definition seems
appropriate in several concrete cases that have been
studied and it is appealing for its generality.

An interesting example is provided by the model
of a thermostatted system in [10]: if the container of
the system is a box with periodic boundary conditions,
one can imagine adding an extra constant
force E acting on the particles in the container.
Imagining the particles to be charged by a charge e
and regarding such force as an electric field, the first
equation in [10] is modified by the addition of a
term eE.

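The constraints on the thermostat temperatures imply
that $\sigma$ depends also on E: in fact, if $J = e \sum_j \dot q_j$ is the
electric current, energy balance implies
$\dot U_{tot} = E \cdot J - \sum_a (L_a - \dot V_a)$ if $U_{tot}$ is the sum of all kinetic and
potential energies. Then, the phase-space contraction

$$\sum_a \frac{L_a - \dot V_a}{k_B T_a}$$

can be written, to first order in the temperature
variations $\delta T_a$ with respect to a common value
$T_a = T$, as

$$-\sum_a \frac{L_a - \dot V_a}{k_B T}\, \frac{\delta T_a}{T} + \frac{E \cdot J - \dot U_{tot}}{k_B T}$$

hence $\sigma_{true}$, see [11], is

$$\sigma_{true} = \frac{E \cdot J}{k_B T} - \sum_a \frac{\dot Q_a}{k_B T}\, \frac{\delta T_a}{T} \qquad [12]$$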
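
The first-order expansion behind [12] can be spelled out as follows, under the assumption (made here for definiteness) that the thermostat temperatures are written as $T_a = T + \delta T_a$:

```latex
\begin{aligned}
\frac{1}{T_a} &= \frac{1}{T + \delta T_a}
 = \frac{1}{T}\Big(1 - \frac{\delta T_a}{T}\Big) + O(\delta T_a^2),\\
\sum_a \frac{L_a - \dot V_a}{k_B T_a}
 &\simeq \frac{1}{k_B T}\sum_a (L_a - \dot V_a)
 - \sum_a \frac{L_a - \dot V_a}{k_B T}\,\frac{\delta T_a}{T},\\
\sum_a (L_a - \dot V_a) &= E\cdot J - \dot U_{tot}.
\end{aligned}
```

Dropping the total time derivatives $\dot U_{tot}$ and $\dot V_a$, which do not contribute to stationary averages, and identifying $L_a$ with the heat $\dot Q_a$ leaves the electric field paired with the current J and each $\delta T_a$ paired with a heat flow, in line with the flux-force pairing discussed in this section.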


The definition and extension of the conjugacy
between thermodynamic forces and fluxes is compatible
with the key results of classical nonequilibrium
thermodynamics, at least as far as Onsager
reciprocity and Green-Kubo's formulas are concerned.
It can be checked that if the equilibrium
system is reversible, that is, if there is an isometry I
on phase space which anticommutes with the
evolution ($I S_t = S_{-t} I$ in the case of continuous-time
dynamics $t \to S_t$ or $I S = S^{-1} I$ in the case of discrete-time
dynamics S), then, shortening $(X, \dot X)$ into x,

$$L_{ij} \overset{\mathrm{def}}{=} \partial_{\varphi_j} J_i \big|_{\Phi=0} = \partial_{\varphi_j} \langle \partial_{\varphi_i} \sigma(x; \Phi) \rangle_{SRB} \big|_{\Phi=0} = \partial_{\varphi_i} J_j \big|_{\Phi=0} = L_{ji} = \frac{1}{2} \int_{-\infty}^{\infty} \langle \partial_{\varphi_j} \sigma(S_t x; \Phi)\, \partial_{\varphi_i} \sigma(x; \Phi) \rangle_{SRB} \big|_{\Phi=0}\, dt \qquad [13]$$

The $\sigma(x; \Phi)$ plays the role of “Lagrangian” generating
the duality between forces and fluxes. The
extension of the duality just considered might be of
interest in situations in which $\Phi \ne 0$.

For more details, the reader is referred to de Groot 
and Mazur (1984), Gallavotti (1996), and Gallavotti 
and Ruelle (1997). 


Fluctuations 


As in equilibrium, large statistical fluctuations of 
observables are of great interest and already there is, 
at the moment, a rather large set of experiments 
dedicated to the analysis of large fluctuations in 
stationary states out of equilibrium. 

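If one defines the dimensionless phase-space
contraction

$$p(x) = \frac{1}{\tau} \int_0^{\tau} \frac{\sigma(S_t x)}{\sigma_+}\, dt \qquad [14]$$

(see also [11]), where $\sigma_+$ denotes the SRB average of $\sigma$, then there exists $p^* \ge 1$ such that the
probability $P_\tau$ of the event $p \in [a, b]$ with $[a, b] \subset
(-p^*, p^*)$ has the form

$$P_\tau(p \in [a, b]) = \mathrm{const}\; e^{\tau \max_{p \in [a, b]} \zeta(p) + O(1)} \qquad [15]$$

with $\zeta(p)$ analytic in $(-p^*, p^*)$. The function $\zeta(p)$ can
be conveniently normalized to have value 0 at p = 1
(i.e., at the average value of p).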

Then, in Anosov systems which are reversible and
dissipative (see the previous section), a general
symmetry property, called the “fluctuation theorem”
and reflecting the reversibility symmetry, yields the
parameterless relation

$$\zeta(-p) = \zeta(p) - p\, \sigma_+, \qquad p \in (-p^*, p^*) \qquad [16]$$

This relation is interesting because it has no free
parameters; in other words, it is universal for
reversible dissipative Anosov systems. In connection
with the flux-force duality in the previous section, it
can be checked to reduce to the Green-Kubo
formula and to Onsager reciprocity, see [13], in the
case in which the evolution depends on several fields
$\Phi$ and $\Phi \to 0$ (of course the relation becomes trivial
as $\Phi \to 0$ because $\sigma_+ \to 0$ and to obtain the result
one has first to divide both sides by suitable powers
of the fields $\Phi$).
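
A simple way to see what the symmetry $\zeta(-p) = \zeta(p) - p\,\sigma_+$ requires is to test it on a quadratic rate function. The Gaussian form $\zeta(p) = -\sigma_+ (p-1)^2/4$ used below is an assumption made only for illustration (in general $\zeta$ is not quadratic); it corresponds to p being normally distributed with mean 1 and variance $2/(\tau\sigma_+)$.

```python
import math


def zeta_gaussian(p, sigma_plus):
    """Hypothetical quadratic rate function, normalized so that zeta(1) = 0."""
    return -sigma_plus * (p - 1.0) ** 2 / 4.0


def symmetry_defect(p, sigma_plus):
    """zeta(-p) - zeta(p) + p * sigma_plus, which vanishes if the symmetry holds."""
    return zeta_gaussian(-p, sigma_plus) - zeta_gaussian(p, sigma_plus) + p * sigma_plus


def mean_exponential(tau, sigma_plus):
    """Average of exp(-tau * p * sigma_plus) for Gaussian p.

    Uses E[exp(-s p)] = exp(-s * mean + s**2 * variance / 2) with mean 1
    and variance 2 / (tau * sigma_plus).
    """
    s = tau * sigma_plus
    variance = 2.0 / (tau * sigma_plus)
    return math.exp(-s + 0.5 * s * s * variance)


defect = symmetry_defect(0.7, 2.5)
average = mean_exponential(tau=40.0, sigma_plus=2.5)
```

The curvature of the parabola is tied to $\sigma_+$: any other variance would violate the symmetry, which is how the relation constrains fluctuations around the mean without free parameters. In this Gaussian case the average of $e^{-\tau p \sigma_+}$ equals 1 exactly.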

A more informal (but imprecise) way of writing
[15] and [16] is

$$\frac{P_\tau(p)}{P_\tau(-p)} = e^{\tau p\, \sigma_+ + O(1)}, \qquad \text{for all } p \in (-p^*, p^*) \qquad [17]$$

where $P_\tau(p)$ is the probability density of p. An
obvious but interesting consequence of [17] is that

$$\langle e^{-\tau p\, \sigma_+} \rangle_{SRB} = 1$$

in the sense that $\frac{1}{\tau} \log \langle e^{-\tau p\, \sigma_+} \rangle_{SRB} \to 0$ as $\tau \to \infty$.

Occasionally, systems with singularities have to be
considered. In such cases, the relation [16] may
change in the sense that the function $\zeta(p)$ may not be
analytic: in such cases, one expects that the relation
holds in the largest analyticity interval symmetric
around the origin. In Anosov systems and also in
various cases considered in the literature, such an
interval appears to contain the interval $(-1, 1)$.

Note that in the theory of fluctuations of the time
averages p we can replace $\sigma$ by any other bounded
quantity which differs from it by a total time derivative: hence, in the
example discussed above, it can be replaced by $\sigma_{true}$,
see [12], which has a natural physical meaning.

It is important to remark that the above fluctuation
relation is the first representative of several
consequences of the reversibility and chaotic
hypotheses. For instance, given $F_1, \ldots, F_n$ arbitrary
observables which are (say) odd under time reversal
I (i.e., $F(Ix) = -F(x)$) and given n functions
$t \in [-\tau/2, \tau/2] \to \varphi_j(t)$, $j = 1, \ldots, n$, one can ask which
is the probability that $F_j(S_t x)$ “closely follows” the
“pattern” $\varphi_j(t)$ and at the same time

$$\frac{1}{\tau} \int_{-\tau/2}^{\tau/2} \frac{\sigma(S_\theta x)}{\sigma_+}\, d\theta$$

has value p. Then calling $P_\tau(F_1 \sim \varphi_1, \ldots, F_n \sim \varphi_n, p)$
the probability of this event, which we write in the
imprecise form corresponding to [17] for simplicity,
and defining $I\varphi_j(t) \overset{\mathrm{def}}{=} -\varphi_j(-t)$, it is

$$\frac{P_\tau(F_1 \sim \varphi_1, \ldots, F_n \sim \varphi_n, p)}{P_\tau(F_1 \sim I\varphi_1, \ldots, F_n \sim I\varphi_n, -p)} = e^{\tau \sigma_+ p}, \qquad p \in (-p^*, p^*) \qquad [18]$$

which is remarkable because it is parameterless and
at the same time surprisingly independent of the
choice of the observables $F_j$. The relation [18] has
far-reaching consequences: for instance, if n = 1 and
$F_1 = \partial_{\varphi_i} \sigma(x; \Phi)$ the relation [18] has been used to
derive the mentioned Onsager reciprocity and
Green-Kubo's formulas at $\Phi = 0$.

Equation [18] can be read as follows: the
probability that the observables $F_j$ follow given
evolution patterns $\varphi_j$, conditioned to entropy creation
rate $p\sigma_+$, is the same as the probability that they follow the
time-reversed patterns if conditioned to entropy creation
rate $-p\sigma_+$. In other words, to change the sign of
time, it is just sufficient to reverse the sign of
entropy creation rate, no “extra effort” is needed.

For more details, the reader is referred to Sinai
(1972, 1994), Evans et al. (1993), Gallavotti and
Cohen (1995), Gallavotti (1996, 1999), Gallavotti
and Ruelle (1997), Gallavotti et al. (2004), and
Bonetto et al. (2005).


Fractal Attractors, Pairing, 
and Time Reversal 


Attracting sets (i.e., sets which are the closure of 
attractors) are fractal in most dissipative systems. 
However, the chaotic hypothesis assumes that 
fractality can be neglected. Apart from the very 
interesting cases of systems close to equilibrium, in 
which the closure of an attractor is the whole phase 
space (under the chaotic hypothesis, i.e., if the 
system is Anosov), hence not fractal, serious 
problems arise in preserving validity of the fluctua- 
tion theorem. 

The reason is very simple: if the attractor closure 
is smaller than phase space, then it is to be expected 
that time reversal will change the attractor into a 
repeller disjoint from it. Thus, even if the chaotic 
hypothesis is assumed, so that the attracting set 
A can be considered a smooth surface, the motion 
on the attractor will not be time-reversal symmetric 
(as its time-reversal image will develop on the 
repeller). One can say that an attracting set with 
dimension lower than that of phase space in a time- 
reversible system corresponds to a spontaneous 
breakdown of time-reversal symmetry. 

It has been noted, however, that there are classes
of systems, forming a large set in the space of
evolutions depending on a parameter $\Phi$, in which
geometric reasons imply that if beyond a critical
value $\Phi_c$ the attracting set becomes smaller than
phase space, then a map $I_P$ is generated mapping the
attractor $A$ into the repeller $R$, and vice versa, such
that $I_P^2$ is the identity on $A \cup R$ and $I_P$ commutes
with the evolution: therefore, the composition $I \cdot I_P$
is a time-reversal symmetry (i.e., it anticommutes
with evolution) for the motions on the attracting set
$A$ (as well as on the repeller $R$).

In other words, the time-reversal symmetry in
such systems “cannot be broken”: if spontaneous
breakdown occurs (i.e., $A$ is not mapped into itself
under time reversal $I$), a new symmetry $I_P$ is
spawned and $I \cdot I_P$ is a new time-reversal symmetry
(an analogy with the spontaneous violation of time
reversal in quantum theory, where time reversal T is
violated but TCP is still a symmetry: so T plays the
role of $I$ and CP that of $I_P$).

Thus, a fluctuation relation will hold for the 
phase-space contraction of the motions taking place 
on the attracting set for the class of systems with the 
geometric property mentioned above (technically, 
the latter is called “axiom C” property).

This is interesting but it still is quite far from
being checkable even in numerical experiments.
There are nevertheless systems in which a “pairing
property” also holds: this means that, considering
the case of discrete-time maps $S$, the Jacobian matrix
$\partial_x S(x)$ has $2N$ eigenvalues that can be labeled,
in decreasing order, $\lambda_N(x), \ldots, \lambda_{N/2}(x), \ldots, \lambda_1(x)$,
with the remarkable property that $\frac{1}{2}(\lambda_{N-j}(x) + \lambda_j(x))$
is $j$-independent. In such systems, a
relation can be established between phase-space
contractions in the full phase space and on the
surface of the attracting set: the fluctuation theorem
for the motion on the attracting set can therefore
be related to the properties of the fluctuations of
the total phase-space contraction measured on the
attracting set (which includes the contraction transversal
to the attracting set) and, if $2M$ is the
attracting set dimension and $2N$ is the total
dimension of phase space, it is, in the analyticity
interval $(-p^*, p^*)$ of the function $\zeta(p)$,

$$\zeta(-p) = \zeta(p) - p\,\frac{M}{N}\,\sigma_+ \qquad [19]$$


which is an interesting relation. It is, however, very
difficult to test in mechanical systems because in
such systems it seems very difficult to make the field
so high as to see an attracting set thinner than the
whole phase space and still observe large
fluctuations.

For more details, the reader is referred to Dettmann
and Morriss (1996) and Gallavotti (1999).


Nonequilibrium Ensembles 
and Their Equivalence 


Given a chaotic system, the collection of the SRB 
distributions associated with the various control 
parameters (volume, density, external forces,...) 
forms an “ensemble” describing the possible sta- 
tionary states of the system and their statistical 
properties. 

As in equilibrium, one can imagine that the
system can be described equivalently in several
ways, at least when the system is large (“in the
thermodynamic” or “macroscopic limit”). In nonequilibrium,
equivalence can be quite different and
more structured than in equilibrium because one can
imagine changing not only the control parameters
but also the thermostatting mechanism.

It is intuitive that a system may behave in the 
same way under the influence of different thermo- 
stats: the important phenomenon being the extrac- 
tion of heat and not the way in which it is extracted 
from the system. Therefore, one should ask when 
two systems are “physically equivalent," that is, 
when the SRB distributions associated with them 
give the same statistical properties for the same 
observables, at least for the very few observables 
which are macroscopically relevant. The latter may 
be a few more than the usual ones in equilibrium 
(temperature, pressure, density, etc.) and include 
currents, conductivities, viscosities, etc., but they
will always be very few compared to the (infinite) 
number of functions on phase space. 

As an example, consider a system of $N$ interacting
particles (say hard spheres) of mass $m$ moving in a
periodic box $C_0$ of side $L$ containing a regular array
of spherical scatterers (a basic model for electrons in
a crystal) which reflect particles elastically and are
arranged so that no straight line exists in $C_0$ which
avoids the obstacles (to eliminate obvious constants
of motion). An external field $E\,u$ also acts along the
$u$-direction: hence, the equations of motion are

$$m\,\ddot{x}_i = f_i + E\,u - \vartheta_i \qquad [20]$$

where $f_i$ are the interparticle forces and those
between scatterers and particles, and $\vartheta_i$ are the
thermostatting forces. The following thermostat
models have been considered:

1. $\vartheta_i = \nu\,\dot{x}_i$ (viscosity thermostat),

2. immediately after elastic collision with an obstacle
the velocity is rescaled to a prefixed value
$\sqrt{3 k_B T/m}$ for some $T$ (Drude's thermostat),

3. $\vartheta_i = \dfrac{E\,u \cdot \sum_j \dot{x}_j}{\sum_j \dot{x}_j^2}\,\dot{x}_i$ (Gauss' thermostat).


The first two are not reversible, or at least not
manifestly so, because the natural time reversal,
that is, change of velocity sign, is not a symmetry
(there might, however, be more hidden, hitherto
unknown, symmetries which anticommute with
time evolution). The third is reversible and time
reversal is just the change of the velocity sign. The
third thermostat model generates a time evolution in
which the total kinetic energy $K$ is constant.
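
The conservation of $K$ under the third thermostat can be checked numerically. The following is a minimal sketch in Python (NumPy), under simplifying assumptions that are not in the text: the forces $f_i$ are dropped (elastic collisions would not change $K$ anyway), the multiplier is taken as $\alpha = E\,u \cdot \sum_j \dot{x}_j / \sum_j \dot{x}_j^2$, and the function names and the Runge-Kutta integrator are illustrative choices.

```python
import numpy as np

def gauss_thermostat_rhs(v, E, u, m=1.0):
    """Accelerations for m dv_i/dt = E u - alpha v_i (forces f_i omitted)."""
    # Gauss multiplier: chosen so that sum_i v_i . (E u - alpha v_i) = 0
    alpha = E * np.sum(v @ u) / np.sum(v * v)
    return (E * u - alpha * v) / m

def kinetic_energy(v, m=1.0):
    """Total kinetic energy K = (m/2) sum_i |v_i|^2."""
    return 0.5 * m * np.sum(v * v)

def integrate(v0, E, u, dt=1e-3, steps=2000, m=1.0):
    """Fourth-order Runge-Kutta integration of the velocities only."""
    v = v0.copy()
    for _ in range(steps):
        k1 = gauss_thermostat_rhs(v, E, u, m)
        k2 = gauss_thermostat_rhs(v + 0.5 * dt * k1, E, u, m)
        k3 = gauss_thermostat_rhs(v + 0.5 * dt * k2, E, u, m)
        k4 = gauss_thermostat_rhs(v + dt * k3, E, u, m)
        v = v + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v0 = rng.normal(size=(5, 3))       # 5 particles in 3 dimensions
    u = np.array([1.0, 0.0, 0.0])      # unit vector along the field
    v1 = integrate(v0, E=1.0, u=u)
    print(kinetic_energy(v0), kinetic_energy(v1))
```

In this sketch the two printed energies should agree up to the integration error, while the total velocity along $u$ grows: the thermostat removes exactly the work done by the field.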

Let $\mu_\nu, \mu_T, \mu_K$ be the SRB distributions for the
system in a container $C_0$ with volume $|C_0| = L^3$ and
density $\rho = N/L^3$ fixed. Imagine tuning the values
of the control parameters $\nu, T, K$ in such a way that
$\langle \text{kinetic energy} \rangle_\mu = \mathcal{E}$, with the same $\mathcal{E}$ for
$\mu = \mu_\nu, \mu_T, \mu_K$, and consider a local observable $F(X, \dot{X}) > 0$
depending only on the coordinates of the particles
located in a region $\Lambda \subset C_0$. Then a reasonable
conjecture is that

$$\lim_{L \to \infty,\ N/L^3 = \rho} \frac{\mu_\nu(F)}{\mu_T(F)} = \lim_{L \to \infty,\ N/L^3 = \rho} \frac{\mu_T(F)}{\mu_K(F)} = 1 \qquad [21]$$

if the limits are taken at fixed $F$ (hence at fixed $\Lambda$
while $L \to \infty$). The conjecture is an open
problem: it illustrates, however, the kind of questions
arising in nonequilibrium statistical mechanics.
For more details, the reader is referred to Evans 
and Sarman (1993), Gallavotti (1999), and Ruelle 
(2000). 


Outlook 


The subject is (clearly) at a very early stage of 
development. 


1. The theory can be extended to stochastic thermo- 
stats quite satisfactorily, at least as far as the 
fluctuation theorem is concerned. 

2. Remarkable works have appeared on the theory 
of systems which are purely Hamiltonian and 
(therefore) with thermostats that are infinite: 
unfortunately, the infinite thermostats can be 
treated, so far, only if their particles are “free” at 
infinity (either free gases or harmonic lattices). 

3. The notion of entropy turns out to be extremely 
difficult to extend to stationary states and there 
are even doubts that it could be actually 
extended. Conceptually, this is certainly a major 
open problem. 

4. The statistical properties of stationary states out of 
equilibrium are still quite mysterious and surpris- 
ing: some exactly solvable models have appeared 
recently, and attempts have been made at unveil- 
ing the deep reasons for their solubility and at 
deriving from them general guiding principles. 

5. Numerical simulations have given a strong 
impulse to the subject; in fact, one can even say 
that they created it: introducing the model of 
thermostat as an extra microscopic force acting on 
the particles and providing the first reliable results 
on the properties of systems out of equilibrium. 
Simulations continue to be an essential part of the 
effort of research on the field. 

6. Approach to stationarity leads to many important
questions: is there a Lyapunov function
measuring the distance between an evolving
state and the stationary state towards which it
evolves? In other words, can one define an
analog of Boltzmann's H-function? About this
question there have been proposals and the answer
seems affirmative, but it does not seem
possible to find a universal, system-independent
function of this kind (the search for it is related to the problem
of defining an entropy function for stationary
states: its existence is at least controversial; see the
sections “Nonequilibrium thermodynamics” and
“Chaotic hypothesis”).

7. Studying nonstationary evolution is much harder.
The problem arises when the control parameters
(force, volume, ...) change with time and the
system “undergoes a process.” As an example, one
can ask how irreversible a given
irreversible process is, in which the initial state $\mu_0$ is a
stationary state at time $t = 0$, and the external
parameters $\Phi_0$ start changing into functions $\Phi(t)$
of $t$ and tend to a limit $\Phi_\infty$ as $t \to \infty$. In this case,
the stationary distribution $\mu_0$ starts changing and
becomes a function $\mu_t$ of $t$ which is not stationary
but approaches another stationary distribution $\mu_\infty$
as $t \to \infty$. The process is, in general, irreversible
and the question is how to measure its “degree of
irreversibility”: for simplicity we restrict attention
to very special processes in which the only
phenomenon is heat production because the
container does not change volume and the energy
also remains constant, so that the motion can be
described at all times as taking place on a fixed
energy surface. A natural quantity $\mathcal{I}$ associated
with the evolution from an initial stationary state
to a final stationary state through a change in the
control parameters can be defined as follows.
Consider the distribution $\mu_t$ into which $\mu_0$ evolves
in time $t$, and consider also the SRB distribution
$\mu_{SRB,t}$ corresponding to the control parameters
“frozen” at the value at time $t$, that is, $\Phi(t)$. Let
the phase-space contraction, when the forces are
“frozen” at the value $\Phi(t)$, be $\sigma_t(x) = \sigma(x; \Phi(t))$. In
general $\mu_t \neq \mu_{SRB,t}$. Then,

$$\mathcal{I}(\{\Phi(t)\}, \mu_0, \mu_\infty) = \int_0^\infty \big(\mu_t(\sigma_t) - \mu_{SRB,t}(\sigma_t)\big)^2\, dt \qquad [22]$$

can be called the degree of irreversibility of the
process: it has the property that in the limit of
infinitely slow evolution of $\Phi(t)$, for example, if
$\Phi(t) = \Phi_0 + (1 - e^{-\gamma t})\Delta$ (a quasistatic evolution
on timescale $\gamma^{-1}$ from $\Phi_0$ to $\Phi_\infty = \Phi_0 + \Delta$),
the irreversibility degree $\mathcal{I}_\gamma \to 0$ as $\gamma \to 0$ if (as in the case
of Anosov evolutions, hence under the chaotic
hypothesis) the approach to a stationary state is
exponentially fast at fixed external forces $\Phi$. The
quantity $\mathcal{I}^{-1}$ is a time scale which could be
interpreted as the time needed for the process to
exhibit its irreversible nature.


The entire subject is dominated by the initial 
insights of Onsager on classical nonequilibrium 
thermodynamics, which concern the properties of 
the infinitesimal deviations from equilibrium (i.e., 
averages of observables differentiated with respect 
to the control parameters and evaluated at $\Phi = 0$).
The present efforts are devoted to studying properties
at $\Phi \neq 0$. In this direction, the classical theory
certainly provides firm constraints (like Onsager
reciprocity, the Green-Kubo relations, or the fluctuation-dissipation
theorem) but, at a technical level, it gives
little help to enter the terra incognita of nonequilibrium
thermodynamics of stationary states.

For more details, the reader is referred to 
Kurchan (1998), Lebowitz and Spohn (1999), 
Maes (1999), Eckmann et al. (1999), Bonetto 
et al. (2000, 2005), Eckmann and Young (2005), 
Derrida et al. (2001), Bertini et al. (2001), Evans 
and Morriss (1990), Evans et al. (1993), Goldstein 
and Lebowitz (2004), and Gallavotti (2004). 


See also: Adiabatic Piston; Chaos and Attractors; 
Ergodic Theory; Lie, Symplectic, and Poisson Groupoids 
and Their Lie Algebroids; Macroscopic Fluctuations and 
Thermodynamic Functionals; Nonequilibrium Statistical 
Mechanics: Dynamical Systems Approach; Quantum 
Dynamical Semigroups; Random Dynamical Systems. 
Time Evolution of Infinite-Particle
Systems 


A preliminary problem in the rigorous study of
nonequilibrium statistical mechanics is to give a
precise sense to the time evolution of infinitely
extended systems. In fact, statistical mechanics deals
with systems composed of a very large number of
bodies (of the order of $10^{23}$) and studies the
properties of such systems which are related to
their large number of degrees of freedom. Mathematically,
this aspect is stressed by introducing the
so-called “thermodynamical limit,” that is, by
defining and analyzing systems with infinitely many degrees
of freedom. For particle systems, the problem can be
formulated in the following way. A phase point of
the system is an infinite sequence $\{(x_i, v_i)\}_{i \in \mathbb{N}}$ of the
positions and velocities of the particles, and its time
evolution is characterized by the solutions of the
Newton equations:

$$m\,\ddot{x}_i(t) = \sum_{j \in \mathbb{N}:\ j \neq i} F(x_i(t) - x_j(t)), \qquad i \in \mathbb{N} \qquad [1]$$

where $m$ is the mass of each particle, $F(x) = -\nabla\Phi(x)$,
and $\Phi$ is a two-body potential. Equation [1] must be
completed by the initial data $\{(x_i(0), v_i(0))\}_{i \in \mathbb{N}}$. The
time evolution of a phase point implies in a natural
way the time evolution of functions on the phase
space, which are the observables to be compared with
experiments.

The existence of a solution to eqn [1] is not 
obvious, because the classical theorem of existence 
and uniqueness for the Cauchy problem of the 
Newton equations depends on the number of 
degrees of freedom of the system. The main 
difficulty is that a priori the time evolution can 
bring infinitely many particles in a bounded region 
within a finite time, so that the right-hand side of 
eqn [1] becomes meaningless. Without any hypoth- 
esis on the initial conditions, this can happen, as 
shown by the following simple example. Consider a
system of free (noninteracting) particles moving
on the real line with initial conditions $x_i = i$, $v_i = -i$,
$i \in \mathbb{N}$. Since $x_i(t) = i(1 - t)$, at time $t = 1$ all the particles
are at the origin. To forbid this “collapse,” we must
restrict the allowed initial conditions, but we cannot
be too drastic. For instance, we could surely avoid
these pathologies by choosing the initial velocities
uniformly bounded and the initial distribution of
particles locally finite. But the set of such data is
exceptional with respect to the Gibbs state (as can be
easily shown using the fact that, at equilibrium, the velocities are
independent identically distributed Gaussian variables).
In conclusion, we must construct the dynamics for initial
conditions which are chosen in a set sufficiently large to
be the support of states of interest from a thermodynamical
point of view.
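
The collapse in the free-particle example can be made concrete with a few lines of Python. This is only an illustration, assuming initial velocities $v_i = -i$ (the choice for which every trajectory $x_i(t) = i(1 - t)$ reaches the origin at $t = 1$) and a finite truncation to `n_particles` particles; the function name is hypothetical.

```python
def count_in_ball(t, n_particles, radius):
    """Count the free particles x_i(t) = i * (1 - t), i = 1..n_particles,
    that lie in the interval |x| < radius at time t."""
    return sum(
        1 for i in range(1, n_particles + 1)
        if abs(i * (1.0 - t)) < radius
    )

if __name__ == "__main__":
    # The local particle number is bounded at t = 0 but grows with the
    # truncation size as t approaches 1.
    for t in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(t, count_in_ball(t, n_particles=1000, radius=1.5))
```

At $t = 1$ the count equals the truncation size, so no bound uniform in the number of particles survives in the infinite system.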

The difficulty of the problem increases with the
spatial dimension $d$, as shown by the following
example. Let the potential $\Phi$ be smooth enough and
short range and assume that, initially, the velocities
and the density are bounded, that is,

$$\sup_i |v_i| < \infty, \qquad \sup_{\mu \in \mathbb{R}^d,\ R > 1} \frac{N(X; \mu, R)}{R^d} < \infty \qquad [2]$$

where $X = \{(x_i, v_i)\}_{i \in \mathbb{N}}$ is the particle configuration and
$N(X; \mu, R)$ is the number of particles in the ball of radius
$R$ centered at $\mu$. If $V(t)$ denotes the modulus of the
maximal velocity carried by the particles during the time
$[0, t]$ and $X(t)$ the evolved configuration, the conservation
of the particle number yields

$$N(X(t); \mu, R_0) \le N(X(0); \mu, R(t)) \le \text{const.}\, R(t)^d \qquad [3]$$

where

$$R(t) = R_0 + \int_0^t ds\, V(s) \qquad [4]$$

On the other hand, $V(s)$ is controlled by the force,
which turns out to be bounded by $\sup_\mu N(X(s); \mu, r)$,
where $r > 0$ is the range of the potential. By virtue of
eqns [3] and [4], we arrive at the integral inequality:

$$R(t) \le R_0 + \text{const.}\, t + \text{const.} \int_0^t ds\, R(s)^d \qquad [5]$$

which is solvable globally in time only if $d = 1$.
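
The role of the dimension can be seen on the equality case of the integral inequality, that is, the differential equation $\dot{R} = c_1 + c_2 R^d$. The sketch below integrates it with explicit Euler steps; the constants, the cap `r_cap` used to detect blow-up, and the function name are assumptions made for illustration only.

```python
import math

def blowup_time(d, r0=1.0, c1=1.0, c2=1.0, t_max=5.0, dt=1e-4, r_cap=1e6):
    """Integrate dR/dt = c1 + c2 * R**d with explicit Euler steps.

    Return the first time at which R exceeds r_cap, or None if this
    does not happen before t_max."""
    r, t = r0, 0.0
    while t < t_max:
        r += dt * (c1 + c2 * r ** d)
        t += dt
        if r > r_cap:
            return t
    return None

if __name__ == "__main__":
    # d = 1: exponential growth, R(t) = 2 exp(t) - 1, no blow-up.
    print("d=1:", blowup_time(1))
    # d = 2: R(t) = tan(t + pi/4), which diverges at t = pi/4.
    print("d=2:", blowup_time(2), "expected about", math.pi / 4)
```

For $d = 1$ the bound grows exponentially but stays finite for all times, whereas for $d \ge 2$ the comparison equation diverges in finite time, so the a priori estimate gives no global control.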

In the case of interest, from a thermodynamical
point of view, we also need to allow fluctuations of
the density and velocities, which add further
difficulties. The existence, uniqueness, and locality
of the motion have been established in dimension $d = 1$ for
almost all relevant interactions (Lanford 1968,
Dobrushin and Fritz 1977), and in dimension $d = 2$
for interactions not too singular at the origin (Fritz
and Dobrushin 1977). (This does not cover, for
instance, the hard-core interactions, where it is still
an open problem to investigate whether the
dynamics evolves toward a close-packing situation.)
Finally, in dimension $d = 3$, the result has recently
been proved only for bounded, non-negative, finite-range
interactions (Caglioti et al. 2000).

We state the result for the three-dimensional case.
Let the interaction $\Phi$ depend only on the mutual
distance, be twice differentiable, positive at the
origin and, for the moment, also non-negative and
compactly supported. We assume that the initial
data have bounded local energies and densities, with
at most logarithmic divergences in velocities and
densities. More precisely, we define

$$Q(X; \mu, R) = \sum_{i \in \mathbb{N}} \chi(|x_i - \mu| < R) \left[ \frac{m v_i^2}{2} + \frac{1}{2} \sum_{j:\ j \neq i} \Phi(x_i - x_j) + 1 \right] \qquad [6]$$

where $\chi(A)$ denotes the characteristic function of the
set $A$, so that eqn [6] gives the energy and density
contained in a ball centered at $\mu$ with radius $R$.
Define

$$Q_\alpha(X) = \sup_\mu \sup_{R:\ R > \varphi_\alpha(\mu)} \frac{Q(X; \mu, R)}{R^3} \qquad [7]$$

where $\alpha > 0$ and

$$\varphi_\alpha(x) = \log^\alpha(e + |x|), \qquad x \in \mathbb{R}^3 \qquad [8]$$

We denote by $\mathcal{X}_\alpha$ the set of the phase points $X$ such
that $Q_\alpha(X) < \infty$. It is possible to prove that for any
$\alpha > 1/3$, $\mathcal{X}_\alpha$ has full measure with respect to any
Gibbs measure.

We define the partial dynamics $t \to X^{(n)}(t)$ as the
solutions to eqn [1] obtained by neglecting all the
particles which are initially outside the ball of radius
$n$ centered at the origin.


Theorem If $X \in \mathcal{X}_\alpha$ there exists a unique flow
$t \to X(t) \in \mathcal{X}_{3\alpha/2}$ satisfying eqn [1] with $X(0) = X$.
Moreover, the partial dynamics locally converges to
$X(t)$ as $n \to \infty$.


The result has been extended to bounded superstable
long-range interactions. The (nontrivial) proof
is based on several steps: we introduce a mollified
version of the local energy and study its evolution in
time under the partial dynamics. Energy
conservation allows us to prove that the local energy
grows at most as the cube of the maximal velocity.
On the other hand, a suitable time average allows us
to control the maximal velocity via the local energy
in an appropriate way. The result is achieved by
letting $n \to \infty$.


Long-Time Behavior 


Existence and locality of the dynamics is only a first, 
preliminary, step. The next and much more subtle 
question concerns the asymptotic (in time) and the 
statistical properties of the motion. Here, the main 
problem is the absence of simple but nontrivial 
models. Let us explain this point by a comparison
with the situation in equilibrium statistical
mechanics. In this case, even the simplest model,
the free-particle system, exhibits all the relevant
thermodynamical properties of real systems away
from the critical regime. In fact, the effort is
often reduced to rigorously proving that the real
systems away from the critical region behave as a
free-particle system. The presence of the interaction
is instead essential to describe phase transitions.

In the case of nonequilibrium statistical mechanics 
there are very few solvable models (free particles, 
chain of oscillators, hard-core system in one dimen- 
sion), and typically they do not catch the essential 
properties of the real systems. For example, let us 
consider a system which is close to equilibrium and 
ask whether it converges to the corresponding Gibbs 
state. Two possible mechanisms usually come together: 
the dispersive properties of the matter (by which 
perturbations “escape” to infinity) and the mixing 
properties (by which perturbations are “spread” and 
disappear). The former is present also in the free-particle
system, being responsible for its ergodic properties. The
latter requires a deep analysis of the dynamics of
interacting-particle systems and is too difficult to be
analyzed except in rare cases.

We just mention the case of systems with 
instantaneous interaction, which are simple enough 
to be studied but nevertheless exhibit a nontrivial 
long-time behavior. We recall in particular the
famous Sinai billiard: a particle moving freely on
a two-dimensional torus except for elastic collisions
with the boundary of a convex obstacle. As proved
by Sinai (1970), this system has strong ergodic
properties. Sinai's billiard can be proved to be
equivalent to the “Lorentz gas” in which the
obstacles are placed periodically.
Bunimovich and Sinai (1981) proved that when 
the obstacles are close enough to each other, the 
diffusive (weak) limit of the particle motion is the 
Wiener process. This remarkable result gives a 
rigorous derivation of Brownian motion from a 
Hamiltonian system. 

More recently, similar questions have been inves- 
tigated in the case of a charged particle subject to a 
constant electric field and interacting with a medium 
described by a particle system. Several rigorous 
results have been obtained on this subject. We only 
recall those by Boldrighini and Soloveitchik (1995, 
1997). In the context of a simplified model, the 
asymptotic motion of the charged particle is 
described as a drift plus a Brownian motion, and 
the Einstein relation between the drift and the 
diffusion constant is established. 


Mean-Field Limit 


The validity of any model is related to some 
approximation limit. In statistical mechanics, we 


encounter one of the most important ones, the
“thermodynamical limit,” used to stress the effect of a
large number of particles. Here we briefly discuss the
“mean-field limit.” For the kinetic, Boltzmann-Grad
limit, see Boltzmann Equation (Classical and Quantum)
and Kinetic Equations.

We consider N particles of mass m mutually 
interacting via the force F. The equations of motion are 


$$m\,\ddot{x}_i(t) = \sum_{j = 1, \ldots, N:\ j \neq i} F(x_i(t) - x_j(t)), \qquad (x_i(0), \dot{x}_i(0)) = (x_i, v_i), \qquad i = 1, \ldots, N \qquad [9]$$


We consider a system with $N$ very large, the mass $m$
of each particle very small, and the interaction very
weak. An interesting situation arises when the
quantities $N$, $m$, and $F$ are linked by the relations

$$m = \frac{M}{N}, \qquad F = \frac{G}{N^2} \qquad [10]$$

for some function $G$. Of course, $M$ is the total mass
of the system.

We are interested in investigating the limit $N \to \infty$.
We assume that the initial data are chosen in such a way
that the empirical measure $N^{-1} \sum_{i=1}^N \delta_{x_i} \delta_{v_i}$ weakly
converges (as $N \to \infty$) to the absolutely continuous
measure $f_0(x, v)\,dx\,dv$ with some smooth density
$f_0(x, v)$. We ask whether at some positive time $t > 0$
the empirical measure $N^{-1} \sum_{i=1}^N \delta_{x_i(t)} \delta_{v_i(t)}$ weakly converges
to $f(x, v, t)\,dx\,dv$ with a density $f(x, v, t)$
satisfying some limiting evolution equation.

Formally, it is easy to find this equation: by the 
Liouville theorem, a continuous medium in which 
each point moves under the action of an acceleration 
field behaves as an incompressible fluid. The 
continuity equation becomes 


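$$\partial_t f(x, v, t) + v \cdot \nabla_x f(x, v, t) + E \cdot \nabla_v f(x, v, t) = 0, \qquad f(x, v, 0) = f_0(x, v) \qquad [11]$$

where

$$E(x, t) = \int_{\mathbb{R}^3} dy\, G(x - y)\, \rho(y, t) \qquad [12]$$

$$\rho(x, t) = \int_{\mathbb{R}^3} dv\, f(x, v, t) \qquad [13]$$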

This equation can be studied by following the
characteristics, for which it suffices to look at the
pair of functions

$$(x, v) \to (X(x, v, t), V(x, v, t)), \qquad f_0(x, v) \to f(x, v, t)$$

where $(x, v) \in \mathbb{R}^3 \times \mathbb{R}^3$ and $t \in \mathbb{R}$, solutions of

$$\begin{aligned}
\dot{X}(x, v, t) &= V(x, v, t), \qquad \dot{V}(x, v, t) = E(X(x, v, t), t) \\
X(x, v, 0) &= x, \qquad V(x, v, 0) = v \\
f(X(x, v, t), V(x, v, t), t) &= f_0(x, v)
\end{aligned} \qquad [14]$$


This is a weak formulation of eqn [11], in the sense
that any smooth solution to eqn [11] satisfies eqn
[14], but this last equation is meaningful also for
nonsmooth functions. It is a weak version of the
Vlasov equation and its measure solutions will play
an important role in the sequel.

Equations [11]-[14] are called Vlasov equations, 
after Vlasov, who first introduced them in plasma 
physics. They have a Hamiltonian structure and 
conserve several quantities: the total mass, the total 
energy, the Liouville measure dx dv, and in general 
each moment of this measure. 
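
The mean-field scaling $m = M/N$, $F = G/N^2$ can be explored with a direct particle simulation, since the acceleration of particle $i$ is then $(MN)^{-1} \sum_{j \neq i} G(x_i - x_j)$. The following is a minimal sketch in Python (NumPy), not the construction used in the rigorous results: it works in one space dimension for brevity, uses a hypothetical bounded odd interaction $G(x) = -x e^{-x^2}$, and integrates with the velocity-Verlet scheme; all function names are illustrative.

```python
import numpy as np

def G(x):
    """Hypothetical bounded, odd pair interaction (an assumption of this sketch)."""
    return -x * np.exp(-x * x)

def mean_field_accel(x, M=1.0):
    """Acceleration (1/m) * sum_{j != i} F(x_i - x_j) with m = M/N, F = G/N**2."""
    n = x.size
    diff = x[:, None] - x[None, :]          # diff[i, j] = x_i - x_j
    # The j = i term contributes nothing because G(0) = 0.
    return G(diff).sum(axis=1) / (M * n)

def evolve(x, v, t_final, dt=1e-2, M=1.0):
    """Velocity-Verlet integration of the N-particle system (one dimension)."""
    x, v = x.copy(), v.copy()
    a = mean_field_accel(x, M)
    for _ in range(int(round(t_final / dt))):
        v += 0.5 * dt * a
        x += dt * v
        a = mean_field_accel(x, M)
        v += 0.5 * dt * a
    return x, v

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x0 = rng.normal(size=200)               # samples of a smooth density f_0
    v0 = rng.normal(size=200)
    x1, v1 = evolve(x0, v0, t_final=1.0)
    # Total momentum is conserved because G is odd.
    print(v0.mean(), v1.mean())
```

Repeating the run with larger $N$ and comparing low-order moments of the empirical measure gives a rough numerical impression of the convergence discussed above, although it proves nothing about the limit.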

The existence and uniqueness of the solutions 
has been studied in many papers. Two cases have 
to be considered, depending on whether the total 
mass 


$$M = \int_{\mathbb{R}^3 \times \mathbb{R}^3} dx\, dv\, f_0(x, v) \qquad [15]$$


is finite or not. We start with the first case. If the 
interaction G is bounded, the analysis is easy. On 
the other hand, in plasma physics one deals with 
the Coulomb interaction, which is singular at the 
origin. In this case (where eqn [11] is usually 
called the Vlasov-Poisson equation), existence and 
uniqueness can still be proved, but it is not 
straightforward, especially in dimension d=3. 
The case with the complete Lorentz force, also 
taking into account the relativistic effect, is much 
more difficult. 

For infinite total mass, the problem has been 
solved recently in three (or lower) dimensions for 
bounded, non-negative, finite-range interactions, 
and in two dimensions for singular Helmholtz 
interactions. 

Another way to relate the Vlasov equation with
the particle systems is to consider the usual
transition from microscopic to macroscopic evolutions
based on a separation between microscopic
and macroscopic scales. Moreover, the force
between the particles is due to a long-range pair
interaction of the Kac type, in which the range
parameter tends to infinity as the ratio $\varepsilon^{-1}$
between the macro and the micro spatial scale:
$F(x_i - x_j) = \varepsilon^{2d+1} G(\varepsilon x_i - \varepsilon x_j)$. Finally, the mass of
the particles is proportional to $\varepsilon^d$: $m = \varepsilon^d$. After
rescaling space and time by a factor $\varepsilon$, in the
macroscopic variables $(\tau, r) = (\varepsilon t, \varepsilon x)$, the equations
of motion (eqn [9]) become

$$\frac{d^2 r_i}{d\tau^2} = \varepsilon^d \sum_{j:\ j \neq i} G(r_i - r_j) \qquad [16]$$

Then eqn [14] is the limiting equation as $\varepsilon \to 0$.
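
The rescaled equations of motion follow from a short computation. The steps below are a sketch, assuming the scaling $F(x) = \varepsilon^{2d+1} G(\varepsilon x)$, $m = \varepsilon^d$ and the macroscopic variables $(\tau, r) = (\varepsilon t, \varepsilon x)$:

```latex
\begin{aligned}
m\,\frac{d^2 x_i}{dt^2} = \sum_{j \neq i} F(x_i - x_j)
  \;&\Longrightarrow\;
  \varepsilon^{d}\,\frac{d^2 x_i}{dt^2}
  = \varepsilon^{2d+1} \sum_{j \neq i} G(\varepsilon x_i - \varepsilon x_j), \\
x_i = \frac{r_i}{\varepsilon}, \quad t = \frac{\tau}{\varepsilon}
  \;&\Longrightarrow\;
  \frac{d^2 x_i}{dt^2} = \varepsilon\,\frac{d^2 r_i}{d\tau^2}, \\
\varepsilon^{d+1}\,\frac{d^2 r_i}{d\tau^2}
  = \varepsilon^{2d+1} \sum_{j \neq i} G(r_i - r_j)
  \;&\Longrightarrow\;
  \frac{d^2 r_i}{d\tau^2} = \varepsilon^{d} \sum_{j \neq i} G(r_i - r_j).
\end{aligned}
```

With of the order of $\varepsilon^{-d}$ particles per unit macroscopic volume, the factor $\varepsilon^d$ turns the sum into a Riemann-type sum, which is how the field $\int G(r - r')\,\rho(r', \tau)\,dr'$ of the Vlasov equation is expected to emerge.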


Other Models 


We mention another model of larger interest. We 
introduce it in the simplest formulation, leaving 
possible generalizations to the reader. 

We consider an infinite chain of anharmonic 
oscillators, with Hamiltonian H given by 


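$$H(q, p) = \sum_{i \in \mathbb{Z}} \left[ \frac{p_i^2}{2} + a\,q_i^4 + b \sum_{j:\ |i - j| = 1} (q_i - q_j)^2 + c\,q_i^2 + d \right] \qquad [17]$$


where $q_i, p_i \in \mathbb{R}$, $a \ge 0$, $b, c, d > 0$.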

When a= 0, it reduces to the well-known chain of 
harmonic oscillators, which is integrable and widely 
studied in the literature. 

The time evolution defined by the Hamiltonian in 
eqn [17] exists and it is unique for initial data 
chosen in a set large enough to be the support of 
any reasonable thermodynamic (equilibrium or 
nonequilibrium) state. This can be achieved by 
proving integral inequalities for the “Lyapunov 
function" 


p; 4 

L(q.p) E- BE + aq +d] F 

It is interesting to note that uniqueness holds only in
a class of data such that the position of the $i$th
oscillator does not increase too much as $|i| \to \infty$.
For example, besides the stationary solution
$q_i(t) = 0$, $i \in \mathbb{Z}$, we can construct a different solution
corresponding to the same initial conditions
$q_i(0) = 0$, $p_i(0) = 0$, $i \in \mathbb{Z}$. In fact, by imposing
$q_0(t) = t$ and $q_i(t) = q_{-i}(t)$, we can solve recursively
the equations of motion and obtain a nonzero
solution $q_i(t)$, which however increases superexponentially
as $|i| \to \infty$.

The Hamiltonian dynamical systems (classical or 
quantum) are surely quite faithful descriptions of 
real systems, but they are too difficult to study. 
Mainly it is not known how to prove good 
dynamical mixing for deterministic evolutions with 
many degrees of freedom. Therefore, stochastic 
evolutions have been introduced to model the real 
systems. More precisely, one renounces a full 
description of the microscopic dynamics, introducing
simplified models where the effects of the
“hidden degrees of freedom” are taken into account
by adding suitable stochastic forces. Many useful 
results have been obtained, which show that these 
stochastic model systems exhibit a macroscopic 
behavior much closer to that observed in nature. 
The main criticism concerns the role of stochasticity, 
which in these models is introduced ab initio. In 
other words, if one believes that the statistical 
properties of the deterministic motion on the small 
scale determine the collective behavior of systems 
with many degrees of freedom, then these properties 
do have to be proved for a true understanding of 
nonequilibrium phenomena. 


See also: Adiabatic Piston; Boltzmann Equation 
(Classical and Quantum); Fourier Law; Kinetic Equations; 
Nonequilibrium Statistical Mechanics (Stationary): 
Overview; Nonequilibrium Statistical Mechanics: 
Interaction between Theory and Numerical Simulations. 
Introduction


Nonequilibrium statistical mechanics concerns a 
wide range of fundamental problems and applica- 
tions. Perturbative methods are quite effective for 
approaching weakly nonlinear problems, usually 
relying upon effective coarse-grained equations. 
The attempt to obtain a microscopic description
of genuinely nonlinear problems demands the combined
use of theoretical methods and numerical
simulations. The prototypical case is the numerical
experiment performed by Fermi, Pasta, and Ulam
in 1955. As we discuss in the following section, the
main questions which had inspired this experiment
remained without an answer for a long time,
while new puzzling problems emerged. Despite its
apparent failure, the Fermi-Pasta-Ulam (FPU)
experiment represents a remarkable example in 
the history of science of how a good guess may be 
the source of many fruitful achievements. Part of 
them are discussed in the section on energy 
relaxation in nonlinear chains, where we summar- 
ize the present understanding of the very slow 
relaxation mechanism, characterizing the dynamics 
of nonlinear chains of oscillators, like the FPU 
model, at low energies. Next, we report one further 
success of the interplay between theory and 
numerics, that is, the formulation of a generalized 
fluctuation-dissipation relation for stationary pro- 
cesses. Finally, we survey the main achievements 
concerning the study of anomalous transport 
properties in low-dimensional systems. In particu- 
lar, we focus our attention on the heat conduction 
in nonlinear lattices. In the absence of a general 
hydrodynamic theory, computer simulations and 
theoretical arguments have, in this case too, greatly 
contributed to clarifying the general scenario, 
unveiling surprising aspects which, up to a few years 
ago, were completely unexpected. 


The Numerical Experiment by Fermi, 
Pasta, and Ulam 


The impressive progress of electronic technology 
during World War II made possible the design of the 
first digital computers. The equally impressive 
budgets for their production and maintenance 
could only be justified by their employment in 
classified military research. Nonetheless, some of 
the outstanding scientists involved in this 
research, like E Fermi, immediately realized the 
great potential of these new machines for tackling 
fundamental problems in basic science as well. 

Fermi had in mind a crucial and still open 
physical problem. In 1914 the Dutch physicist 
P Debye had suggested that the finiteness of thermal 
conductivity in crystals should be due to the 
nonlinear forces acting among the constituent 
atoms. Forty years later a microscopic theory of 
transport processes, including nonlinear effects, was 
still lacking. Actually, technical difficulties pre- 
vented a theoretical approach based on analytic 
methods. Numerical integration of the equations of 
motion by a digital machine appeared to Fermi as 
an effective way of tackling this problem. In 
collaboration with the mathematician S Ulam and the 
physicist J Pasta, Fermi used MANIAC I (a prototype 
digital computer installed at Los Alamos National 
Laboratory, USA) to integrate the dynamical 
equations of the simplest mathematical model of 
an anharmonic crystal: a chain of N harmonic oscilla- 
tors, coupled by nonlinear forces. Its Hamiltonian 
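reads 


H = \sum_{i=1}^{N} \left[ \frac{p_i^2}{2} + \frac{\omega^2}{2}\,(q_{i+1} - q_i)^2 + \frac{\alpha}{3}\,(q_{i+1} - q_i)^3 + \frac{\beta}{4}\,(q_{i+1} - q_i)^4 \right]    [1]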


where ω is the harmonic frequency, while α and β 
are the positive coupling constants of the nonlinear 
terms. The integer space index i labels the oscillators 
along the chain, while q_i and p_i are the displacement 
from the equilibrium position and the momentum of 
the ith oscillator, respectively. The potential energy 
is the general form taken by any nonlinear interac- 
tion potential, when expanded, up to fourth order, 
around its equilibrium position. This choice guaran- 
tees the boundedness of trajectories for any finite 
energy. 


Accordingly, the model contains the minimal 
basic ingredients, needed for testing the conjecture 
about the finiteness of thermal conductivity. 

The equations of motion 


\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}    [2]

were integrated numerically by an algorithm, where 
space and time derivatives were approximated by 
proper finite-difference expressions. 
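
As an illustration of how eqns [2] can be integrated numerically, the following minimal Python sketch applies the velocity-Verlet scheme to the chain [1] with fixed ends. It is not a reconstruction of the original MANIAC code: the integrator, the function names, the unit mass, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def fpu_forces(q, omega=1.0, alpha=0.0, beta=1.0):
    """Force on each oscillator of a chain with fixed ends (q_0 = q_{N+1} = 0)."""
    qp = np.concatenate(([0.0], q, [0.0]))            # add the two immobile walls
    r = np.diff(qp)                                   # bond stretches q_{i+1} - q_i
    dv = omega**2 * r + alpha * r**2 + beta * r**3    # V'(r) for every bond
    return dv[1:] - dv[:-1]                           # force on i: V'(r_i) - V'(r_{i-1})

def fpu_energy(q, p, omega=1.0, alpha=0.0, beta=1.0):
    """Total energy: kinetic part plus the bond potentials of eqn [1]."""
    qp = np.concatenate(([0.0], q, [0.0]))
    r = np.diff(qp)
    pot = 0.5 * omega**2 * r**2 + alpha * r**3 / 3.0 + beta * r**4 / 4.0
    return 0.5 * np.sum(p**2) + np.sum(pot)

def velocity_verlet(q, p, dt, n_steps, **params):
    """Advance (q, p) by n_steps of size dt (unit masses assumed)."""
    q, p = q.copy(), p.copy()
    f = fpu_forces(q, **params)
    for _ in range(n_steps):
        p += 0.5 * dt * f          # half kick
        q += dt * p                # drift
        f = fpu_forces(q, **params)
        p += 0.5 * dt * f          # half kick
    return q, p

# Illustrative run: energy initially in the longest-wavelength mode.
N = 32
sites = np.arange(1, N + 1)
q0 = np.sin(np.pi * sites / (N + 1))
p0 = np.zeros(N)
e0 = fpu_energy(q0, p0)
q1, p1 = velocity_verlet(q0, p0, dt=0.05, n_steps=2000)
e1 = fpu_energy(q1, p1)
```

For a sufficiently small time step, the total energy should stay constant up to small bounded oscillations, which provides a simple check on the integration.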

The choice of the initial conditions was motivated 
by a further basic question that concerned Fermi and 
his collaborators. They also aimed at verifying a 
common belief that had never been proved rigorously: 
in an isolated mechanical system with many 
degrees of freedom (i.e., made of a large number of 
oscillators), a generic nonlinear interaction among 
them should eventually yield equilibrium through 
“thermalization” of the energy. On the basis of 
physical intuition, nobody would object to this 
expectation if the mechanical system started 
its evolution from an initial state very close to 
thermodynamic equilibrium. Nonetheless, the same 
should be observed for an initial state 
in which energy is supplied to a small subset of 
oscillatory modes of the crystal. Unlike a 
finite system of linear oscillators, where each 
initially excited mode keeps its energy constant, 
nonlinear terms should make the energy flow 
towards all oscillatory modes, until thermal equili- 
brium is eventually reached. Thermalization corre- 
sponds to energy equipartition among all the modes. 
This statement has to be interpreted in a statistical 
sense: the time averages of the energies contained in 
the modes converge to the same constant value. But 
if this was the case, one further fundamental aspect 
concerning the evolution towards thermodynamic 
equilibrium could be checked. In the formulation of 
his transport equation, L Boltzmann had conjec- 
tured that thermodynamic irreversibility can emerge 
from microscopic reversible dynamics (which is 
the case of eqns [2]). The paradoxical implication 
of Boltzmann’s conjecture was pointed out by 
H Poincaré, who had proved that any isolated 
Hamiltonian system necessarily evolves towards an 
almost-recurrent dynamics. This is manifestly 
incompatible with the second law of thermody- 
namics, which implies that thermodynamic systems, 
in the absence of a supplied energy flux, have to 
evolve irreversibly towards their equilibrium state. 
In this perspective, the FPU numerical experiment 
was intended to test also if and how equilibrium is 
approached by a relatively large number of nonlinearly 
coupled oscillators, obeying the classical 
laws of Newtonian mechanics. Furthermore, the 
measurement of the time interval needed for 
approaching the equilibrium state, that is, the 
“relaxation time” of the chain of oscillators, would 
have provided an indirect determination of thermal 
conductivity. In fact, according to elementary kinetic 
theory, the relaxation time, τ_r, represents an 
estimate of the timescale of energy exchanges inside 
the crystal: Debye's argument predicts that thermal 
conductivity κ is proportional to the specific heat at 
constant volume of the crystal, C_v, and inversely 
proportional to τ_r; in formulas, κ ∝ C_v/τ_r. 

Fermi, Pasta, and Ulam considered relatively short 
chains, up to 64 oscillators — a size that already 
challenged the limits of the computational power of 
MANIAC I. They imposed fixed boundary conditions 
(i.e., the particles at the chain boundaries 
interact with infinite-mass walls) and the energy was 
initially stored just in one of the long-wavelength 
oscillatory modes. 
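
The mode content of such an initial condition can be made explicit with a short Python sketch. It assumes fixed ends and unit masses, for which the normal modes are sine waves with frequencies 2ω sin(πk/(2(N+1))); this differs from the periodic-chain expression used later in the text. The function name and the chain length are illustrative assumptions.

```python
import numpy as np

def mode_energies(q, p, omega=1.0):
    """Harmonic energies E_k of the normal modes of a fixed-end chain."""
    n = len(q)
    idx = np.arange(1, n + 1)
    # Orthonormal sine transform: rows are the normal-mode shapes.
    s = np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * np.outer(idx, idx) / (n + 1))
    big_q, big_p = s @ q, s @ p
    w = 2.0 * omega * np.sin(np.pi * idx / (2.0 * (n + 1)))
    return 0.5 * (big_p**2 + (w * big_q)**2)

# Energy stored in the longest-wavelength mode only (illustrative values).
N = 32
sites = np.arange(1, N + 1)
q0 = np.sin(np.pi * sites / (N + 1))
p0 = np.zeros(N)
E = mode_energies(q0, p0)
fraction_in_first_mode = E[0] / E.sum()
```

Monitoring these mode energies along a trajectory is the natural way to look for the energy sharing, or the lack of it, discussed in the text.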

A very surprising and unexpected scenario 
showed up. Contrary to any intuition, the energy 
did not flow to the higher modes, but was 
exchanged only among a small number of long- 
wavelength modes, before flowing back almost 
exactly to the initial state, thus yielding a recurrent 
behavior. 

Although nonlinearities were at work, neither a 
tendency towards thermalization, nor a mixing rate 
of the energy could be identified. The dynamics 
exhibited regular features very close to those of an 
integrable system. 

Fermi guessed that they were facing a very 
important result, but he was also quite disappointed 
by the difficulty of finding a convincing explanation. 
Lacking one, he decided not to publish the 
results in a scientific journal, and they remained 
confined to a Los Alamos report for almost a 
decade. In fact, he died in 1954, the year before 
the report was issued. 

The results were finally published in 1965, in a 
volume containing his collected papers (Fermi et al. 
1965), and they immediately aroused renewed 
interest in the scientific community. Despite its 
failure to answer all the questions that had been 
raised, the FPU numerical experiment represents a 
crucial scientific achievement, which led to 
many subsequent advances. Its implications 
for nonequilibrium physics are discussed at length 
in the following sections. Here, we 
conclude by mentioning the important developments, 
inspired by the FPU experiment, that led to 
the discovery of solitons by Zabusky and Kruskal 
in 1965. 


Slow and Fast Energy Relaxation 
in Nonlinear Chains 


The results of the FPU numerical experiment 
indicate that the energy initially supplied to long- 
wavelength oscillatory (Fourier) modes remains 
localized for a very long time in a small subset of 
long-wavelength modes. This time can be much 
larger than any typical timescale of the model 
(e.g., ω^{-1}, the inverse of the harmonic frequency 
in [1]). An explanation of this apparently bizarre 
scenario has been sought by combining theoretical 
approaches with numerical studies. A complete 
account of the many contributions in this direction 
being beyond the scope of this text, we shall 
summarize the two main lines along which this 
problem has been considered. 


The Resonance-Overlap Criterion 


The almost-recurrent behavior of single-mode exci- 
tations studied in the FPU experiment can be 
explained by the resonance-overlap criterion, intro- 
duced in 1959 by the Russian scientist B Chirikov. 
Moreover, this criterion provides a quantitative 
estimate of the value of the energy density, above 
which the regular motion observed in the FPU 
experiment should be definitely lost. 

In order to provide the reader with an illustration 
of this criterion, we have to introduce a few simple 
mathematical ingredients. 

The Hamiltonian [1] can be rewritten in terms of 
linear normal Fourier coordinates, (Q_k(t), P_k(t)), as 
follows: 


H = \frac{1}{2}\sum_{k} \left( P_k^2 + \omega_k^2 Q_k^2 \right) + \alpha V_3(\{Q_k\}) + \beta V_4(\{Q_k\})    [3]


Here, we have used the shorthand notation V_n({Q_k}) 
for the lengthy explicit expressions, in the new set of 
coordinates, of the nonlinear potentials of [1]. 

Without loss of generality, we can impose 
periodic boundary conditions on the FPU chain: the 
frequency of the kth normal mode is then given by the 
expression ω_k = 2 sin(πk/N), with the harmonic 
frequency ω set to 1. The coupling constants 
α and β control the energy exchange among the 
normal modes, due to nonlinear interactions. 

For brevity, we give here only a sketch 
of Chirikov’s criterion for the FPU β-model (this 
model amounts to taking α = 0 in [3], i.e., to excluding 
the cubic part of the nonlinear potential). 

By making reference to the initial conditions of 
the FPU experiment, we can consider a single 
excited mode, so that the Hamiltonian [3] can be 
approximated by the expression in action-angle 
variables 


H = H_0 + \beta H_1 \approx \omega_k J_k + \frac{\beta}{2N}\,(\omega_k J_k)^2    [4]


Here, J_k = ω_k Q_k^2 is the action variable. In practice, 
this amounts to approximating the original Hamiltonian 
by the sum of the harmonic and nonlinear self-energy 
of the initially excited mode. In this framework, 
H_0 and H_1 are the unperturbed (integrable) 
Hamiltonian and the perturbation, respectively. 
Indeed, if the energy is initially attributed to mode 
k, the following relations hold: ω_k J_k ≈ H_0 ≈ E. From 
the approximate Hamiltonian [4], one can compute 
the nonlinear correction to the linear frequency 
ω_k, giving the renormalized frequency ω_k^r: 


\omega_k^r = \frac{\partial H}{\partial J_k} = \omega_k + \frac{\beta}{N}\,\omega_k^2 J_k = \omega_k + \Omega_k    [5]


For N ≫ k one has 


\Omega_k \approx \frac{\beta H_0 k}{N^2}    [6]


The distance between two primary resonances, in 
the harmonic limit, is given by the expression 


\Delta\omega_k = \omega_{k+1} - \omega_k \approx N^{-1}    [7]


Consistently with [6], the last approximation is valid 
only for small wave numbers (k ≪ N), that is, 
long-wavelength modes. 

The “resonance overlap” criterion amounts to 
comparing this distance with the frequency shift Ω_k. In 
formulas: 


\Omega_k \approx \Delta\omega_k    [8]


This equation also yields an estimate of 
the “critical” energy density, ε_c, above which 
sizeable chaotic regions develop and fast diffusion 
takes place in phase space: 


\varepsilon_c = \left(\frac{H_0}{N}\right)_c \approx \frac{1}{\beta k}    [9]


with k = O(1) ≪ N. Below ε_c, primary resonances 
are weakly coupled and determine a slow relaxation 
process towards energy equipartition. Above ε_c, due 
to the overlap of primary resonances, fast relaxation to 
equipartition sets in (Izrailev and Chirikov 1966). 
This prediction was later verified numerically by 
Chirikov et al. (1973). The presence of a critical 
energy density can be tested by measuring the 
evolution of the finite-time-averaged quantity 
Ē_k(t) = t^{-1} ∫_0^t E_k(τ) dτ, where E_k = (P_k^2 + ω_k^2 Q_k^2)/2 is 
the harmonic energy of the kth mode. For energy 
densities much smaller than ε_c, Ē_k(t) exhibits an 
extremely slow relaxation towards the equipartition 
condition, Ē_k = constant. Conversely, for ε > ε_c such 
a condition is rapidly approached on a relatively 
short timescale. The slow relaxation below ε_c can be 
traced back to the overlap of higher-order resonances: 
its typical timescale has been found to be 
inversely proportional to a power of the energy 
density (Shepelyansky 1997). 
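
The order-of-magnitude estimate [9] can be checked against the exact periodic-chain dispersion with a few lines of Python. This is only a sketch of the argument given above: the function names and the numerical values are illustrative assumptions, and ω_k J_k is replaced by the total energy E, as in the text.

```python
import math

def omega(k, n):
    """Harmonic frequency of mode k for a periodic chain of n sites (omega = 1)."""
    return 2.0 * math.sin(math.pi * k / n)

def overlap_parameter(beta, energy, k, n):
    """Ratio Omega_k / Delta omega_k; values of order 1 signal resonance overlap."""
    w = omega(k, n)
    shift = beta * w * energy / n       # Omega_k = (beta/N) w_k^2 J_k with w_k J_k ~ E
    spacing = omega(k + 1, n) - w       # Delta omega_k = w_{k+1} - w_k
    return shift / spacing

def critical_energy_density(beta, k):
    """Estimate [9]: epsilon_c ~ 1 / (beta k)."""
    return 1.0 / (beta * k)

# At the estimated critical energy density the ratio should be close to 1.
beta, k, n = 0.1, 1, 1024
ratio = overlap_parameter(beta, n * critical_energy_density(beta, k), k, n)
```

For long-wavelength modes (k ≪ N) the numerical factors in the shift and in the spacing cancel, which is why the simple estimate [9] reproduces the condition [8] so closely.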


Energy-Equipartition Thresholds 


The first paper reporting evidence of the existence of 
an energy threshold in chains of coupled anharmonic 
oscillators had already been published by 
Bocchieri et al. (1970). This pioneering numerical 
experiment concerned a chain of oscillators coupled 
through a Lennard-Jones interatomic potential. The 
Italian group observed an energy threshold, separat- 
ing a high-energy thermalized regime from a regular 
dynamics regime at low energies (like the one 
observed by Fermi, Pasta, and Ulam). The main 
point raised by this experiment concerns the 
consequences on ergodic theory: the ordered motion 
observed in the low-energy regime seems to violate 
ergodicity, although the model is known to be 
chaotic at any energy. 

This is quite a delicate and widely debated issue 
because of its statistical implications. Indeed, as 
mentioned in the previous section, Fermi, Pasta, 
and Ulam also expected that a nonlinear dynamical 
system with a large number of degrees of 
freedom should naturally evolve towards equilibrium. 
Further confirmation of the seminal paper 
by Bocchieri and co-workers came from more 
refined numerical experiments, showing that, for 
sufficiently high energies, regular behaviors disappear, 
while equipartition among the Fourier modes 
sets in rapidly. Later on, the presence of the energy 
threshold was characterized by introducing an 
appropriate entropy, S = −Σ_k p_k ln p_k with p_k = 
E_k(t)/E, which counts the number of effective 
Fourier modes involved in the dynamics: at 
equipartition, this entropy is maximal (Livi et al. 1985). 

Nowadays, we know that the approach to 
equipartition below and above the energy threshold 
is a matter of timescales, which turn out to be very 
different in the two regimes. For instance, the 
analytic estimate of the maximum Lyapunov exponent 
λ of the FPU β-model (Casetti et al. 1995) has 
clearly shown that there is a threshold value 
of the energy density, ε_T, at which its dependence on 
ε changes drastically: 


\lambda(\varepsilon) \sim \begin{cases} \varepsilon^{1/4} & \text{if } \varepsilon \gg \varepsilon_T; \\ \varepsilon^{2} & \text{if } \varepsilon \ll \varepsilon_T. \end{cases}    [10]

This implies that the typical relaxation time, that is, 
λ^{-1}, may become exceedingly large for very small 
values of ε below ε_T. It is worth stressing that this 
result holds in the thermodynamic limit, thus indicating 
that the presence of ε_T is statistically relevant. 

A more controversial scenario emerges from the 
studies of the relaxation dynamics for specific 
classes of initial conditions. When a few 
long-wavelength modes are initially excited, regular 
motion may persist over times much longer than 
λ^{-1} (De Luca et al. 1995). The excitation of 
short-wavelength modes yields an even more complex 
scenario: solitary wave dynamics is observed, 
followed by slow relaxation to equipartition. It is also 
worth mentioning that some regular features of the 
dynamics persist even at high energies. As we shall 
discuss in the section “Heat Transport,” such 
regularities still play a crucial role in determining 
energy transport mechanisms, although they do not 
affect significantly the equilibrium statistical proper- 
ties of the FPU model at high energies. 


The Generalized Fluctuation-Dissipation 
Theorem 


Another fundamental problem of nonequilibrium 
statistical mechanics concerns the possibility of 
establishing a fluctuation-dissipation theorem, gen- 
eralizing the relation valid for equilibrium condi- 
tions. In fact, on this basis one might develop a 
large-deviation formalism, aiming at the identifica- 
tion of an explicit nonequilibrium statistical mea- 
sure, analogous to the equilibrium Boltzmann-Gibbs 
measure. Recently, significant progress has been 
made in this direction. 

A crucial numerical experiment, which drew 
attention to the problem of formulating a 
generalized fluctuation-dissipation relation for 
stationary flows, was performed at the beginning of the 
1990s (Evans et al. 1993). Stationary conditions for 
momentum transport were obtained in the shear 
flow of a fluid contained between moving walls. The 
reversibility of the microscopic dynamics yields the 
heuristic fluctuation relation: 


\frac{1}{t}\,\ln\frac{\Pr(R_t = A)}{\Pr(R_t = -A)} = A    [11]


where Pr(R_t = A) is the probability that the average 
entropy production rate, R_t, along a trajectory 
segment of duration t, takes the value A. For 
sufficiently large values of t, this relation was 
confirmed by numerical analysis. 
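
As a toy consistency check of a relation of the form [11], one may consider a Gaussian distribution of R_t with mean m and variance s^2. The logarithm of the ratio of probabilities is then 2mA/s^2, so that [11] holds exactly when s^2 = 2m/t. The Python sketch below verifies this; the Gaussian form, the function name, and the numerical values are assumptions made for illustration and are unrelated to the shear-flow simulation.

```python
import math

def log_ratio_rate(a, t, mean, var):
    """(1/t) ln[Pr(R_t = a) / Pr(R_t = -a)] for a Gaussian R_t."""
    def log_pdf(x):
        return -0.5 * (x - mean) ** 2 / var - 0.5 * math.log(2.0 * math.pi * var)
    return (log_pdf(a) - log_pdf(-a)) / t

# Illustrative parameters: variance tied to the mean as s^2 = 2 m / t.
t, mean = 50.0, 0.3
var = 2.0 * mean / t
lhs_values = [log_ratio_rate(a, t, mean, var) for a in (0.1, 0.5, 1.0)]
```

Any other relation between variance and mean would give a straight line in A with a slope different from 1, which is the kind of deviation a numerical test of [11] looks for.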
Gallavotti and Cohen (1995a,b) proved a theorem 
meant to put eqn [11], that is, the proposed extension 
to nonequilibrium steady states of the equilibrium 
fluctuation-dissipation theorem, on a rigorous 
mathematical basis. This theorem 
concerns the phase-space contraction rate of the 
dynamics, which equals the entropy production 
rate in the case of particle systems whose internal 
energy is a constant of the motion. The proof of 
the theorem is based on restrictive hypotheses, 
which include the existence of a nonvanishing 
average phase-space contraction rate, the 
time-reversal invariance of the dynamics, and a strong 
form of chaos (the dynamics is assumed to be of 
the Anosov type, that is, smooth and uniformly 
hyperbolic). Nonetheless, the prediction of the 
theorem, that is, 


\frac{1}{t}\,\ln\frac{\Pi_t(p)}{\Pi_t(-p)} = D\,\langle\sigma\rangle\,p    [12]


is expected to hold much more generally. Here Π_t(p) 
is the probability that a fluctuation variable takes 
the value p. The theorem proved by Gallavotti and 
Cohen states that Π_t(p) has to satisfy the large 
deviation relation [12], where ⟨σ⟩ is the average 
phase-space contraction rate over a trajectory 
segment of duration t and D is a suitable constant. It 
must be pointed out that the rigorous derivation of 
this relation provided strong motivations for inves- 
tigating its validity and generality in many other 
contexts. The first numerical experiment, where 
almost all the constituent hypotheses of the Gallavotti- 
Cohen theorem were satisfied, was performed by 
Bonetto et al. (1997). They studied a Lorentz gas 
(massive pointlike noninteracting particles bouncing 
elastically on circular scatterers placed on a 
regular lattice without free horizon) of charged 
particles moving in a uniform external electric 
field. Numerical simulations were found to be in 
very good agreement with [11] and [12] (which, in 
this case, refer to the same quantity). One further test 
of the fluctuation-dissipation relation was later 
performed for a different setup (Lepri et al. 1998). 
The FPU β-model is put in contact at its boundaries 
with heat baths at different temperatures T_+ 
and T_- (T_+ > T_-). Numerical simulations have been 
performed for sufficiently large applied thermal 
gradients, which guarantee sizeable fluctuation 
effects, suitable for verifying a relation like [11]. It 
is worth noticing that many of the constituent 
hypotheses of the Gallavotti-Cohen theorem are 
not valid for this setup, but eqn [12] is still expected 
to hold, although in this case it does not refer to the 
entropy production rate. Nonetheless, the extension 
[11] of the fluctuation-dissipation theorem can be 
tested, thanks to the following useful relation 
between the heat flux j and the entropy production 
rate, σ, at the chain boundaries: 


\langle\sigma\rangle = \langle j\rangle \left(\frac{1}{T_-} - \frac{1}{T_+}\right)    [13]


This can be interpreted as a balance relation for the 
global entropy production. In fact, according to the 
principles of irreversible thermodynamics, the local 
rate of entropy production σ(x) in the bulk is given by 


\sigma(x) = j(x)\,\frac{\partial}{\partial x}\left(\frac{1}{T(x)}\right)    [14]


By integrating this equation, one straightforwardly 
obtains the previous one, which then applies to the 
entropy production from the heat baths. Careful 
numerical simulations show that stationary condi- 
tions are found to hold over a wide range of 
temperatures and gradients. Equation [13] indicates 
that the heat flux is equivalent to the entropy 
production rate, apart from a multiplicative con- 
stant which depends on the amplitude of the applied 
field. 

Let us define the finite-time average of the global 
heat flux 


\bar{J}_\tau = \frac{1}{\tau}\int_0^{\tau} dt\, j(t)    [15]


The normalization of this quantity can be obtained 
by computing the asymptotic average value 


\bar{J}_\infty = \lim_{\tau\to\infty} \bar{J}_\tau    [16]


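The quantity of statistical interest is the normalized 
finite-time average global heat flux 


z = \bar{J}_\tau / \bar{J}_\infty    [17]


Accordingly, the fluctuation-dissipation relation in 
this case takes the form: 


\frac{1}{\tau}\,\ln\frac{P_\tau(z)}{P_\tau(-z)} = z\,\bar{J}_\infty \left(\frac{1}{T_-} - \frac{1}{T_+}\right)    [18]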


The conjecture that such a relation might be valid in 
this case has been confirmed by numerical analysis. 
It is worth stressing that, in this out-of-equilibrium 
setup, the probability distribution, P_τ(z), is not 
Gaussian and exhibits a peculiar asymmetric shape. 
Nonetheless, for increasing values of τ, the asymmetry 
progressively reduces, while P_τ(z) approaches 
a Gaussian shape. This observation indicates that, in 
this case, large fluctuations deviate from the typical 
statistics of independent events. 

It should be mentioned that generalized 
fluctuation-dissipation relations, like those discussed in this 
section, have been successfully checked in many other 
situations, where the hypotheses of the Gallavotti- 
Cohen theorem did not apply. The “robustness” of 
relations such as [11] and [12] indicates that a more 
general theory may be possible. 


Heat Transport 


The validity of Debye's conjecture about the 
necessity of nonlinear forces for obtaining a finite 
heat conductivity in crystals still remained an open 
problem after the unsuccessful FPU numerical 
experiment. The setup, described in the previous 
section for testing the generalized fluctuation- 
dissipation relation in the FPU chain, can be used 
also for tackling the verification of this conjecture. 
Indeed, the thermal conductivity, κ, of a chain of 
oscillators can be measured from Fourier's law 


J_Q = -\kappa\,\nabla T(x)    [19]


where J_Q is the heat current and ∇T(x) is the 
temperature gradient. 

This problem was solved analytically for a chain 
of N harmonic oscillators (Rieder et al. 1967). The 
bulk of the chain is found to reach thermal 
equilibrium conditions at the average temperature 
T = (T_+ + T_-)/2, corresponding to a flat 
temperature profile. Only at the chain boundaries does the 
harmonic chain exhibit a steep temperature gradient. 
This implies that the heat current is proportional 
to the temperature difference, rather than to 
the temperature gradient, thus violating Fourier’s 
law. Accordingly, a harmonic chain, made of N 
oscillators, in contact with two heat reservoirs at 
different temperatures, exhibits anomalous trans- 
port properties and the effective thermal conduc- 
tivity is found to diverge in the infinite-chain limit 
as κ ∼ N. This peculiar behavior is a consequence 
of the integrability of the harmonic chain 
dynamics. Indeed, the Fourier modes propagate 
with finite velocity through the harmonic chain, so 
that any energy injected from the hot reservoir 
flows ballistically to the cold one, rather than 
diffusing, as required for the validity of [19]. It is 
worth stressing that any integrable system should 
exhibit a similar scenario. This is the case of the 
equal-mass hard sphere gas in one dimension and 
of the Toda chain, where the harmonic potential 
(ω^2/2)(q_{i+1} − q_i)^2 is replaced by the nonlinear 
expression 


a \exp[-b\,(q_{i+1} - q_i)]


In the former case, integrability and ballistic 
propagation are straightforward consequences of 
the conservation laws inherent in elastic collisions 
between hard spheres. In the latter model, the 
normal nonlinear modes, called “Toda solitons,” 
are responsible for such anomalous behavior. 

Debye’s conjecture should be modified accord- 
ingly: nonintegrability of the equations of motion 
has to be invoked as a necessary property for 
explaining heat transport in real solids. Let us 
observe that the FPU model is known not to be 
integrable and it is expected to be a good candidate 
for confirming Debye’s conjecture, at least in its 
fully chaotic regime. Careful and extended numer- 
ical simulations have shown that the FPU chain 
maintains anomalous properties (Lepri et al. 1997). 
In particular, the thermal conductivity, κ, is found 
to diverge in the infinite-chain limit as 


\kappa \propto N^{\gamma}    [20]


with γ ≈ 2/5. This value agrees with independent 
analytic estimates (e.g., see Lepri et al. (2003)), 
although renormalization arguments indicate that 
one should rather find γ = 1/3 (Narayan and 
Ramaswamy 2002). This discrepancy could be due 
to the peculiar features associated with the presence 
of a quartic nonlinearity in the FPU problem and 
also to the fact that in the FPU chain heat can be 
transported only through longitudinal oscillations. 
In any case, this is still an open problem, which 
requires further theoretical advances to be solved. 
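
In practice, an exponent like γ in [20] is extracted from a log-log fit of the effective conductivity measured for several chain lengths. The Python sketch below illustrates the procedure on synthetic data only: the flux values, the prefactor, and the function names are assumptions, and no simulation results are reproduced.

```python
import numpy as np

def effective_conductivity(flux, n_sites, t_hot, t_cold):
    """Finite-size estimate kappa = J N / (T_+ - T_-) from Fourier's law."""
    return flux * n_sites / (t_hot - t_cold)

def divergence_exponent(sizes, kappas):
    """Slope of ln(kappa) versus ln(N), i.e., the exponent gamma of [20]."""
    slope, _ = np.polyfit(np.log(sizes), np.log(kappas), 1)
    return slope

sizes = np.array([64, 128, 256, 512, 1024])

# Ballistic (harmonic-like) case: the flux does not depend on N, so kappa ~ N.
kappa_ballistic = effective_conductivity(0.1, sizes, t_hot=1.5, t_cold=0.5)
gamma_ballistic = divergence_exponent(sizes, kappa_ballistic)

# Synthetic anomalous case built with exponent 0.4 by construction.
kappa_synthetic = 3.0 * sizes**0.4
gamma_synthetic = divergence_exponent(sizes, kappa_synthetic)
```

With real simulation data the fitted slope typically drifts with the range of sizes used, which is one reason why the asymptotic value of γ has been hard to pin down.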

In a more general perspective, the main outcome 
of these numerical studies indicates that a power- 
law divergence like [20] is found in all one- 
dimensional nonintegrable models. This general 
feature must be attributed to the combined effect 
of low-space dimensionality, with energy and 
momentum conservation. In such a situation, 
fluctuations are strongly constrained, so that the 
evolution of long-wavelength hydrodynamic modes 
is not sufficiently damped, to be ruled by diffusion 
(which is a necessary ingredient for the validity of 
[19]). It must be stressed that these numerical 
investigations have strongly revived interest in 
this problem. In particular, they have also stimulated 
new theoretical efforts for explaining the 
power-law divergence of transport coefficients in 
d = 1. One of the main results of these 
theoretical approaches is that the power-law 
divergence turns into a logarithmic one in d = 2, 
while the divergence should disappear in d ≥ 3. 
Despite the difficulty of performing the necessary 
large-scale simulations for such systems in d > 1, it 
seems that numerics essentially agree with such 
predictions. 

One can find normal transport properties even 
in d = 1, if suitable models are considered. For 
instance, momentum conservation can be broken 
by adding to the Hamiltonian [1] a local interaction 
potential, U(q_i), which breaks translation 
invariance, thus restoring finite heat conductivity 
(e.g., see Casati et al. 1984). The exception to this 
case is the harmonic chain with the addition of a 
local harmonic potential: in this case the dynamics 
is still integrable and there are as many conserved 
quantities as degrees of freedom. A further pecu- 
liar case is represented by the rotator model in 
d=1, which is known to be nonintegrable. Its 
Hamiltonian contains the interaction potential 
e[1 − cos(q_{i+1} − q_i)], replacing the algebraic 
potentials of the FPU chain. Such a Hamiltonian 
still guarantees momentum conservation, 
since the nearest-neighbor form of the interaction 
is maintained. Notice that, for small oscillations 
around the equilibrium position, the rotator 
potential also admits a Taylor-series expansion, whose 
leading terms are quadratic and quartic (the cubic 
term vanishes by symmetry), of the same algebraic 
kind as in the FPU chain. 
Nonetheless, unlike in the FPU problem, 
the potential of the rotator model is also bounded 
from above. Numerical investigations (Giardina 
et al. 2000) have shown that, for any finite energy 
density and after a sufficiently long time, 
some previously oscillating rotators start to rotate, 
due to local energy fluctuations that allow them to 
overcome the potential barrier. These dynamical 
configurations typically appear in the form of 
spatially localized, synchronous rotating clusters. 
Their time evolution is characterized by an 
intermittent behavior: they are eventually reab- 
sorbed by lattice fluctuations and may reappear 
afterwards at other lattice positions. In this way 
they play the role of scattering centers for 
hydrodynamic modes. It must be pointed out that 
such a qualitative argument is not sufficient for 
explaining the onset of a genuine diffusive beha- 
vior, compatible with the validity of Fourier’s law. 
A hydrodynamic theory, still to be developed, 
could provide a more convincing insight on these 
results. 

It is worth concluding this section by mentioning 
that the overall scenario described above is con- 
firmed by numerical studies, relying upon a different 
approach, based on equilibrium measurements. 
Indeed, the linear response theory by Green and 
Kubo (see Kubo (1985)) provides an alternative, but 
essentially equivalent, definition of the thermal 
conductivity, according to the expression 


\kappa = \frac{1}{k_B T^2} \lim_{t\to\infty} \lim_{N\to\infty} \frac{1}{N} \int_0^t d\tau\, \langle J(\tau) J(0) \rangle    [21]


The crucial quantity to be computed numerically is 
the heat-flux time-correlation function C_J(τ) = 
⟨J(τ)J(0)⟩, where ⟨ ⟩ represents the thermodynamic 
equilibrium average. In practice, numerical simulations 
can be performed for a chain of N oscillators 
in contact with boundary heat reservoirs at the same 
temperature T = T_+ = T_-. The presence of anomalous 
transport coefficients can be singled out by 
analyzing the long-time behavior of C_J(τ). It has to 
decay at least as τ^{-(1+ξ)}, with ξ > 0, to yield a finite 
heat conductivity. In one-dimensional models 
exhibiting the power-law divergence [20] one rather 
finds 


C_J(\tau) \sim \tau^{-(1-\gamma)}    [22]


where the positive exponent γ is the same appearing 
in [20]. This relation between space and time 
exponents can be easily explained, by considering 
that space and time variables depend linearly on 
each other through a proportionality constant, 
which is the velocity of sound in the lattice. Since 
0 < γ < 1, the anomalous behavior observed in 
out-of-equilibrium conditions is recovered. 
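
The link between the slow decay [22] and the divergence [20] can be illustrated numerically: integrating a pure power law τ^{-(1-γ)} up to time t gives (t^γ − 1)/γ, which grows without bound. The Python sketch below checks this with a trapezoidal rule on a logarithmic grid; the pure power-law form, the lower cutoff at τ = 1, and the function name are illustrative assumptions.

```python
import numpy as np

def green_kubo_partial(gamma, t_max, n_points=20001):
    """Integral of tau**(-(1 - gamma)) from 1 to t_max (trapezoidal rule)."""
    tau = np.logspace(0.0, np.log10(t_max), n_points)
    c = tau ** (-(1.0 - gamma))
    return np.sum(0.5 * (c[1:] + c[:-1]) * np.diff(tau))

gamma = 0.4
i1 = green_kubo_partial(gamma, 1e3)
i2 = green_kubo_partial(gamma, 1e4)

# Local growth exponent of the integral over one decade of cutoff times;
# it approaches gamma from above as the cutoff increases.
local_exponent = np.log(i2 / i1) / np.log(10.0)
```

If the cutoff time is identified with the time a sound wave needs to cross the chain, t ∼ N, the growth of the integral as t^γ reproduces the size dependence [20].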

One major problem in performing proper numer- 
ical studies concerns the control over finite-size 
effects, which demands a consistent increase of the 
integration time with the system size. This may 
yield very extended and expensive computations, 
mainly when very slow relaxation processes set in. 
This is the case of the low-energy regime originally 
studied by FPU in their pioneering computer 
simulations. Numerical analysis indicates that in 
this regime the expected behavior of C_J(τ), reported 
in eqn [22], sets in after a crossover time τ_c, which 
increases, for decreasing energy density ε, as an 
inverse power of ε. 
This seems to be compatible with the studies 
described earlier. 

We conclude this section by pointing out that this 
result also contributes significantly to clarify one of 
the basic questions raised by the FPU numerical 
experiment. 


See also: Dynamical Systems and Thermodynamics; 
Ergodic Theory; Fourier Law; Gravitational N-Body 
Problem (Classical); Lyapunov Exponents and Strange 
Attractors; Nonequilibrium Statistical Mechanics: 
Dynamical Systems Approach. 
Historical Background 
Ginzburg-Landau Equations 


Nonlinear Schrödinger (NLS) equations have 
become one of the most important nonlinear systems 
studied in mathematics and physics. Actually, one 
can find the essence of NLS equations in the early 
work of Ginzburg and Landau (1950) and Ginzburg 
(1956) in their study of the macroscopic theory of 
superconductivity, and also of Ginzburg and Pitaevskii 
(1958), who subsequently investigated the theory of 
superfluidity. 

By minimizing the free energy of a superconductor 
near the superconducting transition, Ginzburg and 
Landau arrived at what are now called the 
Ginzburg—Landau equations: 


$$\frac{1}{2m}\left(-i\hbar\nabla - \frac{e}{c}\mathbf{A}\right)^2\psi + \alpha\psi + \beta|\psi|^2\psi = 0 \qquad [1]$$
where $\alpha$, $\beta$ are phenomenological parameters, A the
electromagnetic vector potential, and $\psi^*$ denotes the
complex conjugate of $\psi$. The first equation determines
the field $\psi$ based on the applied magnetic
field. The second equation provides the superconducting
current J.

The equation describing the behavior of super- 
fluid helium near the transition point in the 
stationary case derived in Ginzburg and Pitaevskii 
(1958) is completely analogous to eqn [1] in the 
phenomenological theory of superconductivity. 

Equation [1] contains all the ingredients of the 
NLS equations which are discussed below. However,
it was not until the 1960s that the wide
physical importance of the NLS equation became
evident. The next section discusses how the NLS
equation historically first appeared in the context of 
nonlinear optics. 


Nonlinear Optics: Self-Focusing of Optical Beams 
in Nonlinear Media 


In the mid-1960s, Chiao et al. (1964) and Talanov 
(1964) investigated the conditions under which an 
electromagnetic beam can produce its own dielectric
waveguide and propagate without spreading. This is
a reflection of the phenomenon of self-focusing. In 
fact, self-focusing of optical beams may occur in 
materials whose dielectric constant increases with 
field intensity. In the general situation, a beam of 
uniform intensity in a dielectric broadens due to 
diffraction. However, the refractive index of many 
physically important materials (the so-called Kerr 
materials, such as silica) depends on the field 
intensity as follows: 


$$n = n_0 + n_2|E|^2 + \cdots$$


If the term $n_2|E|^2$ is large enough, the critical angle
for total internal reflection at the beam's boundary 
can be greater than the angular divergence due to 
diffraction; thus, spreading does not occur as a 
result of diffraction. As a consequence, a beam 
above a certain critical power level is trapped and 
does not spread. 

In a remarkable contribution, Kelley (1965) 
observed, using computational methods (years 
before computational methods became easy to 
implement and, consequently, so popular) that 
when the self-focusing effect due to the increase in 
the nonlinear index is not compensated by diffrac- 
tion, there is a buildup in intensity of part of the 
beam as a function of the distance in the direction 
of propagation. Consequently, the intensity of the 
self-focused regions tended to become "anomalously
large," that is, a singularity appeared to
develop. 

Consider as starting equation the electromagnetic 
wave equation in the presence of nonlinearities 
derived earlier by Chiao et al. (1964): 


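$$\nabla^2 E - \frac{\epsilon_0}{c^2}\frac{\partial^2 E}{\partial t^2} - \frac{\epsilon_2}{c^2}\frac{\partial^2 (|E|^2 E)}{\partial t^2} = 0 \qquad [3]$$


where $\epsilon_2|E|^2 \ll 1$. One assumes a linearly polarized
wave of frequency $\omega$, propagating along the z-axis,
so that


$$E = \tfrac{1}{2}\left(\mathcal{E}e^{i(kz - \omega t)} + \text{c.c.}\right)\hat{e}$$


where c.c. denotes complex conjugation, $k = \epsilon_0^{1/2}\omega/c$,
the factor $\exp[i(kz - \omega t)]$ represents the propagating
part, that is, the "carrier," of the wave, and $\mathcal{E}$ is the
slowly varying part. Substituting the above expression
for E into eqn [3], neglecting the third-harmonic
term and the term $\partial^2\mathcal{E}/\partial z^2$ from $\nabla^2 E$ (assuming it to be
small), yields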

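$$2ik\frac{\partial\mathcal{E}}{\partial z} + \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\mathcal{E} + \frac{3}{4}\frac{\epsilon_2}{\epsilon_0}k^2|\mathcal{E}|^2\mathcal{E} = 0 \qquad [4]$$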
or, with a suitable rescaling of the dependent and
independent variables (E — / ((3/4)&e; /«9) ^, 
z — 2kz), 


$$i\psi_z + \nabla^2\psi + 2|\psi|^2\psi = 0 \qquad [5]$$


which is the NLS equation in standard nondimen- 
sional form. 

It should be remarked here that the name NLS 
equation for equations of the form of [5] is natural 
due to the formal analogy with the Schrödinger 
equation in quantum mechanics: 


$$i\psi_t + \nabla^2\psi + V\psi = 0 \qquad [6]$$


If one sets $V = 2|\psi|^2$ in eqn [6], the result is the NLS
equation. In the context of quantum mechanics, a
nonlinear potential arises in the “mean-field” 
description of interacting particles. 

Modifications of [6] also arise as mean-field
descriptions of Bose-Einstein condensates, which are
of keen interest in physics (see Pethick and Smith
(2002) and references therein). The normalized
equation is


iy — V^v + (Vy) + 2, v -0 [7 


where V is an external potential. This is generally 
referred to as the Gross-Pitaevskii equation. 

Talanov (1965) (see also Zakharov et al. (1971)) 
investigated the behavior of stationary light beams 
in a self-focusing nonlinear medium and found that 
for a purely cubic nonlinearity, “collapse” of the 
beam can take place. The proof that there is a 
singularity in eqn [5] is remarkably straightforward. 
This is discussed in the section “Wave collapse.” In 
order to avoid wave collapse, other physical effects 
(e.g., saturable nonlinearity or dissipation) are 
required. 


Universal Character of the NLS Equation 


It turns out that almost any dispersive, energy- 
preserving system gives rise, in an appropriate limit, 
to the NLS equation. For instance, one can derive 
the NLS from other physically significant equations 
such as the Klein-Gordon equation 


Hg — Uyy HU + kuw? = 0 
and the Korteweg-de Vries (KdV) equation
$$u_t + 6uu_x + u_{xxx} = 0$$


Actually, the NLS equation provides a “canonical” 
description for the envelope dynamics of a quasi- 
monochromatic plane wave (the carrier wave) 
propagating in a weakly nonlinear dispersive med- 
ium when dissipative processes are negligible. 
Indeed, consider a scalar nonlinear wave equation
written symbolically as


$$L(\partial_t, \nabla)u + G(u) = 0$$


where L is a linear differential operator with 
constant coefficients and G a nonlinear function 
of u and its derivatives. For a real, small-amplitude
solution of magnitude $\epsilon \ll 1$, the nonlinear
effects can first be neglected, and the
equation admits approximate monochromatic
wave solutions


$$u = \epsilon\psi e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)} + \text{c.c.} \qquad [8]$$


with small amplitude $\epsilon|\psi|$. Substituting [8] into the
linear equation, one can find that the frequency $\omega$
and the wave vector k are related by the dispersion
relation


$$L(-i\omega, i\mathbf{k}) = 0$$
Let
$$\omega = \omega(\mathbf{k})$$


be one of the solutions of the previous equation. 
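Suppose one is interested in a solution $\psi$ which is
not constant, but slowly varying in space and time.
This has the interpretation of k having a "sideband"
wave vector and $\omega$ a "sideband" frequency. More
precisely, restricting discussion, for simplicity, to the
(1 + 1)-dimensional case, the slowly varying amplitude
assumption corresponds to letting


$$\psi(x,t) = \psi(X,T) = \psi_0 e^{i(KX - \Omega T)}$$


where $X = \epsilon x$ and $T = \epsilon t$. Note that $\epsilon K$ and
$\epsilon\Omega$ are sometimes referred to as the sideband
wave number and frequency, respectively, because
they correspond to a deviation from the central
wave number k and central frequency $\omega$. Looking at
these deviations from the point of view of operators,
whereby $\omega \to i\partial_t$, $k \to -i\partial_x$, and $\Omega \to i\partial_T$, $K \to -i\partial_X$,
one has


$$\omega_{\text{tot}} \sim \omega + \epsilon\Omega = \omega + i\epsilon\partial_T$$
$$k_{\text{tot}} \sim k + \epsilon K = k - i\epsilon\partial_X$$


Then $\omega(k)$ can be expanded in a Taylor series
around the central wave number as


$$\omega(k - i\epsilon\partial_X) \sim \omega(k) - i\epsilon\omega'(k)\partial_X - \frac{\epsilon^2}{2}\omega''(k)\partial_X^2 + \cdots$$
Therefore,
$$i\epsilon\frac{\partial\psi}{\partial T} = \left(-i\epsilon\omega'(k)\partial_X - \frac{\epsilon^2}{2}\omega''(k)\partial_X^2 + \cdots\right)\psi$$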
which shows that, to the leading order,


$$i\left(\frac{\partial\psi}{\partial T} + \omega'(k)\frac{\partial\psi}{\partial X}\right) + \epsilon\frac{\omega''(k)}{2}\frac{\partial^2\psi}{\partial X^2} = 0 \qquad [9]$$
In the moving frame $\xi = X - \omega'(k)T$, $\tau = \epsilon T = \epsilon^2 t$,
eqn [9] transforms to


$$i\frac{\partial\psi}{\partial\tau} + \frac{\omega''(k)}{2}\frac{\partial^2\psi}{\partial\xi^2} = 0$$


which is the linear Schrödinger equation with the
canonical $\omega''(k)/2$ coefficient. On the other hand, if
one considers rather general conservative nonlinear
wave problems with leading quadratic or cubic
nonlinearity, asymptotic analysis (e.g., multiple
scale analysis which yields the so-called Stokes-
Poincaré frequency shift) shows that a wave solution
of the form


$$u(x,t) = \epsilon\psi(\tau)e^{i(kx - \omega t)} + \text{c.c.}$$


with $\tau = \epsilon^2 t$ has $\psi(\tau)$ satisfying
$$i\frac{\partial\psi}{\partial\tau} + n|\psi|^2\psi = 0 \qquad [10]$$


where the constant coefficient n depends on the
particular equation under study. It should be
remarked here that cubic nonlinearity yields an
$O(\epsilon^3)$ contribution, which is balanced by a slow
timescale of order $\epsilon^2$. Putting the linear and nonlinear
effects together (i.e., eqns [9] and [10]) implies
that an NLS equation of the form


$$i\frac{\partial\psi}{\partial\tau} + \frac{\omega''(k)}{2}\frac{\partial^2\psi}{\partial\xi^2} + n|\psi|^2\psi = 0$$


naturally arises. The NLS equation is viewed as a
"universal" equation as it generically governs the 
slowly varying envelope of a monochromatic wave 
train (see also Benney and Newell (1969)). 


Physical Applications 


The nonlinear propagation of wave packets is 
governed by NLS-type systems in several different 
branches of scientific and technological applications, 
beyond what has been mentioned earlier. Some of 
these applications are discussed below. 


NLS Equation in Water Waves


The NLS equation in the context of small-amplitude 
water waves was derived by Zakharov (1968) 
(infinite depth) and Benney and Roskes (1969) 
(finite depth). The procedure for deriving the NLS 
equation from the Euler-Bernoulli equations of fluid 
dynamics in one horizontal direction will now be 
discussed, under the assumption of small-amplitude 


waves and deep water. The interested reader can 
also find the details of the derivation in Ablowitz 
and Clarkson (2006). The relevant equations are 


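$$\phi_{xx} + \phi_{zz} = 0, \qquad -\infty < z < \epsilon\eta(x,t) \qquad [11]$$


$$\phi_z \to 0, \qquad z \to -\infty \qquad [12]$$


$$\phi_t + \frac{\epsilon}{2}\left(\phi_x^2 + \phi_z^2\right) + g\eta = 0, \qquad z = \epsilon\eta \qquad [13]$$


$$\eta_t + \epsilon\eta_x\phi_x = \phi_z, \qquad z = \epsilon\eta \qquad [14]$$


where $\phi$ is the velocity potential of an ideal
(i.e., incompressible, irrotational, and inviscid)
fluid, and $\eta(x,t)$ is the free surface of the fluid, which
is to be found, in addition to $\phi(x,z,t)$.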

Equation [11] expresses the ideal nature of the 
fluid; the condition [12] expresses the requirement 
that there is no vertical flow at infinity; and eqn [13] 
is the Bernoulli equation of energy conservation. 
Finally, eqn [14] is a kinematic condition stating 
that no flow occurs transverse to the free surface. 

At the free boundary, for small amplitudes, one
can expand $\phi = \phi(t,x,\epsilon\eta)$ for $\epsilon \ll 1$ as

$$\phi = \phi(t,x,0) + \epsilon\eta\,\phi_z(t,x,0) + \frac{(\epsilon\eta)^2}{2}\phi_{zz}(t,x,0) + \cdots$$
and similarly for the derivatives. Second, one
introduces slow temporal and spatial scales (one
expects the slowly varying envelope of the wave to
depend on slow variables $X = \epsilon x$, $Z = \epsilon z$, $T = \epsilon t$).
Finally, because of the quadratic nonlinearity one 
expects second harmonics to be generated; hence, 


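$$\phi = \left(Ae^{i\theta + |k|z} + \text{c.c.}\right) + \epsilon\left(A_2e^{2(i\theta + |k|z)} + \text{c.c.} + \bar{\phi}\right)$$
$$\eta = \left(Be^{i\theta} + \text{c.c.}\right) + \epsilon\left(B_2e^{2i\theta} + \text{c.c.} + \bar{\eta}\right)$$


where $A$, $A_2$, $\bar{\phi}$ depend on X, Z, T and $B$, $B_2$, $\bar{\eta}$
depend on X, T ($\bar{\phi}$ and $\bar{\eta}$ are mean contributions,
which are real) and $\theta = kx - \omega t$ with the dispersion
relation $\omega^2 = g|k|$. Substituting this ansatz into the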
equations, one obtains from the order-c&? terms 


v^ 2k4 
ZiwA. =|“ An+—lArPA] =6 1 
IWwA (3 ae |A| [15] 


where $v_g = \omega'(k) = g/2\omega$ is the group velocity and the
new variables are $\tau = \epsilon T$, $\xi = X - v_g T$.

Equation [15] is the typical formulation of the 
(1 + 1)-dimensional NLS equation found in water 
wave theory for large depth. 
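
The group velocity quoted above follows directly from the deep-water dispersion relation, and a few lines of Python make the relation concrete. This is only an illustrative sketch: the value of g and the sample wave number are assumptions chosen for the example, and the finite-difference helper is a hypothetical cross-check, not part of the derivation.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value for illustration)

def omega(k, g=G):
    """Deep-water dispersion relation omega^2 = g|k| (positive branch)."""
    return math.sqrt(g * abs(k))

def group_velocity(k, g=G):
    """Closed form v_g = omega'(k) = g / (2 omega), valid for k > 0."""
    return g / (2.0 * omega(k, g))

def numerical_derivative(f, k, dk=1e-5):
    """Centered finite difference, used only to cross-check the closed form."""
    return (f(k + dk) - f(k - dk)) / (2.0 * dk)

k = 2.0  # wave number in rad/m (hypothetical sample value)
v_g_exact = group_velocity(k)
v_g_numeric = numerical_derivative(omega, k)
# Differentiating v_g once more approximates omega''(k), the dispersion
# coefficient that multiplies the second-derivative term of the envelope equation.
curvature = numerical_derivative(group_velocity, k)
```

For this dispersion relation the curvature is negative and equals $-\omega/(4k^2)$, which the finite-difference value reproduces.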

In the section “NLS in nonlinear optics,” a 
special solution to (a rescaled version of) eqn [15], 
namely a soliton solution, is discussed in the 


context of nonlinear optics. It should be 
remarked here that the coefficients of both terms 
$A_{\xi\xi}$ and $|A|^2A$ have the same sign. This is necessary
for a decaying soliton solution to exist (see, e.g., 
Lighthill (1965)). 


NLS in Nonlinear Optics 


The NLS equation also describes self-compression 
and self-modulation of electromagnetic wave pack- 
ets in weakly nonlinear media. Hasegawa and 
Tappert (1973a, b) first derived the NLS equation 
in the context of fiber optics. Light-wave propaga- 
tion in a fiber is mainly affected by: (1) group 
velocity dispersion (GVD), that is, the frequency 
dependence of the group velocity originating from 
the refractive index of the fiber and (2) fiber 
nonlinearity (the so-called Kerr effect), originating 
from the dependence of the refractive index on the 
intensity of the optical pulse. In the presence of 
GVD and Kerr nonlinearity, the refractive index is 
expressed as 


$$n(\omega, E) = n_0(\omega) + n_2|E|^2 \qquad [16]$$


where $\omega$ and E represent the frequency and
electric field of the light wave, respectively, $n_0(\omega)$
is the frequency-dependent linear refractive index,
and the constant $n_2$, referred to as the Kerr
coefficient, is "small" but can have significant
impact since the nonlinear effects accumulate over
long distances. Normally, the electric field is
modulated into a slowly varying amplitude of a
carrier wave:


$$E(z,t) = \mathcal{E}(z,t)e^{i(k_0z - \omega_0t)} + \text{c.c.} \qquad [17]$$


where z denotes the distance along the fiber, t the
time, $k_0 = k_0(\omega_0)$ the wave number, $\omega_0$ the frequency,
and $\mathcal{E}(z,t)$ the envelope of the electromagnetic
field.

A Taylor series expansion of the dispersion 
relation (see also the section “Universal character 
of the NLS equation") 


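$$k(\omega, E) = \frac{\omega}{c}\left(n_0(\omega) + n_2|E|^2\right)$$


around the carrier frequency $\omega = \omega_0$ yields

$$k - k_0 = k'(\omega_0)(\omega - \omega_0) + \frac{k''(\omega_0)}{2}(\omega - \omega_0)^2 + \frac{\omega_0 n_2}{c}|E|^2 \qquad [18]$$
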
where the prime represents derivative with respect to
$\omega$ and $k_0 = k(\omega_0)$. Replacing $k - k_0$ and $\omega - \omega_0$ by
their Fourier operator equivalents, $-i\partial_z$ and $i\partial_t$, respectively,
using $k_0 = (\omega_0/c)n_0(\omega_0)$, and letting eqn [18]
operate on $\mathcal{E}$ yields


$$i\left(\frac{\partial}{\partial z} + k'(\omega_0)\frac{\partial}{\partial t}\right)\mathcal{E} - \frac{k''(\omega_0)}{2}\frac{\partial^2\mathcal{E}}{\partial t^2} + \nu|\mathcal{E}|^2\mathcal{E} = 0 \qquad [19]$$


where $\nu = \omega_0 n_2/(cA_{\text{eff}})$, with $A_{\text{eff}}$ being the effective
cross-section area of the fiber (the factor $1/A_{\text{eff}}$
comes from a more detailed derivation which takes
into account the finite size of the fiber; it is
needed in order to account for the variation
of field intensity in the cross section of the fiber).
Note that $k'(\omega_0) = 1/v_g$, where $v_g$ represents the
group velocity of the wave train. Introducing dimensionless
variables $t' = t_{\text{ret}}/t_*$, $z' = z/z_*$, $q = \mathcal{E}/\sqrt{P_*}$,
yields the NLS equation


$$i\frac{\partial q}{\partial z'} + \frac{\operatorname{sgn}(-k''(\omega_0))}{2}\frac{\partial^2 q}{\partial t'^2} + |q|^2q = 0 \qquad [20]$$

where $t_*$, $P_*$ are the characteristic time and power,
respectively, and $t_{\text{ret}} = t - k'(\omega_0)z = t - z/v_g$, $z_* = 1/(\nu P_*)$,
with the constraint that the "nonlinear
length" is balanced by the linear dispersion time,
that is, $t_* = (z_*|-k''(\omega_0)|)^{1/2}$.

There are two cases of physical interest depending
on the sign of $k''(\omega_0)$. The so-called focusing case occurs
when $k''(\omega_0) < 0$; this is called "anomalous" dispersion.
The defocusing case obtains when the dispersion is
"normal": $k''(\omega_0) > 0$.

Now write eqn [20] in the form


$$iq_t + q_{xx} \pm 2|q|^2q = 0 \qquad [21]$$


with $\pm$ corresponding to the focusing (+) and
defocusing (-) case, respectively. The focusing NLS
equation admits special solutions called "bright"
solitons (solutions that are traveling localized
"humps"). A pure one-soliton solution in the
focusing (+) case has the form


$$q(x,t) = \eta\,\mathrm{sech}[\eta(x + 2\xi t - x_0)]\,e^{-i\Theta} \qquad [22]$$


where $\Theta = \xi x + (\xi^2 - \eta^2)t + \Theta_0$. The parameters $\xi$
and $\eta$ are such that $\lambda = \xi/2 + i\eta/2$ is an eigenvalue
from the inverse scattering transform analysis.
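
As a sanity check on the one-soliton formula, one can evaluate the focusing equation $iq_t + q_{xx} + 2|q|^2q = 0$ on the profile numerically. The sketch below is only an illustration: the parameter values, the step size h, and the helper names are assumptions made for the example, and the derivatives are approximated by centered finite differences rather than computed exactly.

```python
import cmath
import math

def bright_soliton(x, t, eta=1.0, xi=0.5, x0=0.0, theta0=0.0):
    """Profile eta*sech(eta*(x + 2*xi*t - x0)) * exp(-i*Theta), as in eqn [22]."""
    theta = xi * x + (xi**2 - eta**2) * t + theta0
    envelope = eta / math.cosh(eta * (x + 2.0 * xi * t - x0))
    return envelope * cmath.exp(-1j * theta)

def nls_residual(q, x, t, h=1e-3):
    """Finite-difference estimate of i q_t + q_xx + 2|q|^2 q at the point (x, t)."""
    q_t = (q(x, t + h) - q(x, t - h)) / (2.0 * h)
    q_xx = (q(x + h, t) - 2.0 * q(x, t) + q(x - h, t)) / h**2
    q0 = q(x, t)
    return 1j * q_t + q_xx + 2.0 * abs(q0)**2 * q0

# Largest residual over sample points x in [-5, 5] at the (arbitrary) time t = 0.3.
max_residual = max(abs(nls_residual(bright_soliton, n / 4.0, 0.3))
                   for n in range(-20, 21))
```

The residual is limited only by the finite-difference truncation error, which is consistent with [22] solving the focusing case of [21].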

The defocusing (—) NLS equation does not admit 
solitons that decay at infinity. However, it does admit 
soliton solutions which have a nontrivial background 
intensity (called "dark" and "gray" solitons). A dark-
soliton solution has the form


$$q(x,t) = \eta\tanh(\eta x)\,e^{-2i\eta^2 t} \qquad [23]$$


Note that $q \to \pm\eta$ as $x \to \pm\infty$. A gray-soliton
solution is


$$q(x,t) = \eta\left[1 - B^2\,\mathrm{sech}^2(\eta B(x - x_0))\right]^{1/2}e^{i\phi(x,t)} \qquad [24]$$
with
$$\phi(x,t) = -\eta^2(3 - B^2)t + \eta\sqrt{1 - B^2}\,x + \tan^{-1}\left[\frac{B\tanh(\eta B(x - x_0))}{\sqrt{1 - B^2}}\right] + \phi_0$$
and $|B| < 1$. Note that as $B \to 1$, the gray soliton
becomes a dark soliton, taking $\phi_0 = -\pi/2$.


Recall that the solutions [23] and [24] can be 
allowed to travel uniformly by making a Galilean 
transformation, that is, taking into account that if 
q1(x,t) is a solution of [21], then so is 


q(x,t) = qi(x — vt, t) ex 


with k= —v and w= —k?/2. 

It should also be remarked that Ablowitz et al. 
(1997) have shown that, in quadratically nonlinear 
optical materials, more complicated NLS-type equa- 
tions arise. These equations are analogous to the 
finite-depth multidimensional nonlocal NLS-type 
systems derived in the context of water waves by 
Benney and Roskes (1969) and later by Davey and
Stewartson (1974). 


Optical Communications 


Hasegawa and Tappert (1973) first suggested using 
solitons as the “bit” format for transmission of 
information in optical fiber systems. Motivated by 
this, in 1980, scientists at Bell Laboratories observed 
solitons (described by the NLS equation) in optical 
fibers (Mollenauer et al. 1980). The development of 
optical amplifiers (erbium-doped amplifiers) in the 
mid-1980s provided a mechanism to compensate 
fiber loss, and this permitted the transmission of 
information entirely optically over long distances. 
With damping and amplification included (see, e.g., 
Hasegawa and Kodama (1995)), the NLS equation 
[20] takes the form 


$$i\frac{\partial q}{\partial z} + \frac{\operatorname{sgn}(-k''(\omega_0))}{2}\frac{\partial^2 q}{\partial t^2} + g(z)|q|^2q = 0 \qquad [25]$$


where $g(z) = a_0^2\exp(-2\Gamma z/z_a)$, $0 < z < z_a$, and periodically
extended thereafter, and $a_0$ is determined by


$$\langle g\rangle = \frac{1}{z_a}\int_0^{z_a} g(z/z_a)\,dz = 1$$


with $z_a = l_a/z_*$, $l_a$ being the amplifier length.
Remarkably, asymptotic analysis ($z_a \ll 1$) shows
that, to leading order, q(z,t) still satisfies the NLS
equation [20].

Amplifiers, however, introduce small amounts of 
noise to the system, which causes the temporal 
position of the soliton to fluctuate (cf. Gordon and 
Haus (1986)) and thus limits the distance over which signals can
be reliably transmitted. Soliton control mechanisms
were introduced in the early 1990s in order to
deal with these difficulties (cf. Mecozzi et al. (1991) 
and Kodama and Hasegawa (1992)). 

By the mid-1990s, the development of all-optical
transmission systems began to take great advantage
of wavelength-division-multiplexing (WDM), that
is, the simultaneous transmission of multiple
signals in different frequency (or equivalently
wavelength) "channels" (Hasegawa 2000). How-
ever, it was found that a serious problem affected 
WDM systems. Namely, the interactions of soli- 
tons traveling at different velocities cause resonant 
amplifier-induced instabilities in adjacent fre- 
quency channels (four-wave mixing (Mamyshev 
and Mollenauer 1996, Ablowitz et al. 1996)). In 
order to avoid these instabilities, researchers 
developed and analyzed dispersion-managed (DM) 
transmission systems (cf. Hasegawa (2000)). In a 
DM transmission system, the fiber is composed of 
alternating sections of positive (normal) and 
negative (anomalous) dispersion fibers. The 
(dimensionless) NLS equation that governs this 
phenomenon is 

$$i\frac{\partial q}{\partial z} + \frac{d(z)}{2}\frac{\partial^2 q}{\partial t^2} + g(z)|q|^2q = 0 \qquad [26]$$
where d(z) is usually taken to be a periodic, large,
rapidly varying function of the form $d(z) = \delta_a + \Delta(z)$,
with $|\Delta(z)| \gg 1$ and having zero average in
the period $z_a$ (generally the same as that of the
amplifier). In fact, asymptotic analysis of [26]
yields a nonlocal NLS-type equation (Gabitov and 
Turitsyn 1996, Ablowitz and Biondini 1998). It has 
also been shown that eqn [26] admits various types 
of optical pulses, such as DM solitons (Ablowitz 
and Biondini 1998), and quasilinear modes (Ablowitz 
et al. 2001). 


NLS Equation in Other Settings 


Many other interesting applications of the NLS 
equations exist in such different areas of physics as 
magnetic spin waves (see, e.g., the work by Zvezdin 
and Popkov (1983) and also by Kalinikos et al. 
(1997)), plasma physics (cf. the work by Zakharov 
(1972) on collapse of Langmuir waves), other areas 
of fluid dynamics, etc. (the interested reader can 
find an overview in the monograph by Ablowitz 
(1981)). 


Mathematical Framework 


Mathematically, the NLS equation has attained
broad significance since it is integrable via 


inverse-scattering transform (IST), admits multisoliton 
solutions, has an infinite number of conserved 
quantities, and possesses many other interesting 
properties. Some of these are discussed below. 


The Inverse-Scattering Transform 


The IST method allows one to linearize a large class 
of nonlinear evolution equations and can be con- 
sidered as a nonlinear version of the Fourier trans- 
form. An essential prerequisite of the IST method is the
association of the nonlinear evolution equation with 
a pair of linear problems (Lax pair), a linear 
eigenvalue problem, and a second associated linear 
problem, such that the given equation results as a 
compatibility condition between them. A key 
research breakthrough on NLS systems appeared in 
1972, in the papers of Zakharov and Shabat (1972, 
1973), who first analyzed the scalar NLS equation 
in the form 


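$$iq_t = q_{xx} \pm 2|q|^2q \qquad [27]$$


($\pm$ correspond to the focusing/defocusing case,
respectively) and found the associated Lax pair


$$v_x = \begin{pmatrix} -ik & q \\ \mp q^* & ik \end{pmatrix} v \qquad [28]$$
$$v_t = \begin{pmatrix} 2ik^2 \mp i|q|^2 & -2kq - iq_x \\ \pm 2kq^* \mp iq^*_x & -2ik^2 \pm i|q|^2 \end{pmatrix} v \qquad [29]$$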


where v(x,t) is a two-component vector. The 
compatibility of [28] and [29] yields eqn [27], 
assuming that the eigenvalue parameter k is 
constant in time (so that [27] is often said to be 
isospectral).

The solution of the initial-value problem of a 
nonlinear evolution equation by IST proceeds in 
three steps, as follows: 


1. the forward problem — the transformation of the 
initial data from the original “physical” variables 
to the transformed “scattering” variables; 

2. time dependence — the evolution of the trans- 
formed data according to simple, explicitly 
solvable evolution equations; and 

3. the inverse problem — the recovery of the evolved 
solution in the original variables from the 
evolved solution in the transformed variables. 


The implementation of steps 1-3 described above is 
more concretely carried out as follows. The initial 
(Cauchy) datum q(x,0) for eqn [27] is mapped into 
scattering data S(k, 0) (comprising, in general, discrete 
eigenvalues and associated normalization constants, 
and reflection coefficients) by means of eqn [28]. The 
data S(k, 0) are evolved via eqn [29] to get S(k, t) at an
arbitrary time t > 0. Finally, by employing the
methods of inverse scattering, eqn [28] allows one to
reconstruct the evolved solution q(x, t) from S(k, t).

One can easily note the “formal” resemblance to 
the well-known method of Fourier transform for 
linear differential equations. 
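
The resemblance can be made concrete for the linear part of [27], $iq_t = q_{xx}$, on a periodic grid, where the same three steps are carried out with an ordinary discrete Fourier transform. The sketch below is an illustration of the linear analogue only, not of the IST itself; the grid size, the sample time, and the function names are assumptions made for the example.

```python
import cmath
import math

def dft(values, sign):
    """Plain O(N^2) discrete Fourier sum with kernel exp(sign * 2*pi*i*k*j/N)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * k * j / n)
                for j, v in enumerate(values))
            for k in range(n)]

def solve_linear_schrodinger(q0, t):
    """Solve i q_t = q_xx on a 2*pi-periodic grid: forward, evolve, inverse."""
    n = len(q0)
    # Step 1 (forward problem): physical data -> Fourier coefficients.
    coeffs = [c / n for c in dft(q0, -1)]
    # Step 2 (time dependence): each mode obeys i c_t = -k^2 c,
    # so c(t) = c(0) * exp(i k^2 t), an explicitly solvable evolution.
    evolved = []
    for idx, c in enumerate(coeffs):
        k = idx if idx <= n // 2 else idx - n  # signed integer wave number
        evolved.append(c * cmath.exp(1j * k * k * t))
    # Step 3 (inverse problem): Fourier coefficients -> physical solution.
    return dft(evolved, +1)

n_grid = 16
grid = [2.0 * math.pi * j / n_grid for j in range(n_grid)]
initial = [cmath.exp(2j * x) for x in grid]       # single mode with k = 2
final = solve_linear_schrodinger(initial, 0.7)    # sample time t = 0.7
```

For the single mode used here the result is the initial datum multiplied by $e^{ik^2t}$ with k = 2. In the IST the Fourier coefficients are replaced by the scattering data S(k, t), and the simple phase evolution of step 2 is the feature that carries over.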

There is considerable literature on the subject and 
the interested reader is encouraged to consult, for 
instance, some of the following references: Ablowitz 
and Segur (1981), Calogero and Degasperis (1982), 
Novikov et al. (1984), Ablowitz and Clarkson 
(1991), Ablowitz et al. (2004). 


Linear Stability Analysis 


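Consider a special solution of eqn [27] in the
focusing (+ sign) case: $q = a\exp(-2ia^2t)$. If this
solution is perturbed as

$$q(x,t) = ae^{-2ia^2t}(1 + \epsilon(x,t))$$


where $|\epsilon| \ll 1$, it is found that $\epsilon$ satisfies the
condition


$$i\epsilon_t = \epsilon_{xx} + 2a^2(\epsilon + \epsilon^*)$$


On the periodic spatial domain 0 < x < L, $\epsilon$ has the
Fourier expansion


$$\epsilon(x,t) = \sum_{n=-\infty}^{\infty}\epsilon_n(t)e^{i\mu_nx}$$
where
$$\mu_n = \frac{2\pi n}{L} \qquad [30]$$
Assuming a solution of the form
$$\begin{pmatrix}\epsilon_n \\ \epsilon^*_{-n}\end{pmatrix} = e^{i\sigma_nt}\begin{pmatrix}\alpha \\ \beta\end{pmatrix}$$
one finds that $\sigma_n$ satisfies
$$\sigma_n^2 = \mu_n^2(\mu_n^2 - 4a^2) \qquad [31]$$


It then turns out that when $n < aL/\pi$ the system is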
unstable. Note that there are only a finite number of 
unstable modes (i.e., for fixed a, L, sufficiently high 
mode numbers n will not satisfy the above inequal- 
ity). In the context of water waves, this corresponds 
to the famous experimental and theoretical result by 
Benjamin and Feir that the Stokes water wave is
unstable. Later, Benney and Roskes (1969) showed 
that all periodic wave solutions of the generalized 
nonlocal NLS equation resulting from water waves 
in (2 + 1)-dimensions are unstable. Also, in (2 + 1)- 
dimensions soliton solutions are unstable to weak 
transverse modulations. 
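
The finite band of unstable modes implied by [31] is easy to tabulate. The following sketch lists the mode numbers with $\sigma_n^2 < 0$ together with their growth rates $\sqrt{-\sigma_n^2}$; the amplitude and domain length in the sample call are hypothetical values chosen for illustration, and a is assumed real and positive.

```python
import math

def unstable_modes(a, L):
    """Return [(n, growth_rate)] for n >= 1 with sigma_n^2 = mu_n^2 (mu_n^2 - 4 a^2) < 0."""
    modes = []
    n = 1
    while True:
        mu = 2.0 * math.pi * n / L
        sigma_sq = mu**2 * (mu**2 - 4.0 * a**2)
        if sigma_sq >= 0.0:
            # sigma_n^2 stays non-negative for all larger n, so the search can stop.
            break
        modes.append((n, math.sqrt(-sigma_sq)))
        n += 1
    return modes

sample_modes = unstable_modes(1.0, 10.0)  # a = 1, L = 10, so a*L/pi is about 3.18
mode_numbers = [n for n, _ in sample_modes]
```

With these sample values only n = 1, 2, 3 are unstable, and every growth rate is bounded by $2a^2$, the maximum of $\sqrt{\mu^2(4a^2 - \mu^2)}$ over $\mu$.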
Wave Collapse


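The equation
$$i\psi_t + \Delta\psi + |\psi|^2\psi = 0, \qquad \mathbf{x} = (x,y) \in \mathbb{R}^2 \qquad [32]$$


has the following conserved quantities:


$$P = \int|\psi|^2\,d\mathbf{x}$$


$$\mathbf{M} = \int\left(\psi\nabla\psi^* - \psi^*\nabla\psi\right)d\mathbf{x}$$


$$H = \int\left(|\nabla\psi|^2 - \frac{1}{2}|\psi|^4\right)d\mathbf{x}$$


that is, mass (power), momentum, and energy
(Hamiltonian) are conserved. Remarkably, Talanov
(1965) showed that solutions of eqn [32] satisfy the following
equation:


$$\frac{d^2V}{dt^2} = 8H \qquad [33]$$


where
$$V = \int(x^2 + y^2)|\psi|^2\,dx\,dy$$


Equation [33] is also known as the "virial" theorem.
Hence, it follows that


$$V = 4Ht^2 + c_1t + c_0$$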


and if H < 0 initially, then a singularity in eqn [32]
results since V must be positive. Actually, one can
further show (see, e.g., C Sulem and P L Sulem
(1999), and references therein) that there exists a
time $t^*$ such that
$$\int|\nabla\psi|^2\,d\mathbf{x}$$


becomes infinite as $t \to t^*$, which in turn implies
that $\psi$ also becomes infinite as $t \to t^*$ (blowup in
finite time).
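
To see the virial argument at work, consider the real Gaussian initial condition $\psi(x,y,0) = A\exp(-(x^2+y^2)/2)$, which is an assumption introduced here purely for illustration. For this datum the integrals can be done in closed form: $\int|\nabla\psi|^2 = \pi A^2$, $\int|\psi|^4 = \pi A^4/2$, and $V(0) = \pi A^2$, while $c_1 = 0$ because the datum is real. The sketch below evaluates H and, when H < 0, the time at which $V = 4Ht^2 + V(0)$ would reach zero, which bounds the blowup time from above.

```python
import math

def gaussian_collapse_bound(amplitude):
    """Return (H, V0, t_bound) for psi0 = A*exp(-(x^2+y^2)/2) in eqn [32].

    t_bound is the zero of V(t) = 4*H*t^2 + V0 when H < 0, else None.
    """
    a2 = amplitude**2
    grad_sq = math.pi * a2            # integral of |grad psi|^2
    quartic = math.pi * a2**2 / 2.0   # integral of |psi|^4
    hamiltonian = grad_sq - 0.5 * quartic
    v0 = math.pi * a2                 # integral of (x^2 + y^2)|psi|^2
    if hamiltonian >= 0.0:
        return hamiltonian, v0, None  # the virial argument gives no information
    return hamiltonian, v0, math.sqrt(v0 / (-4.0 * hamiltonian))

H_small, _, t_small = gaussian_collapse_bound(1.0)
H_large, _, t_large = gaussian_collapse_bound(3.0)
```

For this family H is negative exactly when $A^2 > 4$. With A = 3 the bound is $t = \sqrt{1/5}$; with A = 1 the Hamiltonian is positive and the argument is inconclusive.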

Note also that for the more general equation 


$$i\psi_t + \Delta_d\psi + |\psi|^{2\sigma}\psi = 0, \qquad \mathbf{x} \in \mathbb{R}^d$$


where $\Delta_d$ is the d-dimensional Laplacian, one has
the following types of solutions:


• Supercritical ($\sigma d > 2$): the solution blows up.

• Critical ($\sigma d = 2$): blowup can occur or global
solution can exist.

• Subcritical ($\sigma d < 2$): global solutions exist.


Vector NLS Systems 


In many applications vector NLS (VNLS) systems are 
the key governing equations. Physically, the VNLS 


arise under conditions similar to those described by 
NLS with the additional proviso that there are 
multiple wave trains moving nearly with the same 
group velocities (Roskes 1976). Importantly, VNLS 
also models systems where the field has more than 
one component. For example, in optical fibers and 
waveguides, the propagating electric field has two 
components transverse to the direction of propaga- 
tion. The nondimensional system 


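$$iq^{(1)}_z = q^{(1)}_{xx} + 2\left(|q^{(1)}|^2 + |q^{(2)}|^2\right)q^{(1)} \qquad [34a]$$


$$iq^{(2)}_z = q^{(2)}_{xx} + 2\left(|q^{(1)}|^2 + |q^{(2)}|^2\right)q^{(2)} \qquad [34b]$$
is an asymptotic model which governs the propagation
of the electric field in a waveguide, where z is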
the normalized distance along the waveguide and x 
a transversal spatial coordinate. It was first exam- 
ined by Manakov (1974) (see also Anastassiou et al. 
(1999) and Soljačić et al. (2003)). Subsequently, this 
system was derived as a key model for light-wave 
propagation in optical fibers. More precisely, in 
optical fibers with constant birefringence 
(i.e., constant phase and group velocities as a 
function of distance) Menyuk (1987) has shown 
that the two polarization components of the 
electromagnetic field $\mathcal{E} = (u,v)^T$ which are orthogonal
to the direction of propagation, z, along the fiber
asymptotically satisfy the following nondimensional 
equations (assuming anomalous dispersion): 


$$i(u_z + \delta u_t) + \tfrac{1}{2}u_{tt} + \left(|u|^2 + \alpha|v|^2\right)u = 0 \qquad [35a]$$


$$i(v_z - \delta v_t) + \tfrac{1}{2}v_{tt} + \left(\alpha|u|^2 + |v|^2\right)v = 0 \qquad [35b]$$


where $\delta$ represents the group velocity "mismatch"
between the u, v components of the electromagnetic
field, $\alpha$ is a constant that depends on the polarization
properties of the fiber, z the distance along the fiber, and
t a retarded temporal frame. In deriving eqn [35], it is
assumed that the electromagnetic field is slowly varying 
(as in the scalar problem); certain nonlinear (four-wave 
mixing) terms are neglected in the derivation of eqn 
[35], because the light wave is rapidly varying due to 
large, but constant, linear birefringence. In this context, 
birefringence means that the phase and group velocities 
of the electromagnetic wave in each polarization 
component are different. In a communications environ- 
ment, due to the distances involved (hundreds to 
thousands of kilometers), the polarization properties 
evolve rapidly and randomly as the light wave evolves 
along the propagation distance, z. Not only does the 
birefringence evolve, but it does so randomly, and on a 
scale much faster than the distances required for
communication transmission (birefringence polarization
changes on a scale of 10-100 m). In this case, the
relevant nonlinear equation is eqn [35] above, but with
$\delta = 0$ and $\alpha = 1$. Indeed, this is the integrable VNLS
equation first derived by Manakov (1974).

It should be remarked that the VNLS equation 
[34] and its generalization to an arbitrary number of 
components, 


$$i\mathbf{q}_t = \mathbf{q}_{xx} + 2\|\mathbf{q}\|^2\mathbf{q} \qquad [36]$$


where q is an N-component vector and ||-|| is the 
Euclidean norm, are integrable by the IST. One has 
to suitably extend the analysis discussed earlier in 
this article (cf. e.g., Ablowitz et al. (2004)). 


Discrete NLS Systems 


Both the NLS and the VNLS equations discussed 
above admit integrable discretizations which, 
besides being used as the basis for constructing 
numerical schemes for the continuous counterparts, 
also have physical applications as discrete systems. 

A natural discretization of NLS [27] is the 
following: 


$$i\frac{d}{dt}q_n = \frac{1}{h^2}\left(q_{n+1} - 2q_n + q_{n-1}\right) \pm |q_n|^2\left(q_{n+1} + q_{n-1}\right) \qquad [37]$$


which is referred to as the integrable discrete NLS
(IDNLS). It is an $O(h^2)$ finite-difference approximation
of [27] which is integrable via the IST and has
soliton solutions on the infinite lattice (Ablowitz and
Ladik 1975, 1976). Note that if the nonlinear term in
[37] is changed to $2|q_n|^2q_n$, the equation, which is
often called the discrete NLS (DNLS) equation, is
apparently no longer integrable. It should be
remarked that the (apparently nonintegrable) DNLS
equation arises in many important physical contexts.
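
A small numerical experiment illustrates the structure of the IDNLS. A short calculation shows that, on a periodic lattice, the focusing (+) case of [37] conserves the sum of $\log(1 + h^2|q_n|^2)$. The sketch below integrates the lattice with a classical Runge-Kutta step and monitors that sum. The lattice size, step sizes, initial profile, and function names are all assumptions made for the example, and the integrator is a generic one, not a scheme taken from the cited papers.

```python
import math

def idnls_rhs(q, h):
    """dq_n/dt for the focusing (+) case of [37] on a periodic lattice."""
    n = len(q)
    rhs = []
    for j in range(n):
        left, right = q[j - 1], q[(j + 1) % n]
        laplacian = (right - 2.0 * q[j] + left) / h**2
        nonlinear = abs(q[j])**2 * (right + left)
        rhs.append(-1j * (laplacian + nonlinear))  # i q' = F  =>  q' = -i F
    return rhs

def rk4_step(q, h, dt):
    """One classical fourth-order Runge-Kutta step (not structure preserving)."""
    def axpy(base, slope, factor):
        return [b + factor * s for b, s in zip(base, slope)]
    k1 = idnls_rhs(q, h)
    k2 = idnls_rhs(axpy(q, k1, 0.5 * dt), h)
    k3 = idnls_rhs(axpy(q, k2, 0.5 * dt), h)
    k4 = idnls_rhs(axpy(q, k3, dt), h)
    return [qj + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for qj, a, b, c, d in zip(q, k1, k2, k3, k4)]

def log_invariant(q, h):
    """Sum of log(1 + h^2 |q_n|^2) over the lattice."""
    return sum(math.log(1.0 + h**2 * abs(v)**2) for v in q)

h, dt, sites = 0.5, 1e-3, 32
q = [complex(1.0 / math.cosh(h * (j - sites // 2))) for j in range(sites)]
before = log_invariant(q, h)
for _ in range(200):
    q = rk4_step(q, h, dt)
drift = abs(log_invariant(q, h) - before)
```

The drift stays at the level of the time-stepping error. Replacing the nonlinear term by $2|q_n|^2q_n$ (the DNLS variant) would instead conserve the plain sum of $|q_n|^2$, so this particular logarithmic quantity is tied to the nonlocal coupling in [37].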

Correspondingly, one can consider the discretiza- 
tion of VNLS given by the following system: 


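$$i\frac{d}{dt}\mathbf{q}_n = \frac{1}{h^2}\left(\mathbf{q}_{n+1} - 2\mathbf{q}_n + \mathbf{q}_{n-1}\right) + \|\mathbf{q}_n\|^2\left(\mathbf{q}_{n+1} + \mathbf{q}_{n-1}\right) \qquad [38]$$


where $\mathbf{q}_n$ is an N-component vector. Equation [38]
for $\mathbf{q}_n = \mathbf{q}(nh)$ in the limit $h \to 0$, $nh = x$ gives VNLS
[36]. The discrete vector NLS system [38] is also
integrable (Ablowitz et al. 1999, Tsuchida et al.
1999). The interested reader can find further details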
in Ablowitz et al. (2004). 


See also: Boundary-Value Problems for Integrable 
Equations; Dynamical Systems in Mathematical Physics: 
An Illustration from Water Waves; Evolution Equations: 
Linear and Nonlinear; Ginzburg-Landau Equation; 
Integrable Systems and Discrete Geometry; Integrable
Systems: Overview; Partial Differential Equations: Some
Examples; Riemann-Hilbert Methods in Integrable
Systems; Schrödinger Operators.
Introduction


The flow of a fluid, liquid or gas, is described by 
three conservation laws, the conserved physical 
quantities being the mass, the linear momentum, 
and the energy, and by constitutive equations. The 
constitutive equations are specific to each fluid, and 
link deformations to stresses. 
A fluid is said to be Newtonian if it satisfies the
simplest constitutive equation, which gives the stress
tensor $\sigma$ as a linear function of the rate of
deformation tensor $D = (1/2)(\nabla u + \nabla u^T)$, namely


$$\sigma = (\lambda\,\mathrm{tr}\,D - p)I + 2\eta D \qquad [1]$$


where u is the fluid velocity, p is the hydrostatic
pressure (p > 0), and $\lambda$ and $\eta$ are the Lamé viscosity
coefficients of the fluid, satisfying $\eta > 0$ and
$\lambda + 2\eta/3 > 0$. The superscript T designates the transpose
operation, the abbreviation "tr" the trace operator
of a tensor, and I the unit tensor. Water and glycerin
are examples of Newtonian liquids.
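
As a concrete reading of eqn [1], the stress tensor can be evaluated for simple shear, where the only nonzero velocity gradient is $\partial u_x/\partial y$. The sketch below is illustrative only: the function name, the shear rate, and the numerical values used for the viscosity coefficients and the pressure are assumptions chosen for the example, not data from the text.

```python
def newtonian_stress(grad_u, p, eta, lam):
    """Evaluate sigma = (lam * tr(D) - p) I + 2 eta D with D = (grad_u + grad_u^T) / 2.

    grad_u[i][j] holds the derivative of velocity component i along coordinate j.
    """
    dim = len(grad_u)
    D = [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(dim)]
         for i in range(dim)]
    trace = sum(D[i][i] for i in range(dim))
    return [[(lam * trace - p) * (1.0 if i == j else 0.0) + 2.0 * eta * D[i][j]
             for j in range(dim)]
            for i in range(dim)]

shear_rate = 2.0                      # du_x/dy in 1/s (hypothetical)
grad_u = [[0.0, shear_rate, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
# eta, lam, and p below are placeholder values, assumed for illustration.
sigma = newtonian_stress(grad_u, p=101325.0, eta=1.0e-3, lam=0.0)
```

In simple shear the trace of D vanishes, so the diagonal entries reduce to -p and the shear stress is $\eta$ times the shear rate: the stress grows linearly with the rate of deformation, which is exactly the property that non-Newtonian fluids fail to satisfy.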


Non-Newtonian fluids are fluids for which the 
behavior is not described by eqn [1]. Silicone oils, 
polymers (melted or in solution), egg yolks, and 
blood are examples of non-Newtonian liquids. 
Other examples include liquid crystals, rubbers, 
suspensions, paints, etc. 

In the following we shall first describe flows 
which show Newtonian or non-Newtonian 
behaviors. Then we shall describe the requirements 
a constitutive equation needs to satisfy to be 
considered, introducing the notions of continuum 
mechanics we need. After giving the most commonly 
used constitutive equations, we will give a few ideas 
about the mathematical study of the set of equa- 
tions, and their numerical study, in the particular 
case of viscoelastic fluids. 

Numerous kinds of materials are already known 
to exist, and more might exist in the future. This 
report, however, will be limited to the most
commonly used materials nowadays, which are
polymers, liquid crystals and polymeric liquid
crystals, and paints. Moreover, we shall only
consider isothermal flows, even though temperature 
might be an important parameter in experiments 
or in industry, because in particular most theoretical 
or numerical studies concern isothermal problems. 

Non-Newtonian fluids will always be liquids here, and
we shall use the terms liquid and fluid interchangeably.


Non-Newtonian Behaviors 


We describe a few experiments to show how 
differently both types of fluids, Newtonian or non- 
Newtonian, might react in some experimental 
situations. We also give some mechanical explana- 
tion when possible. 


Shear Thinning or Shear Thickening 


In a Poiseuille experiment, where a fluid flows in 
a tube under the action of a pressure drop, the 
volumetric flow rate of a Newtonian fluid is 
inversely proportional to the constant fluid viscosity. 
Under the same pressure-drop condition, a polymer 
melt flows much faster out of the tube, which means 
that there is a decreasing apparent viscosity with 
increasing shear rate: this is referred to as the shear
thinning effect. Other fluids might exhibit the
opposite behavior and flow out of the tube more 
slowly: this is called the shear thickening effect. 
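
The dependence of the flow rate on viscosity can be made concrete with a minimal sketch. It assumes the classical Hagen-Poiseuille formula for a Newtonian fluid and, as an illustrative stand-in for a shear thinning fluid, a pure power-law fluid whose shear stress is `k * shear_rate**n`. The function names and the numerical values are hypothetical and are not taken from the text.

```python
import math


def newtonian_flow_rate(dp, radius, length, eta):
    """Hagen-Poiseuille flow rate Q = pi R^4 dp / (8 eta L) in a circular tube."""
    return math.pi * radius**4 * dp / (8.0 * eta * length)


def power_law_flow_rate(dp, radius, length, k, n):
    """Flow rate of a power-law fluid (shear stress = k * shear_rate**n).

    Q = pi n / (3n + 1) * (dp R / (2 k L))**(1/n) * R^3,
    which reduces to the Hagen-Poiseuille formula when n = 1 and k = eta.
    """
    wall_scale = (dp * radius / (2.0 * k * length)) ** (1.0 / n)
    return math.pi * n / (3.0 * n + 1.0) * wall_scale * radius**3


# Illustrative values only: pressure drop (Pa), tube radius (m), tube length (m).
dp, radius, length = 1.0e4, 1.0e-2, 1.0
q_newtonian = newtonian_flow_rate(dp, radius, length, eta=1.0)
q_thinning = power_law_flow_rate(dp, radius, length, k=1.0, n=0.5)
print(q_newtonian, q_thinning)
```

With these particular values the power-law fluid with n < 1 flows faster than the Newtonian one under the same pressure drop; the comparison depends on the chosen consistency `k` and on the wall shear rate, so it should be read as an illustration rather than a general statement.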


Rod Climbing 


When a rotating rod is inserted in a beaker filled with 
a Newtonian fluid, it is observed that the liquid near 
the rotating rod is pushed outwards by centrifugal 
force and that a dip results on the surface of the liquid near
the rod. On the contrary, if we perform the same
experiment with a polymer, the fluid climbs along the
rod. Moreover, for comparable rotation speeds, the
difference in behavior can be quantitatively considerable.
This is explained by totally different
pressure distributions in the two fluids, Newtonian and
non-Newtonian: in particular, the pressure in the
polymer along the rod is much larger than that along
the beaker, so that this pressure difference counteracts the
centrifugal force; this is in contrast with the situation
in a Newtonian fluid.


Extrudate Swell 


If a fluid is forced to flow from a large reservoir out 
of a circular tube of small diameter, the swell at the 
exit is much larger for a polymer solution than for a 
Newtonian fluid. A polymer flowing out of a die
might also show a delayed die swell, which means
that the swell occurs not at the exit but on the jet at a
certain distance from the exit. The explanation of this
phenomenon is not unique: it is due partly to 
memory effects (the fluid remembers its former 
shape, the one in the reservoir), partly to the release 
of normal stresses, to interfacial forces, compressi- 
bility, viscous heating, and the complicated flow 
near the die exit. 


Difference in Normal Stresses 


In a shearing flow of a Newtonian fluid, the two 
normal stress differences are both zero, whereas for 
a polymer the first normal stress difference might be 
very large, the second one being nearly zero. These 
differences in stresses in shearing flow might be a 
partial answer to the extrudate swell and to rod 
climbing experienced by polymers. 


Presence of a Yield Stress 


Some materials, when subjected to shear stress, 
flow only after a critical value is attained. Such 
fluids are referred to as Bingham fluids: some 
cements, slurries, paints, and biological fluids 
might exhibit such a behavior. It is actually a 
well-known property of paints: if put in large 
quantities on a vertical wall, the paint will flow, 
whereas if put as a very thin film on the same wall, 
the paint will not flow, but stay in place, and dry to 
form a nice colored covering. 


Preferred Orientation of the Particles of Fluid 


Fluids with properties as above, Newtonian or 
non-Newtonian, are isotropic in nature, even though 
they are constituted of atoms, or of long chains of 
material. They are the same everywhere, optically, 
magnetically, or electrically. Some fluids, liquid
crystals or polymeric liquid crystals in particular,
have remarkable properties of anisotropy, being
able to orient themselves, on average, along a
particular direction: this is the nematic phase, which
is used in many devices (screens for clocks, hand
calculators, and cell phones), because the average
orientation may be changed by applying an electric
field. Other phases of liquid crystals include the smectic
A, C, and C* phases, where one sees a preferred
orientation (tilted for the C phases) of the fluid, and also
a layer-like structure. As an example, let us mention
discotic nematic liquid crystals, which are precursors
for carbon-based materials, such as fibers, composites,
and films, which possess excellent mechanical
and thermal properties. Sails for racing sailboats are
made of Kevlar, which is one of these new materials
with remarkable properties.


Modeling 


The flowing fluid will be described by its (Eulerian)
velocity at time $t$ and position $x$, say $u(x,t)$,
for $x$ belonging to the domain of the flow $\Omega$ and
the time $t$ to $\mathbb{R}_+$, by its mass density $\rho(x,t)$, its
pressure $p(x,t)$ ($p \ge 0$, defined up to an additive
constant), and its stress $\sigma(x,t)$, which is a
symmetric tensor.

The partial differential equations describing the 
flow are satisfied in the domain of the flow and read 
as follows: 


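$$\frac{\partial \rho}{\partial t} + \mathrm{div}(\rho u) = 0$$
$$\rho\left(\frac{\partial u}{\partial t} + (u \cdot \nabla)u\right) = \mathrm{div}\,\sigma + f \qquad [2]$$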


where f denotes some external forces applied to the 
fluid. These equations describe the conservation of 
mass and the conservation of linear momentum. To 
close the system, we need a constitutive equation for
the stress $\sigma$ as well as initial conditions and
boundary conditions.

Moreover, most non-Newtonian fluids are practically
incompressible in most regions of the flow, so
that we shall only consider this case: the first
equation in [2] is replaced by the condition $\mathrm{div}\,u = 0$ in
the domain of the flow.


Notions of Continuum Mechanics 


At time $t$, a body $S$ occupies a region $\Omega_t$ of the
Euclidean space $\mathcal{E}^3$, called the configuration at time $t$
of the body. Points $p$ of $S$ are called material points
or particles of fluid. The configuration $\Omega_t$
is assumed to be regular in the following sense: $\Omega_t$
is closed, its interior is connected and dense
everywhere, and its boundary is piecewise regular, $C^1$ at
least.

A mapping $\Phi: \Omega_0 \to \Omega_t$ is a deformation if $\Phi$ is a
bijection from $\Omega_0$ onto $\Omega_t$ and is a $C^1$-diffeomorphism
from the interior of $\Omega_0$ onto the interior of $\Omega_t$,
with positive Jacobian.

The motion of a body $S$ is given by a set of
deformations $\Pi(t,t'): \Omega_{t'} \to \Omega_t$, satisfying


$$\Pi(t,t) = \mathrm{Id}, \qquad \Pi(t,t'') = \Pi(t,t') \circ \Pi(t',t'')$$


The trajectory of the material point which is at $X$ at
time $t_0$ is the set


$$\{\Pi(t,t_0)(X),\ t \ge t_0\}$$


A body is said to be rigid if the deformation $\Pi(t,t')$
is an isometry for all times $t$ and $t'$. A material point
$p$ is said to be attached to the rigid body $S$ if the
body $\{p\} \cup S$ is rigid.

The motion of a fluid might be described in terms
of the Lagrangian coordinates $X \in \Omega_0$ of each
particle of fluid: $\Omega_0$ is called the reference configuration
and is the fixed configuration occupied by
the body of fluid at the time of reference, say $t_0$. The
motion of the fluid might also be described in terms
of the Eulerian coordinates $x = \chi(X,t)$, which
represent the position at time $t$ of the particle which
has position $X$ at $t_0$. The Lagrangian and Eulerian
coordinates of the same particle of fluid are linked
by the differential equation


$$\frac{\partial \chi}{\partial t}(X,t) = u(\chi(X,t),t) \quad \text{for } t > t_0, \qquad \chi(X,t_0) = X$$
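
The link between Lagrangian and Eulerian coordinates can be illustrated by integrating this differential equation numerically. The sketch below is only an illustration under stated assumptions: the velocity field is a steady simple shear chosen for convenience, the time stepping is a classical fourth-order Runge-Kutta scheme, and the names `shear_velocity` and `trajectory` are hypothetical.

```python
import numpy as np


def shear_velocity(x, t, rate=2.0):
    """Assumed Eulerian velocity field: steady simple shear u = (rate * y, 0)."""
    return np.array([rate * x[1], 0.0])


def trajectory(X, t0, t1, velocity, steps=1000):
    """Integrate dx/dt = u(x, t) with x(t0) = X using classical RK4.

    Returns the Eulerian position at time t1 of the particle whose
    Lagrangian coordinate (position at time t0) is X.
    """
    x = np.array(X, dtype=float)
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x


# For simple shear the exact answer is (X + rate * Y * (t1 - t0), Y).
print(trajectory([1.0, 0.5], 0.0, 3.0, shear_velocity))
```

For this particular field the particle paths are straight lines parallel to the first axis, which gives a simple check of the integrator.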


For defining the constitutive equations, we shall
use a few tensors that we define now. The deformation
gradient is defined by $F(X,t) = \partial\chi(X,t)/\partial X$,
and the right Cauchy-Green tensor by
$C = F^T F$ (also called Cauchy strain). To define
relative tensors, we denote by $y = \chi_t(x,s)$ the
position at time $s \le t$ of the material point which
is at $x$ at time $t$. The relative tensors are defined in
the following way:


• the relative deformation gradient $F_t(s) = \nabla_x \chi_t(x,s)$,

• the relative right Cauchy-Green tensor $C_t(s) = F_t^T(s) F_t(s)$, and

• the relative Finger tensor $C_t(s)^{-1}$.


Note that the rate of deformation tensor is obtained
as the time derivative of the relative Cauchy strain
tensor:


$$D = \frac{1}{2} \left.\frac{\partial C_t(s)}{\partial s}\right|_{s=t}$$
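
This identity can be checked numerically on a flow where everything is explicit. The following sketch assumes a steady simple shear $u = (\dot\gamma y, 0)$, for which the relative deformation gradient is known in closed form, and approximates the derivative in $s$ by a central difference (the closed-form expression is smooth in $s$, so evaluating slightly beyond $s = t$ is harmless here). The constant `RATE` and the function names are hypothetical.

```python
import numpy as np

RATE = 1.5  # assumed shear rate of the simple shear flow u = (RATE * y, 0)


def relative_deformation_gradient(s, t):
    """F_t(s) for simple shear: the particle at x at time t was at
    x + RATE * (s - t) * y * e_x at time s."""
    return np.array([[1.0, RATE * (s - t)], [0.0, 1.0]])


def relative_cauchy_green(s, t):
    """C_t(s) = F_t(s)^T F_t(s)."""
    F = relative_deformation_gradient(s, t)
    return F.T @ F


def rate_of_deformation_from_strain(t, h=1e-6):
    """D = (1/2) dC_t(s)/ds at s = t, by central finite difference."""
    dC = (relative_cauchy_green(t + h, t) - relative_cauchy_green(t - h, t)) / (2.0 * h)
    return 0.5 * dC


# Direct definition D = (grad u + grad u^T) / 2 for comparison.
grad_u = np.array([[0.0, RATE], [0.0, 0.0]])
D_direct = 0.5 * (grad_u + grad_u.T)
print(rate_of_deformation_from_strain(t=2.0))
print(D_direct)
```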


Principle of Objectivity and Frame Invariance 


A frame of reference is defined in the spacetime
$\mathcal{E}^3 \times \mathbb{R}$ attached to the observer by giving a
chronology and a system of reference. The chronology
is a timescale, which will be assumed to be
the same for all observers. The system of reference
is a set of at least four points, not coplanar, attached
to a rigid body (this is the observer).

The constitutive equation needs to satisfy the
principle of frame invariance and of frame indifference
(or objectivity), which means that the equation
does not depend on rigid motions of the observer. In
the mathematical framework, it means that the
equation has to be invariant under a change of
orthonormal frame of reference $x^* = Q(t)x$, where
$Q(t)$ is an orthogonal tensor: the transformed
equation has to have the same expression, and also
to be frame indifferent. We define a scalar quantity
$\varphi$, a vector field $u$, or a tensor field $\tau$ as being frame
indifferent if, under the change of variables
$x^* = Q(t)x$, they satisfy the relations $\varphi(x,t) = \varphi^*(x^*,t)$,
$u(x,t) = Q(t)^T u^*(x^*,t)$, and $\tau(x,t) = Q(t)^T \tau^*(x^*,t) Q(t)$,
respectively.

The velocity gradient $\nabla u$ is not frame indifferent,
but its symmetric part is. The vorticity, which is the
antisymmetric part $W = (\nabla u - \nabla u^T)/2$ of the velocity
gradient, satisfies the equation $W = Q^T W^* Q - Q^T \dot{Q}$,
where the dot denotes the convective derivative
$d/dt = \partial/\partial t + (u \cdot \nabla)$.

Note that the convective derivative of a
scalar function $\varphi$ is frame indifferent, which
means that


$$\frac{d\varphi}{dt}(x,t) = \frac{d\varphi^*}{dt}(x^*,t)$$


but the convective derivative of a vector or a tensor
is not frame indifferent.
It can be easily checked that the derivative


$$\frac{D_0 \tau}{Dt} = \frac{d\tau}{dt} + \tau W - W\tau \qquad [3]$$


of a (frame-indifferent) tensor $\tau$ is frame indifferent,
which means that


$$\frac{D_0 \tau}{Dt}(x,t) = Q(t)^T\, \frac{D_0 \tau^*}{Dt}(x^*,t)\, Q(t)$$


To obtain another frame-indifferent derivative of a
tensor $\tau$, we need to start with the expression [3], to
which we may add other terms containing frame-indifferent
quantities, for example, combinations of
$\tau$ and $D$. A derivative which is often considered is
the Oldroyd derivative, as introduced by Oldroyd in
1958:
$$\frac{D_a \tau}{Dt} = \frac{d\tau}{dt} + \tau W - W\tau - a(D\tau + \tau D) \qquad [4]$$
where $a$ is a real parameter, chosen in the interval
$[-1, 1]$. (This restriction on $a$ is necessary for
viscometric reasons, and is obtained when simple
flows, such as Couette or Poiseuille flows, are
studied.)

The case $a = 1$ corresponds to the upper convected
derivative, and the case $a = -1$ to the lower
convected derivative. The case $a = 0$ refers to the
corotational or Jaumann derivative. Derivatives
corresponding to cases $a = -1$, $0$, or $1$ might
actually be obtained by differentiating $\tau$ in a frame
fixed locally to the body of fluid, and which rotates
and/or deforms with the body. Moreover, we shall
see that the derivatives corresponding to $a = 1$ or $-1$
have very simple integral expressions.
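
The one-parameter family [4] is easy to evaluate pointwise once the velocity gradient is known. The sketch below is a minimal illustration, assuming the index convention `grad_u[i, j]` = $\partial u_i/\partial x_j$ (an assumption, since the text does not fix one) and taking the material derivative `dtau_dt` as an input. The function names are hypothetical.

```python
import numpy as np


def split_gradient(grad_u):
    """Return the symmetric part D and the antisymmetric part W of grad u."""
    D = 0.5 * (grad_u + grad_u.T)
    W = 0.5 * (grad_u - grad_u.T)
    return D, W


def oldroyd_derivative(dtau_dt, tau, grad_u, a):
    """Oldroyd derivative: d tau/dt + tau W - W tau - a (D tau + tau D).

    a = 1 gives the upper convected derivative, a = -1 the lower convected
    derivative, and a = 0 the corotational (Jaumann) derivative.
    """
    D, W = split_gradient(grad_u)
    return dtau_dt + tau @ W - W @ tau - a * (D @ tau + tau @ D)


# Example: steady simple shear with a given symmetric extra stress.
grad_u = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
tau = np.array([[2.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
print(oldroyd_derivative(np.zeros((3, 3)), tau, grad_u, a=1.0))
```

With this convention, the case a = 1 coincides with the familiar form $d\tau/dt - (\nabla u)\tau - \tau(\nabla u)^T$, and the case a = -1 with $d\tau/dt + (\nabla u)^T\tau + \tau(\nabla u)$.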


Constitutive Equations 


The constitutive equation of a non-Newtonian fluid 
is a nonlinear relationship between the stress tensor 
and objective variables depending on the flow, such 
as the pressure, the rate of deformation, frame- 
indifferent derivatives of such quantities, etc. 

Analogously to the constitutive equation for an
incompressible Newtonian fluid, we may also write
the stress tensor in the form $\sigma = -pI + \tau$. The extra
stress tensor $\tau$ could be either a function of objective
variables, which characterize the flow, or defined by
a differential equation or by an integral equation.
The point here is to model the fact that the fluid 
might have some elasticity or some memory, or 
might experience, for example, yield stress or 
orientational properties. 


Shear dependent viscosity fluids A very simple
generalization of the incompressible Newtonian
fluid consists in making the viscosity dependent on
the rate of deformation tensor, $\eta = \eta(D)$. This
generalization was introduced by O A Ladyzhenskaya
in 1970 and, if the function is chosen
properly, this model reproduces the behavior of
existing fluids, at least in certain parts of their flow.
For power-law fluids, the viscosity depends on the
second invariant $II_D = (1/2)\,\mathrm{tr}\,D^2$ of the symmetric
tensor $D$ (the first invariant $\mathrm{tr}\,D$ is zero because of
incompressibility), and reads as


$$\eta(D) = \eta_0 + m\, II_D^{(n-1)/2} \qquad [5]$$


where $\eta_0 > 0$, $m > 0$, and $n > 0$. If $n = 1$, we recover
the Newtonian case, whereas for $n < 1$ this equation
describes a shear thinning fluid, and for $n > 1$ a shear
thickening fluid. The power law is not valid for $II_D$
close to 0, so that the Carreau-Yasuda law is
preferred:


$$\frac{\eta - \eta_\infty}{\eta_0 - \eta_\infty} = \left(1 + (\lambda\, II_D)^a\right)^{(n-1)/(2a)} \qquad [6]$$


where $\eta_0$ is the zero-shear rate viscosity, $\eta_\infty$ is the
infinite-shear rate viscosity, $\lambda$ a time constant, $n$ a
dimensionless power-law index, $n > 0$, and $a > 0$ a
parameter (generally equal to 1 for a monomolecular
polymer).
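
To give a feeling for the shape of such a viscosity law, here is a minimal sketch of the Carreau-Yasuda model. It is written in the form commonly used in rheology, as a function of a scalar shear rate $\dot\gamma$, namely $\eta = \eta_\infty + (\eta_0 - \eta_\infty)(1 + (\lambda\dot\gamma)^a)^{(n-1)/a}$; this shear-rate form is an assumption made for the illustration and is not a transcription of the invariant form used in the text. The function name and the parameter values are hypothetical.

```python
def carreau_yasuda(shear_rate, eta0, eta_inf, lam, n, a):
    """Carreau-Yasuda viscosity as a function of the scalar shear rate.

    eta0    : zero-shear rate viscosity
    eta_inf : infinite-shear rate viscosity
    lam     : time constant
    n       : dimensionless power-law index (n < 1 gives shear thinning)
    a       : Yasuda parameter (a = 2 gives the Carreau model)
    """
    return eta_inf + (eta0 - eta_inf) * (1.0 + (lam * shear_rate) ** a) ** ((n - 1.0) / a)


# Illustrative parameters only.
for rate in (0.0, 0.1, 1.0, 10.0, 100.0):
    print(rate, carreau_yasuda(rate, eta0=100.0, eta_inf=1.0, lam=2.0, n=0.4, a=2.0))
```

Unlike the pure power law, this expression stays finite at vanishing shear rate, where it returns the zero-shear rate viscosity, and it tends to the infinite-shear rate viscosity for very large shear rates.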


Oldroyd models and related models Oldroyd models
are differential models built with one of the
Oldroyd derivatives, and are very commonly used
for polymer solutions or melts. The stress tensor is
given as a solution of a differential equation in the
following way:


$$\tau + \lambda_1 \frac{D_a \tau}{Dt} + g(\tau, D) = 2\eta\left(D + \lambda_2 \frac{D_a D}{Dt}\right) \qquad [7]$$


where $\lambda_1 > 0$ is a relaxation time, $\lambda_2$ is a retardation
time, $0 \le \lambda_2 \le \lambda_1$, and $g(\tau, D)$ is a tensor-valued
function, subject to certain restrictions due to
objectivity, and which is at least quadratic.

The Johnson-Segalman model has $g = 0$, and
$-1 \le a \le 1$. Other models of differential type often
suppose the parameter $a$ to be 1, because it has
been noticed that with $a$ close to 1 the model is able
to reproduce some experimental behavior, whereas
for $a = -1$ or close to $-1$, the model does not work
at all. Among the models with $a = 1$, the following
ones are fairly popular: the model of Phan-Thien and
Tanner has $g(\tau, D) = \alpha\,\tau\,\mathrm{tr}\,\tau$, where $\alpha$ is a constant;
this model can be generalized by defining
$g(\tau, D) = \alpha\tau^2 + \beta\tau$, $\alpha$ and $\beta$ being functions of the trace of $\tau$
and of its determinant; the model of Giesekus is the
particular case where $\alpha$ is a constant and $\beta = 0$. The
Oldroyd eight-constant model is given by


$$g(\tau, D) = \mu_0 (\mathrm{tr}\,\tau) D + \nu_1\,\mathrm{tr}(\tau D)\, I + \mu_2 D^2 + \nu_2\,\mathrm{tr}(D^2)\, I$$


where $\mu_0$, $\nu_1$, $\mu_2$, and $\nu_2$ are constants.

In [7], the limit case $\lambda_2 = 0$ corresponds to
Maxwell's type models, where there is no Newtonian
viscosity, while the case $\lambda_2 > 0$ corresponds
to the Jeffreys’ type models. The cases where $a = 1$
and $g = 0$ are often considered in mathematical or
numerical studies: this is the upper convected
Maxwell (UCM) model for $\lambda_2 = 0$, and the Oldroyd
B model for $\lambda_2 > 0$.

The parameters $\lambda_1$, $\lambda_2$, and $\eta$ might also depend
on $II_D$: such a model where the upper convected
derivative ($a = 1$) is chosen is referred to as the
White-Metzner model, and reads as follows:


DT D,D 
T T Nap = 2(mD + (D+ 22-2) 
where 7), is also the Newtonian viscosity. 


Integral equations Other constitutive equations for
viscoelastic fluids include integral equations. Actually,
some differential equations have integral
counterparts: this is the case for the differential
equations associated with the upper or lower
convected frame-indifferent derivatives. For the
upper convected derivative ($a = 1$), the extra stress
is given by the integral expression


$$\tau(x,t) = 2\eta \frac{\lambda_2}{\lambda_1} D(x,t) + 2\eta \frac{\lambda_1 - \lambda_2}{\lambda_1^2} \int_{-\infty}^{t} e^{-(t-s)/\lambda_1}\, (\nabla_X x)\, D(X,s)\, (\nabla_X x)^T\, ds$$


where $X$ is the position, at time $s$, of the point which
is at $x$ at time $t$. A similar expression might be
obtained for the lower convected derivative.
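
As a worked illustration of such an integral expression, the sketch below evaluates the upper convected Maxwell case ($\lambda_2 = 0$, relaxation time written `lam`) in steady simple shear $u = (\dot\gamma y, 0)$, where the deformation history is explicit: $\nabla_X x$ is the matrix with unit diagonal and off-diagonal entry $\dot\gamma (t - s)$. The memory integral is computed with Gauss-Laguerre quadrature from NumPy. The flow, the function name, and the parameter values are assumptions made for this example.

```python
import numpy as np


def ucm_steady_shear_stress(eta, lam, rate, order=8):
    """Extra stress of the UCM model in steady simple shear, from the
    memory integral (2 eta / lam) * int_0^inf exp(-r/lam) F D F^T dr,
    where r = t - s is the elapsed time and F = grad_X x."""
    nodes, weights = np.polynomial.laguerre.laggauss(order)
    D = 0.5 * rate * np.array([[0.0, 1.0], [1.0, 0.0]])
    acc = np.zeros((2, 2))
    for xi, wi in zip(nodes, weights):
        r = lam * xi  # substitution r = lam * xi maps the weight to exp(-xi)
        F = np.array([[1.0, rate * r], [0.0, 1.0]])
        acc += wi * (F @ D @ F.T)
    # The Jacobian lam of the substitution cancels one factor of 1 / lam.
    return 2.0 * eta * acc


tau = ucm_steady_shear_stress(eta=2.0, lam=0.5, rate=3.0)
print(tau)
print("first normal stress difference:", tau[0, 0] - tau[1, 1])
```

The computed shear stress equals $\eta\dot\gamma$ and the first normal stress difference equals $2\eta\lambda\dot\gamma^2$, while the normal stress in the gradient direction vanishes; this is consistent with the large first normal stress difference described earlier for polymers in shearing flow.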

A very common integral equation is the K-BKZ
equation (introduced independently by Kaye and by
Bernstein, Kearsley, and Zapas in 1962-63). In a
simplified form, the extra-stress tensor is given as
the integral of a combination of the relative Cauchy
strain tensor $C_t$ and its inverse:


$$\tau(x,t) = \int_{-\infty}^{t} G(t-s) \left( \frac{\partial W}{\partial I_1}(I_1, I_2)\, C_t^{-1}(s) - \frac{\partial W}{\partial I_2}(I_1, I_2)\, C_t(s) \right) ds$$


where $I_1 = \mathrm{tr}\, C_t^{-1}(s)$ and $I_2 = \mathrm{tr}\, C_t(s)$. The function $G$
is a given kernel, and $W$ a given scalar potential.
The upper convected Maxwell model is obtained
from the K-BKZ model by setting $W(I_1, I_2) = I_1$ and
$G(s) = (\eta/\lambda_1^2)\, e^{-s/\lambda_1}$.


Models issued from kinetic theories or micro-macro 
models Polymeric fluids could also be modeled by 
coupling a macroscopic viewpoint — the one of 
continuum mechanics, as described above - and 
a microscopic viewpoint. A polymer is, in general, 
made of long chains of molecules. Rather than trying to 
represent the polymer behavior by a sophisticated 
constitutive equation, one describes the mean behavior 
of the molecules by using their microscopic description. 

To take an example, we consider a dilute solution
of polymer, where each chain of polymer is modeled
as a collection of dumbbells, each of them consisting
of two beads connected by a spring. The configuration
of the spring, namely its length and orientation,
is described by a random vector field $Q \in \mathbb{R}^3$. The
dumbbells are convected and stretched by the flow.


The probability $\psi(x, Q, t)\, dQ$ of finding a dumbbell
with a configuration $Q$ at $(x,t)$ is governed by a
Fokker-Planck equation:


$$\frac{d\psi}{dt} + \mathrm{div}_Q((\nabla u) Q \psi) = \frac{2}{\zeta}\, \mathrm{div}_Q((\nabla_Q W) \psi) + \frac{2kT}{\zeta}\, \Delta_Q \psi$$


where $\zeta$ is the friction coefficient of the dumbbell
beads, $T$ the temperature, $k$ the Boltzmann constant,
and $W$ the spring potential. The extra stress is given
by the constitutive equation


$$\tau = \int (\nabla_Q W \otimes Q)\, \psi(x, Q, t)\, dQ$$


The simplest potential is the linear one (also called
Hookean potential) $W(Q) = H|Q|^2$, where $|Q|$ is
the length of $Q$, and $H$ the elasticity constant.
In fact, in the case of the Hookean potential, this
set of equations is equivalent to the Oldroyd B
model. Another potential corresponds to the finitely
extendable nonlinear elastic (FENE) chain of
dumbbells,

$$W(Q) = -H Q_0^2 \ln\left(1 - \frac{|Q|^2}{Q_0^2}\right)$$

for $|Q| \le Q_0$, and gives the FENE model, for which
there is no macroscopic constitutive equation known.

We have only made here a short incursion into these
micro-macro models: research is in progress, both
analytical and numerical (Öttinger 1996, Suen et al.
2002, Keunings 2004).


Liquid crystals and polymeric liquid crystals As an 
example, we present the constitutive equations for a 
uniaxial nematic liquid crystal. 

In the theory of Leslie and Ericksen, established in
the 1960s and the 1970s, the stress tensor is given as
a function of the orientation unit vector $n$, through
the Oseen-Frank elastic energy,


$$2W(n, \nabla n) = \kappa_1 (\mathrm{div}\, n)^2 + \kappa_2 (n \cdot \mathrm{curl}\, n)^2 + \kappa_3 |n \times \mathrm{curl}\, n|^2$$

where $\kappa_1 > 0$, $\kappa_2 > 0$, and $\kappa_3 > 0$ are the elastic constants of the three basic
modes (splay, twist, and bend, respectively). The extra
stress tensor is precisely given by the relation


$$\tau = -(\nabla n)^T \frac{\partial W}{\partial (\nabla n)} + \alpha_1 (n \cdot Dn)\, n \otimes n + \alpha_2 N \otimes n + \alpha_3 n \otimes N + \alpha_4 D + \alpha_5 Dn \otimes n + \alpha_6 n \otimes Dn$$

where $N = \dot{n} - Wn$ is the corotational derivative of
the director, and $\alpha_i$, $i = 1, \ldots, 6$, the six Leslie
viscosity coefficients.

The director satisfies a differential equation 
derived from continuum mechanics, 


$$\rho_1 \ddot{n} = G + g + \mathrm{div}\, \pi$$


where $\rho_1$ is the moment of inertia per unit volume,
$G$ the external director body force (torque per unit
volume), $\pi$ the director stress tensor, and $g$ the
intrinsic director body force. Precisely,


o 
g = àn — (Vn) b — T — yN — y; Dn 


OW 


SE c" 


where $\beta$ is a Lagrange multiplier vector, and
$\lambda = -\gamma_2/\gamma_1$ is the reactive parameter, with $\gamma_1 = \alpha_3 - \alpha_2$
the rotational viscosity, and $\gamma_2 = \alpha_6 - \alpha_5 = \alpha_3 + \alpha_2$
the irrotational torque coefficient.

Polymeric liquid crystals might have other variables 
entering in the modeling, such as order parameters, 
order tensors, etc. 

Because of the complexity of modeling, most 
studies concern either very simple flows, such as 
Couette or Poiseuille flows, or steady flows, or 
flows for which the coefficients satisfy specific 
relationships. 

Reports about earlier studies, theoretical as well
as numerical, can be found in Coron et al. (1991),
and references therein. The study of polymeric liquid
crystals, or of the smectic phase of liquid crystals, is
at a very early stage; the reader may consult
specialized journals, such as the Journal of Non-Newtonian
Fluid Mechanics, or see Liquid Crystals.


Yield stress fluids Bingham materials have the
property of flowing only when the stress magnitude
is greater than a critical value, and being a solid
otherwise. Precisely, in the simplest and the most
widely used model, the Bingham model, the extra
stress tensor $\tau$ is given by the relations


$$\tau = 2\eta D + \tau_s \frac{D}{|D|} \quad \text{if } II_D \ne 0$$
$$|\tau| \le \tau_s \quad \text{if } II_D = 0$$

where $\tau_s > 0$ is the yield limit. The Bingham model
is generalized by taking the viscosity $\eta$ to be a
function of the shear stress: $\eta$ is given by the
relation
for the Casson law, and by the power law [5] for the
Herschel-Bulkley model. 

The mathematical study was started by Duvaut
and Lions (1976), and has regained interest recently
(Malek and Rajagopal 2005), especially in connection
with other recent studies of polymeric liquids.


Theoretical and Numerical Problems 
for Viscoelastic Flows 


The mathematical study of viscoelastic fluid flows
amounts to studying systems of partial differential
equations, which all include either the incompressible
Euler equation or the incompressible Navier-Stokes
equation as particular cases. In particular, it
means that the results obtained from such a study
are similar to the ones obtained for the Euler or Navier-Stokes
equations and, because of the complexity of
the system, the results are expected to be qualitatively
at best as good as, and more often weaker than,
those for these equations. For example, the existence of
weak three-dimensional solutions to the Navier-Stokes
system is known, while for non-Newtonian
flows, this result is true only in very specific
cases. Moreover, when a result is not known for
the Navier-Stokes problem, such as the uniqueness
of solutions for all data in a three-dimensional
problem, there is no hope that something similar could
be proved for non-Newtonian fluid flows.

As an example, we consider the case of Johnson-Segalman
fluids, which are described by constitutive
equation [7] with $g = 0$. Recall that the limit case
$\lambda_2 = 0$ corresponds to the purely elastic case, and
$\lambda_2 = \lambda_1$ to the purely Newtonian case. Equation [7]
is coupled with the equations of motion:


$$\rho \frac{du}{dt} + \nabla p = \mathrm{div}\, \tau + f$$
$$\mathrm{div}\, u = 0 \qquad [9]$$


Equations [7] and [9] have to be solved in the
domain of the flow, which might be the whole
space $\mathbb{R}^3$ (or $\mathbb{R}$ or $\mathbb{R}^2$ in case of symmetries), or a
domain $\Omega$, bounded or not, in $\mathbb{R}^n$, $n = 1$, 2, or 3.
These equations are supplemented by appropriate
boundary conditions and initial conditions for the
velocity $u$ and the extra stress $\tau$ (no boundary
condition on $\tau$ is needed if the homogeneous
no-slip boundary condition $u = 0$ is chosen).

We first make explicit the Newtonian contribution
to the stress by setting $\tau = \tau^n + \tau^p$ and
$\tau^n = 2\eta_n D$. The differential equation for $\tau^p$ is then


$$\tau^p + \lambda_1 \frac{D_a \tau^p}{Dt} = 2\eta_p D$$


where $\eta_p = (1 - \lambda_2/\lambda_1)\eta$ is the so-called polymeric
viscosity, and $\eta_n = (\lambda_2/\lambda_1)\eta$ the so-called Newtonian
viscosity (or solvent viscosity).

We then use nondimensional variables, so as to
make explicit the characteristic parameters on which
the flow depends. The non-Newtonian fluid
considered in this model will always be homogeneous:
its density $\rho$ is a constant independent of $x$
and $t$. The dimensional variables are now asterisked.
We define quantities which are characteristic of the
flow: a length $L$, a velocity magnitude $U$, a stress
magnitude $T$, a force magnitude $F$, and a pressure $P$.
We perform the change of variables and functions
$x = x^*/L$, $u = u^*/U$, $t = Ut^*/L$, and also introduce the
nondimensional functions


$$\tau = \frac{\tau^{p*}}{T}, \qquad p = \frac{p^*}{P}, \qquad f = \frac{f^*}{F}$$

After choosing the parameters $T$, $P$, and $F$ in
an appropriate way, namely $T = P = \eta U/L$ and
$F = \eta U/L^2$, we obtain the following system

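$$\mathrm{Re}\, \frac{du}{dt} + \nabla p = (1 - \omega)\Delta u + \mathrm{div}\, \tau + f$$
$$\mathrm{div}\, u = 0 \qquad [10]$$
$$\tau + \mathrm{We}\, \frac{D_a \tau}{Dt} = 2\omega D$$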


Here the three nondimensional parameters on which
the flow depends are the usual Reynolds number
$\mathrm{Re} = \rho UL/\eta$ and two other numbers: the Weissenberg
number $\mathrm{We} = \lambda_1 U/L$ measures the elasticity per
unit time (sometimes also called the Deborah
number), and the parameter $\omega = \eta_p/\eta$ is the ratio of
elastic viscosity to total viscosity ($\omega = 0$ corresponds
to the Newtonian case, while $\omega = 1$ corresponds to
the purely elastic case).
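
The three groups are straightforward to compute from dimensional data. The following sketch uses the definitions given above, together with the relation $\omega = \eta_p/\eta = 1 - \lambda_2/\lambda_1$ that follows from the definition of the polymeric viscosity; the function name and the sample values are hypothetical.

```python
def dimensionless_groups(rho, U, L, eta, lam1, lam2):
    """Return (Re, We, omega) for a Johnson-Segalman / Oldroyd-type fluid.

    rho  : density, U : characteristic velocity, L : characteristic length
    eta  : total viscosity
    lam1 : relaxation time, lam2 : retardation time (0 <= lam2 <= lam1)
    """
    reynolds = rho * U * L / eta
    weissenberg = lam1 * U / L
    omega = 1.0 - lam2 / lam1  # equals eta_p / eta
    return reynolds, weissenberg, omega


# Illustrative values for a slow flow of a viscous polymer solution.
re, we, omega = dimensionless_groups(rho=1000.0, U=0.1, L=0.01, eta=10.0, lam1=0.5, lam2=0.1)
print(re, we, omega)
```

A small Reynolds number combined with a Weissenberg number of order one or larger, as in this example, is typical of the viscoelastic regime discussed in the rest of the section.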

System [10] couples a transport equation (the
equation for the stress $\tau$), and either a Navier-Stokes
type equation when $\omega < 1$, or a Euler type
equation when $\omega = 1$ (for the velocity $u$). This
system is not hyperbolic, parabolic, or elliptic.

Maxwell's type models ($\omega = 1$) display two striking
phenomena. First, the Cauchy problem (with initial
data) can present Hadamard instabilities, that is,
instabilities to short waves. It means, in particular, that
the Cauchy problem is not well posed in any reasonable class
of data other than analytic ones. Moreover, the partial differential system
for Maxwell's type steady flows may experience a
change of type, analogous to the situation in gas
dynamics, if the “Mach number” Re We is larger than 1.

Jeffreys’ type models ($\omega < 1$), because of the
presence of a Newtonian viscosity, do not exhibit
such phenomena, but their study does not fall within
the theory of parabolic equations either, the type of
the system being composite.

Problems of interest for rheologists, as well as for 
mathematicians, include in particular the high 
Weissenberg asymptotics, the high Weissenberg 
boundary layers, the singularity of flows near a 
reentrant corner, and the stability of flows. 

We give a few details about stability questions.
Instabilities are seen in experimental extrusion of
melted polymers from a pipe: melt fracture designates
different phenomena appearing at different stages of
the experiment, when the speed of the extrusion is
increased, such as sharkskin instability, slight distortions
of the extrudate, and large distortions and waviness
of the extrudate. One may distinguish two kinds of
instabilities. First, constitutive instabilities are associated
with nonmonotonicity of constitutive functions
and loss of the evolutionary property of the equations of
motion. Other kinds of instabilities are close to
classical hydrodynamic instabilities at increasing Re.
Note that in viscoelastic flows Re is usually very
small, and might even be set to zero in some studies.

Other mathematical questions for system [10] 
include existence of weak solutions (for the very
special case of the Oldroyd model with the Jaumann
derivative, where $a = 0$ in [4]), existence of regular
solutions defined on some time interval, depending 
on the magnitude of the data, and existence of 
regular solutions for all times. Other studies concern 
the existence, uniqueness, and stability of steady 
solutions. Another field of study is the numerical 
simulation of such flows. 

In summary, there have been numerous computations
made in the field of steady or unsteady viscoelastic
fluid flows, especially for models using continuum
mechanics. Standard test problems include the driven-cavity
flow, flows inside a 4:1 contraction, extrusion
flows, flows between eccentric cylinders, and flows in
“wiggly” pipes. As mentioned already, the type of the
system of partial differential equations is composite,
neither elliptic nor hyperbolic. The numerical codes
have to take into account the precise nature of the set of
partial differential equations, so as to be able to obtain
noncatastrophic results. One of the main challenges has
been to deal with the high-We problem: with increasing
We, the results would become totally incoherent, and
the numerical algorithms would diverge.

Nowadays, with the power of computers increasing, 
molecular simulations of flows are proposed, using the 
macro-micro modeling mentioned above. Also, simula- 
tions of flows of colloidal suspensions and reacting 
flows have been undertaken with success. 
See also: Compressible Flows: Mathematical Theory;
Fluid Mechanics: Numerical Methods; Incompressible 
Euler Equations: Mathematical Theory; Interfaces and 
Multicomponent Fluids; Inviscid Flows; Liquid Crystals; 
Newtonian Fluids and Thermohydraulics; Partial 
Differential Equations: Some Examples; Stability of 
Flows; Stochastic Hydrodynamics; Viscous 
Incompressible Fluids: Mathematical Theory. 
Introduction


Classical fields that enter a classical field theory 
provide a mapping from the “base” manifold on 
which they are defined (space or spacetime) to a 
"target" space over which they range. The base and 
target spaces, as well as the map, may possess 
nontrivial topological features, which affect the 
fixed-time description and the temporal evolution of 
the fields, thereby influencing the physical reality that 
these fields describe. Quantum fields of a quantum 
field theory are operator-valued distributions whose 
relevant topological properties are obscure. Never- 
theless, topological features of the corresponding 
classical fields are important in the quantum theory 
for a variety of reasons: (1) Quantized fields can 
undergo local (spacetime-dependent) transformations 
(gauge transformations, coordinate diffeomorphisms) 
that involve classical functions whose topological 
properties determine the allowed quantum field 
theoretic structures. (2) One formulation of the 
quantum field theory uses a functional integral over 
classical fields, and classical topological features 
become relevant. (3) Semiclassical (WKB) approxi- 
mations to the quantum theory rely on classical 
dynamics, and again classical topology plays a role in 
the analysis. 

Topological effects of gauge fields in quantum 
theory were first appreciated by Dirac in his study of 
the quantum mechanics for (hypothetical) magnetic 
point monopoles. Although here one is not dealing 
with a field theory, the consequences of his analysis 
contain many features that were later encountered in 
field theory models. 

The Lorentz equations of motion for a charged ($e$)
massive ($M$) particle in a monopole magnetic field
($B = g\, r/r^3$) are unexceptional,


$$\dot{r} = \frac{p}{M} \qquad [1a]$$
$$\dot{p} = \frac{e}{M}\, p \times B \quad (c = 1) \qquad [1b]$$

and completely determine classical dynamics. But
knowledge of the Lagrangian $L$ and of the action
$I$, the time integral of $L$, $I = \int dt\, L$, is further
needed for quantum mechanics, either in its functional
integral formulation or in its Hamiltonian
formulation, which requires the canonical momentum
$\pi = \partial L/\partial \dot{r}$. The Lorentz-force action
is expressed in terms of the vector potential
$A$, $B = \nabla \times A$: $I_{\text{Lorentz}} = e \int dt\, \dot{r} \cdot A = e \int dr \cdot A$. The
magnetic monopole vector potential is necessarily
singular because $\nabla \cdot B = 4\pi g\, \delta^3(r) \ne 0$. The singularity
(Dirac string) can be moved, but not removed, by
gauge transformations, which also are singular, and
do not leave the Lorentz action invariant. Noninvariance
of the action can be tolerated provided its
change is an integral multiple of $2\pi$, since the
functional integrand involves $\exp(iI)$ (with $\hbar = 1$).
The quantal requirement, which is not seen in the
equations of motion, is met when


$$eg = N/2 \qquad [2]$$


The topological background to this (Dirac) quantization
condition is the fact that $\Pi_1(U(1))$ is the
group of integers, that is, the map of the unit circle
into the gauge group, here $U(1)$, is classified by
integers.

Further analysis shows that only point magnetic 
sources can be incorporated in particle quantum 
mechanics, which is governed by the particle 
Hamiltonian H=p*/2M (magnetic fields do no 
work and are not seen in H). Quantum Lorentz 
equations are regained by commutation with 
H: r=i[H,r]|, p=i[H, p], provided 


i[r, 7] — 0 [3a] 
p^r] o dn [3b] 
[p p] = —ee" p" [3c] 


But [3c] implies that the Jacobi identity is obstructed
by magnetic sources, $\nabla\cdot\mathbf B \neq 0$:


$\tfrac12\,\epsilon^{ijk}\,[p^i, [p^j, p^k]] = e\,\nabla\cdot\mathbf B$   [4]
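
The identity [4] can be checked directly from the commutators [3]. The following is a sketch of that check; the only ingredient not written out in the text is the relation $[p^i, f(\mathbf r)] = -i\,\partial_i f$, which follows from [3b].

```latex
\begin{align*}
[p^j, p^k] &= i e\,\epsilon^{jkl} B^l(\mathbf r)
  && \text{(from [3c])} \\
\tfrac12\,\epsilon^{ijk}\,[p^i,[p^j,p^k]]
  &= \tfrac{ie}{2}\,\epsilon^{ijk}\epsilon^{jkl}\,[p^i, B^l]
   = i e\,[p^i, B^i]
  && (\epsilon^{ijk}\epsilon^{jkl} = 2\delta^{il}) \\
  &= i e\,(-i\,\partial_i B^i) = e\,\nabla\cdot\mathbf B .
\end{align*}
```

For a point pole of strength g, the volume integral $e\int d^3x\,\nabla\cdot\mathbf B$ over any region containing the pole equals $4\pi e g$; demanding that this be $2\pi N$ gives $eg = N/2$, as in [2].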


This obstruction is better understood by examining
the unitary operator $U(\mathbf a) = \exp(i\,\mathbf a\cdot\mathbf p)$, which
according to [3b] implements finite translations
of r by a. The commutator algebra [3] and
the failure of the Jacobi identity [4] imply
that these operators do not associate. Rather one
finds


$U(\mathbf a_1)\,(U(\mathbf a_2)\,U(\mathbf a_3)) = e^{i\Phi}\,(U(\mathbf a_1)\,U(\mathbf a_2))\,U(\mathbf a_3)$   [5]


where $\Phi = e\int d^3x\,\nabla\cdot\mathbf B$ is the total flux emerging
from the tetrahedron formed from the three vectors
$\mathbf a_i$ with vertex at r (see Figure 1). But quantum
mechanics realized by linear operators acting on a
Hilbert space requires that operator multiplication
be associative. This can be achieved, in spite of [5],
provided $\Phi$ is an integral multiple of $2\pi$, hence
invisible in the exponent. This requires that (1)
$\nabla\cdot\mathbf B$ be localized at points, so that the volume
integral of $\nabla\cdot\mathbf B$ retains integrality for arbitrary $\mathbf a_i$,
and (2) the strengths of the localized poles obey
Dirac quantization. The points at which $\nabla\cdot\mathbf B$ is
localized can now be removed from the manifold,
and the Jacobi identity is regained. The above
argument, which rederives Dirac's quantization,
makes no reference to gauge variance of magnetic
potentials.

Figure 1 Tetrahedron pierced by magnetic flux that obstructs
associativity.

In the remainder we shall discuss related phenomena
for selected gauge field theories in four, three,
and two dimensions that describe actual physical
events occurring in nature. We shall encounter, in
generalized form, analogs to the above quantum
mechanical system.

Some definitions and notational conventions:
Nonabelian gauge potentials $A_\mu^a$ carry a spacetime
index ($\mu$) (metric tensor $g_{\mu\nu} = \mathrm{diag}(1, -1, \ldots)$) and
an adjoint group index (a). When contracted with
anti-Hermitian matrices $T_a$ that represent the
group's Lie algebra (structure constants $f_{ab}{}^{c}$),


$[T_a, T_b] = f_{ab}{}^{c}\,T_c$   [6]

they become Lie algebra-valued:

$A_\mu = A_\mu^a\,T_a$   [7]


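Gauge transformations transform $A_\mu$ by group
elements U:


$A_\mu \to A_\mu^U = U^{-1}A_\mu U + U^{-1}\partial_\mu U$   [8a]


For infinitesimal gauge transformations, $U = I + \lambda$,
$\lambda = \lambda^a T_a$; this leads to the covariant derivative $D_\mu$:


$A_\mu \to A_\mu + \partial_\mu\lambda + [A_\mu, \lambda] \equiv A_\mu + D_\mu\lambda$
$A_\mu^a \to A_\mu^a + \partial_\mu\lambda^a + f_{bc}{}^{a}A_\mu^b\lambda^c \equiv A_\mu^a + (D_\mu\lambda)^a$   [8b]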


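(In a quantum field theory, $A_\mu$ becomes an operator,
but the gauge transformations U, $\lambda$ remain c-number
functions.) The field strength $F_{\mu\nu}$, given by


$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]$   [9a]

is also given by

$[D_\mu, D_\nu]\,\cdots = [F_{\mu\nu}, \cdots]$   [9b]


(coupling strength g has been scaled to unity). The
definition [9] implies the Bianchi identity


$D_\mu F_{\alpha\beta} + D_\alpha F_{\beta\mu} + D_\beta F_{\mu\alpha} = 0$   [10]

Here $F_{\mu\nu}$ is gauge covariant,

$F_{\mu\nu} \to F_{\mu\nu}^U = U^{-1}F_{\mu\nu}U$   [11a]

or, infinitesimally,

$F_{\mu\nu} \to F_{\mu\nu} + [F_{\mu\nu}, \lambda]$   [11b]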


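In the gauge-invariant Yang-Mills action $I_{YM}$, the
Yang-Mills Lagrange density $\mathcal L_{YM}$ is integrated over
the base space,


$I_{YM} = \int d^nx\;\mathcal L_{YM}, \qquad \mathcal L_{YM} = \tfrac12\,\mathrm{tr}\,F^{\mu\nu}F_{\mu\nu}$   [12]


The trace is evaluated with the convention

$\mathrm{tr}\,T_aT_b = -\tfrac12\,\delta_{ab}$   [13]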


and henceforth there is no distinction between upper
and lower group indices. The Euler-Lagrange condition
for stationarizing $I_{YM}$ gives the Yang-Mills equation


$D_\mu F^{\mu\nu} = 0$   [14a]

Should sources $J^\nu$ be present, [14a] becomes

$D_\mu F^{\mu\nu} = J^\nu$   [14b]

and $J^\nu$ must be covariantly conserved:

$D_\nu J^\nu = D_\nu D_\mu F^{\mu\nu} = -\tfrac12\,[D_\mu, D_\nu]\,F^{\mu\nu} = -\tfrac12\,[F_{\mu\nu}, F^{\mu\nu}] = 0$   [15]


All this is a nonabelian generalization of familiar 
Maxwell electrodynamics. 
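
The conventions [6] and [13] can be checked numerically in the simplest case. The sketch below assumes the group is SU(2) with generators $T_a = \sigma^a/2i$ and structure constants $f_{ab}{}^{c} = \epsilon_{abc}$; the helper names (`scale`, `mat_mul`, `algebra_residual`, and so on) are illustrative and not taken from the text.

```python
# Pauli matrices stored as nested tuples ((m00, m01), (m10, m11))
SIGMA = [
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
]

def scale(c, a):
    """Multiply the 2x2 matrix a by the scalar c."""
    return tuple(tuple(c * x for x in row) for row in a)

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mat_sub(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))

def mat_mul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def trace(a):
    return a[0][0] + a[1][1]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

# Anti-Hermitian generators T_a = sigma_a / (2i)  (assumed SU(2) convention)
T = [scale(1 / (2j), s) for s in SIGMA]

def algebra_residual():
    """Largest entry of [T_a, T_b] - eps_abc T_c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            comm = mat_sub(mat_mul(T[a], T[b]), mat_mul(T[b], T[a]))
            expected = ((0, 0), (0, 0))
            for c in range(3):
                expected = mat_add(expected, scale(levi_civita(a, b, c), T[c]))
            diff = mat_sub(comm, expected)
            worst = max(worst, max(abs(x) for row in diff for x in row))
    return worst

def trace_residual():
    """Largest deviation of tr(T_a T_b) from -delta_ab / 2."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            expected = -0.5 if a == b else 0.0
            worst = max(worst, abs(trace(mat_mul(T[a], T[b])) - expected))
    return worst

print(algebra_residual(), trace_residual())
```

Both residuals vanish, so with these generators the trace in [12] is negative-definite on the Lie algebra, which is why the Lagrange density carries the prefactor $+\tfrac12$ rather than $-\tfrac12$.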


Gauge Theories in Four Dimensions 


Gauge theories in four-dimensional spacetime are at 
the heart of the standard particle physics model. 
Their topological features have physical conse- 
quences and merit careful study. 
Yang-Mills Theory


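In four dimensions, we define nonabelian electric $E^{ia}$
and magnetic $B^{ia}$ fields,


$E^{ia} = F^a_{0i}, \qquad B^{ia} = -\tfrac12\,\epsilon^{ijk}F^a_{jk}$   [16]

Canonical analysis and quantization is carried out in
the Weyl gauge ($A_0^a = 0$), where the Lagrangian and
Hamiltonian (energy) densities read


$\mathcal L_{YM} = \tfrac12\,(\mathbf E^a\cdot\mathbf E^a - \mathbf B^a\cdot\mathbf B^a)$   [17]


$\mathcal H_{YM} = \tfrac12\,(\mathbf E^a\cdot\mathbf E^a + \mathbf B^a\cdot\mathbf B^a)$   [18]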


The first term is kinetic, with $\mathbf E^a = -\partial_t\mathbf A^a$ also
functioning as the (negative) canonical momentum
$\boldsymbol\pi^a$, conjugate to the canonical variable $\mathbf A^a$; the
second, magnetic term gives the potential. In the
Weyl gauge, the theory remains invariant against
time-independent gauge transformations. The time
component of equation [14] (Gauss law) is absent
(because there is no $A_0^a$ to vary); rather it is imposed
as a fixed-time constraint on the canonical variables
$\mathbf E^a$ and $\mathbf A^a$. This regains the Gauss law:

$(\mathbf D\cdot\mathbf E)^a = 0$ (in the absence of sources)   [19a]

In the quantum theory, $(\mathbf D\cdot\mathbf E)^a$ annihilates “physical”
states. Explicitly, in a functional Schrödinger representation,
where states are functionals of the canonical
fixed-time variable A, $|\Psi\rangle \to \Psi(\mathbf A)$, [19a] requires


$\left(\mathbf D\cdot\dfrac{\delta}{\delta\mathbf A}\right)^{a}\Psi(\mathbf A) = 0$   [19b]


that is, physical states must be invariant against
infinitesimal gauge transformations, or equivalently,
against gauge transformations that are homotopic
(continuously deformable) to the identity (the so-called
“small” gauge transformations):


$\Psi(\mathbf A + \mathbf D\lambda) = \Psi(\mathbf A)$   [20]
But homotopically nontrivial gauge transformation 
functions that cannot be deformed to the identity 
(the so-called “large” gauge transformations) may 
be present. Their effect is not controlled by Gauss' 
law, and must be discussed separately. 

Fixed-time gauge transformation functions
depend on the spatial variable r: U(r). For a
topological classification, we require that U tend to
a constant at large r. Equivalently, we compactify
the base space $R^3$ to $S^3$. Thus, the gauge functions
provide a mapping from $S^3$ into the relevant gauge
group G, and for nonabelian compact gauge groups
such mappings fall into disjoint homotopy classes
labeled by an integer winding number n:
$\Pi_3(G) = Z$. Gauge functions $U_n$ belonging to
different classes cannot be deformed into each
other; only those in the “zero” class are deformable
to the identity. An analytic expression for the
winding number w(U) is


$w(U) = \dfrac{1}{24\pi^2}\int d^3x\;\epsilon^{ijk}\,\mathrm{tr}\,(U^{-1}\partial_iU\;U^{-1}\partial_jU\;U^{-1}\partial_kU)$   [21]


This is a most important topological entity for
gauge theories in four-dimensional spacetime, that is,
in 3-space, and we shall meet it again in a description
of gauge theories in three-dimensional spacetime,
that is, on a plane. Various features of w expose its
topological character: (1) w(U) does not involve a
metric tensor, yet it is diffeomorphism invariant.
(2) w(U) does not change under local variations of U:


$\delta w(U) = \dfrac{1}{8\pi^2}\int d^3x\;\partial_i\,\epsilon^{ijk}\,\mathrm{tr}\,(U^{-1}\delta U\;U^{-1}\partial_jU\;U^{-1}\partial_kU)$
$\qquad\quad = \dfrac{1}{8\pi^2}\int dS^i\;\epsilon^{ijk}\,\mathrm{tr}\,(U^{-1}\delta U\;U^{-1}\partial_jU\;U^{-1}\partial_kU) = 0$   [22]


The last integral is over the surface (at infinity)
bounding the base space and vanishes for localized
variations $\delta U$. In fact, the entire w(U), not only its
variation, can be presented as a surface integral, but
this requires parametrizing the group element U on
$R^3$. For example, for SU(2),

$U = \exp\lambda, \qquad \lambda = \lambda^a\sigma^a/2i \quad (\sigma^a = \text{Pauli matrices})$


$w(U) = \dfrac{1}{16\pi^2}\int dS^i\;\epsilon^{ijk}\,\epsilon_{abc}\,\hat\lambda^a\,\partial_j\hat\lambda^b\,\partial_k\hat\lambda^c\,(\sin|\lambda| - |\lambda|), \qquad |\lambda| \equiv \sqrt{\lambda^a\lambda^a}$


Specifically, with $|\lambda| \to 2\pi n$ at large r (so that $U \to \pm I$),
$w(U) = -n$. As befits a topological entity, w(U) is
determined by global (here large-distance) properties
of U.
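
The value $w(U) = -n$ can be checked by evaluating the volume integral [21] numerically. The sketch below assumes a "hedgehog" SU(2) configuration $U = \exp(\lambda^a\sigma^a/2i)$ with $\lambda^a = \hat r^a f(r)$ and the illustrative profile $f(r) = 4n\arctan r$, which runs from 0 to $2\pi n$; the profile, the function names, and the discretization parameters are choices made here, not taken from the text. Because the integrand is spherically symmetric for this configuration, it is sampled on the z-axis by finite differences and integrated radially.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def dagger(a):
    """Hermitian conjugate; equals the inverse for an SU(2) matrix."""
    return ((a[0][0].conjugate(), a[1][0].conjugate()),
            (a[0][1].conjugate(), a[1][1].conjugate()))

def trace(a):
    return a[0][0] + a[1][1]

def hedgehog(x, y, z, n=1):
    """U = cos(f/2) I - i sin(f/2) (r_hat . sigma), with f(r) = 4 n arctan(r)."""
    r = math.sqrt(x * x + y * y + z * z)
    f = 4.0 * n * math.atan(r)
    c, s = math.cos(f / 2.0), math.sin(f / 2.0)
    nx, ny, nz = x / r, y / r, z / r
    return ((complex(c, -s * nz), complex(-s * ny, -s * nx)),
            (complex(s * ny, -s * nx), complex(c, s * nz)))

def winding_density(r, n=1, h=1e-5):
    """Integrand of [21] at the point (0, 0, r), using central differences."""
    point = (0.0, 0.0, r)
    u_inv = dagger(hedgehog(*point, n=n))
    currents = []  # L_i = U^{-1} d_i U for i = x, y, z
    for axis in range(3):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        up, um = hedgehog(*plus, n=n), hedgehog(*minus, n=n)
        d_u = tuple(tuple((up[i][j] - um[i][j]) / (2.0 * h) for j in range(2))
                    for i in range(2))
        currents.append(mat_mul(u_inv, d_u))
    lx, ly, lz = currents
    t_even = trace(mat_mul(mat_mul(lx, ly), lz))  # the three even permutations agree
    t_odd = trace(mat_mul(mat_mul(lx, lz), ly))   # the three odd permutations agree
    return (3.0 * (t_even - t_odd)).real / (24.0 * math.pi ** 2)

def winding_number(n=1, r_max=30.0, steps=3000):
    """Radial midpoint-rule integral of 4 pi r^2 times the density."""
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        total += 4.0 * math.pi * r * r * winding_density(r, n) * dr
    return total

print(round(winding_number(1), 3))
```

The result approaches $-n$ as `r_max` grows; the residual from cutting the integral off at finite radius falls off like the cube of $2\pi n - f(r_{\max})$, which reflects the fact that the integrand is a total derivative.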

Since all gauge transformations, small and large,
are symmetry operations for the theory, [20] should
be generalized to


$\mathbf A^{U_n} \equiv U_n^{-1}\mathbf A\,U_n - U_n^{-1}\nabla U_n$   [23]


$\Psi(\mathbf A^{U_n}) = e^{in\theta}\,\Psi(\mathbf A)$   [24]


where $\theta$ is a universal constant. Thus, Yang-Mills
quantum states behave as Bloch waves in a periodic
lattice, with large gauge transformations playing the
role of lattice translations and the Yang-Mills vacuum
angle $\theta$ playing the role of the Bloch momentum. This
is further understood by noting that the profile of the
potential energy density, $\tfrac12\,\mathbf B^a\cdot\mathbf B^a$, possesses a periodic
structure symbolically depicted in Figure 2.

Figure 2 Schematic for energy periodicity of Yang-Mills fields.
(Vertical axis: energy density; horizontal axis: winding number,
..., -2, -1, 0, +1, +2, ...; an instanton connects adjacent troughs.)


Thanks to Gauss' law, potentials A that differ by
small gauge transformations are identified, while
those differing by large gauge transformations give
rise to the periodicity. Zero-energy troughs correspond
to pure gauge vector potentials in different
homotopy classes n: $\mathbf A = -U_n^{-1}\nabla U_n$.

The $\theta$ angle (Bloch momentum) arises from
quantum tunneling in A space. Usually, in field 
theory tunneling is suppressed by infinite energy 
barriers. (This gives rise to spontaneous symmetry 
breaking.) However, in Yang-Mills theory there are 
paths in field space that avoid such barriers. 
Quantum tunneling paths are exhibited in a semi- 
classical approximation by identifying classical 
motion in imaginary time (Euclidean space) that 
interpolates between classically degenerate vacua 
and possesses finite action. 

In Yang-Mills theory, continuation to imaginary
time, $x^0 \to -ix^4$, places a factor of i on $\mathbf E^a$. Zero
(Euclidean) energy is maintained when $\mathbf E^a = \pm\mathbf B^a$, or
with covariant notation in Euclidean space,


${}^*F^{\mu\nu} \equiv \tfrac12\,\epsilon^{\mu\nu\alpha\beta}F_{\alpha\beta} = \pm F^{\mu\nu}$   [25]


Euclidean finite-action field configurations that
satisfy [25] are called self-dual or anti-self-dual
instantons. By virtue of the Bianchi identity [10],
instantons also solve the field equation [14a] in
Euclidean space. Since the Euclidean action may also
be written as


$I_{YM} = \tfrac14\int d^4x\;\mathrm{tr}\,(F^{\mu\nu} \pm {}^*F^{\mu\nu})(F_{\mu\nu} \pm {}^*F_{\mu\nu}) \mp \tfrac12\int d^4x\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}$   [26]


and the first term vanishes for instantons, we see
that instantons are characterized by the last term,
the Chern-Pontryagin index,


$P \equiv -\dfrac{1}{16\pi^2}\int d^4x\;\mathrm{tr}\,({}^*F^{\mu\nu}F_{\mu\nu}) = -\dfrac{1}{32\pi^2}\int d^4x\;\epsilon^{\mu\nu\alpha\beta}\,\mathrm{tr}\,(F_{\mu\nu}F_{\alpha\beta})$   [27]

This again is an important topological entity: 


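1. The diffeomorphism invariant P does not involve
the metric tensor.
2. P is insensitive to local variations of $A_\mu$:


   $\delta P = -\dfrac{1}{8\pi^2}\int d^4x\;\mathrm{tr}\,({}^*F^{\mu\nu}\,\delta F_{\mu\nu})$
   $\quad\;\; = -\dfrac{1}{4\pi^2}\int d^4x\;\mathrm{tr}\,({}^*F^{\mu\nu}D_\mu\delta A_\nu)$
   $\quad\;\; = \dfrac{1}{4\pi^2}\int d^4x\;\mathrm{tr}\,(D_\mu{}^*F^{\mu\nu}\,\delta A_\nu) = 0$   [28]

3. P may be presented as a surface integral owing to
the formula

   $\tfrac14\,\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu} = \partial_\mu K^\mu$   [29]
   $K^\mu = \epsilon^{\mu\alpha\beta\gamma}\,\mathrm{tr}\,(\tfrac12A_\alpha\partial_\beta A_\gamma + \tfrac13A_\alpha A_\beta A_\gamma)$   [30]


where $K^\mu$ is the Chern-Simons current,


   $P = -\dfrac{1}{4\pi^2}\int dS_\mu\,K^\mu$   [31]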


The integral [31] is over the base space boundary,
$S^3$. The Chern-Pontryagin index of any gauge field
configuration with finite (Euclidean) action (not
only instantons) is quantized. This is because finite
action requires $F_{\mu\nu}$ to vanish at large distances;
equivalently, $A_\mu \to U^{-1}\partial_\mu U$. Using this in [30]
renders [31] as


$P = \dfrac{1}{24\pi^2}\int dS_\mu\;\epsilon^{\mu\alpha\beta\gamma}\,\mathrm{tr}\,(U^{-1}\partial_\alpha U\;U^{-1}\partial_\beta U\;U^{-1}\partial_\gamma U)$   [32]


which is the same as [21] and, for the same reason,
is given by an integer [$\Pi_3(G) = Z$]. Alternatively, for
instantons in the (Euclidean) Weyl gauge ($A_4 = 0$),
which interpolate as $x^4$ passes from $-\infty$ to $+\infty$
between degenerate, classical vacua $\mathbf A = 0$ and
$\mathbf A = -U^{-1}\nabla U$, P becomes


$P = -\dfrac{1}{4\pi^2}\int dx^4\,d^3x\,(\partial_4K^4 + \nabla\cdot\mathbf K)$
$\quad = -\dfrac{1}{4\pi^2}\int d^3x\;K^4\,\Big|_{x^4=-\infty}^{x^4=+\infty}$
$\quad = \dfrac{1}{24\pi^2}\int d^3x\;\epsilon^{ijk}\,\mathrm{tr}\,(U^{-1}\partial_iU\;U^{-1}\partial_jU\;U^{-1}\partial_kU)$
$\quad = w(U)$   [33]
We have assumed that the potentials decrease at
large arguments sufficiently rapidly so that the 
gradient term in the first integrand does not 
contribute. This rederivation of [32] relies on the 
“motion” of an instanton between vacuum config- 
urations of different winding numbers. 
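
The step from [30] and [31] to [32] can be made explicit. The following sketch assumes the normalizations $K^\mu = \epsilon^{\mu\alpha\beta\gamma}\,\mathrm{tr}(\tfrac12A_\alpha\partial_\beta A_\gamma + \tfrac13A_\alpha A_\beta A_\gamma)$ and $P = -(1/4\pi^2)\int dS_\mu K^\mu$, and uses the shorthand $a_\mu \equiv U^{-1}\partial_\mu U$, which is introduced here for brevity.

```latex
\begin{align*}
\partial_\beta a_\gamma - \partial_\gamma a_\beta + [a_\beta, a_\gamma] = 0
  &\;\Longrightarrow\;
  \epsilon^{\mu\alpha\beta\gamma}\,\partial_\beta a_\gamma
  = -\,\epsilon^{\mu\alpha\beta\gamma}\,a_\beta a_\gamma , \\
K^\mu\big|_{A = a}
  &= \epsilon^{\mu\alpha\beta\gamma}\,
     \mathrm{tr}\left(-\tfrac12\,a_\alpha a_\beta a_\gamma + \tfrac13\,a_\alpha a_\beta a_\gamma\right)
   = -\tfrac16\,\epsilon^{\mu\alpha\beta\gamma}\,\mathrm{tr}\,(a_\alpha a_\beta a_\gamma), \\
P = -\frac{1}{4\pi^2}\int dS_\mu\,K^\mu
  &= \frac{1}{24\pi^2}\int dS_\mu\;\epsilon^{\mu\alpha\beta\gamma}\,
     \mathrm{tr}\,(a_\alpha a_\beta a_\gamma).
\end{align*}
```

The first line is the statement that a pure gauge has vanishing field strength; the quadratic and cubic terms of the Chern-Simons current then combine into a single cubic term with coefficient $-\tfrac12 + \tfrac13 = -\tfrac16$.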

An explicit 1-instanton SU(2) solution (P = 1) is


$A_\mu(x) = \dfrac{-2i\,\sigma_{\mu\nu}\,(x - \xi)^\nu}{(x - \xi)^2 + \rho^2}$   [34]


(Upon reinserting the coupling constant g, which
has been scaled to unity, the field profiles acquire
the factor $g^{-1}$.) In [34],
$\sigma_{\mu\nu} \equiv (1/4i)(\sigma_\mu^\dagger\sigma_\nu - \sigma_\nu^\dagger\sigma_\mu)$, $\sigma_\mu \equiv (-i\boldsymbol\sigma, I)$;
$\xi$ is the “location” of the
instanton, $\rho$ is its “size,” and there are three more
implicit parameters fixing the gauge, for a total of
eight parameters that are needed to specify a single
SU(2) instanton. One can show that there exist
N-instanton/anti-instanton solutions (P = N/-N), and
in SU(2) they depend on 8N parameters. From [26]
we see that at fixed N, instantons minimize the
(Euclidean) action. Explicit formulas exist for the
most general N = 2 solution, while for $N \ge 3$
explicit formulas exhibit only 5N + 7 parameters.
But algorithms have been found that construct
the most general 8N-parameter instantons. The
1-instanton solution is unchanged by SO(5)
rotations, the maximal compact subgroup of the
SO(5, 1) conformal invariance group for the
Euclidean 4-space Yang-Mills equation [14a].
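
The statement that instantons minimize the Euclidean action at fixed index can be read off from the decomposition [26]. A sketch, assuming the trace convention $\mathrm{tr}\,T_aT_b = -\tfrac12\delta_{ab}$ (so that $-\mathrm{tr}\,X^2 \ge 0$ for Lie-algebra-valued X), the normalization $P = -(1/16\pi^2)\int d^4x\,\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}$, and writing $S_E \equiv -I_{YM}$ for the positive Euclidean action (a symbol introduced here):

```latex
\begin{align*}
S_E &= -\tfrac14\int d^4x\;\mathrm{tr}\,
        (F^{\mu\nu} \pm {}^*F^{\mu\nu})(F_{\mu\nu} \pm {}^*F_{\mu\nu})
       \;\pm\; \tfrac12\int d^4x\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu} \\
    &\ge \pm\tfrac12\int d^4x\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}
     = \mp\,8\pi^2 P
\qquad\Longrightarrow\qquad
S_E \ge 8\pi^2\,|P| .
\end{align*}
```

Equality holds exactly when $F^{\mu\nu} = \mp{}^*F^{\mu\nu}$, that is, for (anti-)self-dual fields. With the coupling reinserted the bound reads $8\pi^2|P|/g^2$, which for $|P| = 1$ is the exponent appearing in the tunneling amplitude [48].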

The Chern-Pontryagin index also appears in the
Yang-Mills quantum action, for the following
reason. Since all physical states respond to gauge
transformations $U_n$ with the universal phase $n\theta$
[24], physical states may be presented in factorized
form,

$\Psi(\mathbf A) = e^{i\theta W(\mathbf A)}\,\Phi(\mathbf A)$   [35]

where $\Phi(\mathbf A)$ is invariant against all gauge transformations,
small and large, while the phase response is
carried by $W(\mathbf A)$,

$W(\mathbf A^{U_n}) = W(\mathbf A) + n$   [36]

An explicit expression for $W(\mathbf A)$ is given by
$-(1/4\pi^2)\int d^3x\,K^0$, where $K^0$ is the time (fourth)
component of $K^\mu$, with dependence on the fourth
variable suppressed, that is, $K^0$ is defined on 3-space,


$W(\mathbf A) = -\dfrac{1}{4\pi^2}\int d^3x\;\epsilon^{ijk}\,\mathrm{tr}\,(\tfrac12A_i\partial_jA_k + \tfrac13A_iA_jA_k)$   [37]


The gauge transformation properties of $W(\mathbf A)$ are

$W(\mathbf A^U) = W(\mathbf A) + \dfrac{1}{8\pi^2}\int d^3x\;\epsilon^{ijk}\,\partial_i\,\mathrm{tr}\,(\partial_jU\,U^{-1}A_k)$
$\qquad\qquad + \dfrac{1}{24\pi^2}\int d^3x\;\epsilon^{ijk}\,\mathrm{tr}\,(U^{-1}\partial_iU\;U^{-1}\partial_jU\;U^{-1}\partial_kU)$   [38]


The middle surface term does not contribute for
well-behaved A; the last term is again w(U), the
winding number of the gauge transformation U.
Thus, [36] is verified.

The universal gauge-varying phase $e^{i\theta W(\mathbf A)}$, which
multiplies all gauge-invariant functional states, may
be removed at the expense of subtracting from the
action


$\theta\int d^4x\;\partial_tW(\mathbf A) = -\dfrac{\theta}{4\pi^2}\int d^4x\;\partial_\mu K^\mu = \theta P$


(as in [33]). Thus, the Yang-Mills quantum action
extends [12] to


$I_{YM}^{\text{quantum}} = \int d^4x\;\mathrm{tr}\left(\tfrac12F^{\mu\nu}F_{\mu\nu} + \dfrac{\theta}{16\pi^2}\,{}^*F^{\mu\nu}F_{\mu\nu}\right)$   [39]


The additional Chern-Pontryagin term in [39]
does not contribute to equations of motion, but it is
needed to render all physical states invariant against
all gauge transformations, large and small. With this
transformation, one sees that the $\theta$-angle is a
Lorentz-invariant, but CP-noninvariant, effect.
Evidently, specifying a classical gauge theory
requires fixing a group; a quantized gauge theory is
specified by a group and a $\theta$-angle, which arises
from topological properties of the gauge theory. The
energy eigenvalues depend on $\theta$, and distinct $\theta$'s
correspond to distinct theories.

Note that the reasoning leading to [24] and [39] 
relies on exact quantum-mechanical arguments, 
while the instanton-based tunneling discussion is 
semiclassical. 


Adding Fermions 


When fermions couple to the gauge fields, the
previously described topological effects are modified
by action of the chiral anomaly. Dirac fields, either
noninteracting but quantized, or unquantized but
interacting with a gauge potential through a
covariantly conserved current $J_a^\mu$, $\mathcal L_I = -J_a^\mu A_\mu^a$, also
possess a chiral current $j_5^\mu = \bar\psi\gamma^\mu\gamma_5\psi$, which satisfies


$\partial_\mu j_5^\mu = 2im\,\bar\psi\gamma_5\psi$   [40]


Here m is the mass, if any, of the fermions. $j_5^\mu$ is
conserved for massless fermions, which therefore
enjoy a chiral symmetry: $\psi \to e^{i\alpha\gamma_5}\psi$. However,
when the interacting fermions are quantized, there
arises a correction to [40]; this is the chiral anomaly:


$\partial_\mu\langle j_5^\mu\rangle_A = 2im\,\langle\bar\psi\gamma_5\psi\rangle_A + C\,\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}$   [41]


C is determined by the fermion quantum numbers
and coupling strengths. (For a single charged (e)
fermion and a U(1) gauge potential, $C = e^2/8\pi^2$.)
$\langle\;\rangle_A$ signifies the fermionic vacuum matrix element
in the presence of $A_\mu$. The modified equation [41]
indicates that even in the massless limit chiral
symmetry remains broken due to the anomaly,
which arises with quantized fermions.
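$\langle j_5^\mu\rangle_A$ may also be presented as


$\langle j_5^\mu\rangle_A = \mathrm{tr}\,\gamma_5\gamma^\mu\,\langle\psi\bar\psi\rangle_A$   [42]


In Euclidean space, $\langle\psi\bar\psi\rangle_A$ is the coincident-point
limit of the resolvent $R(x, y; \mu)$ for the Dirac
equation,


$R(x, y; \mu) = \sum_\epsilon\dfrac{\psi_\epsilon(x)\,\psi_\epsilon^\dagger(y)}{\epsilon + i\mu}$   [43]


Here $\psi_\epsilon$ is an eigenfunction of the massless,
Euclidean Dirac operator in the presence of the
gauge field $A_\mu$,


$i\gamma^\mu(\partial_\mu + A_\mu)\,\psi_\epsilon = \epsilon\,\psi_\epsilon$   [44]


The coincident-point limit is singular, so R must be
regulated: $R \to R - R_{\text{Reg}}$ (we do not specify the
regularization procedure). It then follows that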


Ah 
a 5 UI! (x)¥5 vex) 
0, (#-) = 2iu ) ee LL tres. R 
} (Js) H qum YS ^Y Cu I Reg 


€ 


A yi (x) Ys Ve (x) *Ljiv a pa 
-Hap m 7 C'F E v [45] 


€ 


The first term on the right-hand side is the (Euclidean
space) analog of the mass term in [40] or [41], while
the second survives even after the regulators are
removed, giving the anomaly $\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}$.

The anomaly formula [41], or more explicitly
[45], is also the local form of the Atiyah-Singer
index theorem, which follows after [45] is integrated
over all space: The left-hand side integrates to zero.
The integral of the first term on the right-hand side,
$\int d^4x\,\psi_\epsilon^\dagger\gamma_5\psi_\epsilon$, vanishes for $\epsilon \neq 0$ by orthogonality,
because $\gamma_5\psi_\epsilon$ is an eigenfunction of [44] with
eigenvalue $-\epsilon$. Only zero modes contribute to the $\epsilon$
sum, since these can be chosen to be eigenfunctions
of $\gamma_5$, $n_\pm$ of them satisfying $\gamma_5\psi_0 = \pm\psi_0$. For a single
multiplet, the normalizations work out so that


$n_+ - n_- = -\dfrac{1}{16\pi^2}\int d^4x\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}$   [46]


The result that the (signed) number of zero modes is the
Chern-Pontryagin index is an instance of the Atiyah-Singer
theorem. (In specific applications, one can
frequently show that $n_+$ or $n_-$ vanishes.) It therefore
follows that in the background field of instantons, the
Euclidean Dirac equation possesses zero modes.

Another viewpoint on the chiral anomaly arises
within the functional integral formulation, where the
exponentiated action is constructed from unquantized
fields, over which the functional integration is
performed. Here the classical action retains chiral
symmetry $\psi \to e^{i\alpha\gamma_5}\psi$, but the Grassmann fermion
measure $d\psi\,d\bar\psi$, once it is properly regularized, loses
chiral invariance and acquires the anomaly,


$d\psi\,d\bar\psi \to d\psi\,d\bar\psi\;\exp\left(iC\int d^4x\;\alpha\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}\right)$   [47]


Evidently, the chiral anomaly involves the gauge-theoretic
topological entity, the Chern-Pontryagin
density. Not unexpectedly, the anomaly phenomenon
significantly affects the topological properties
of the gauge theory that are connected to P and
were described previously.

When there is (at least) one massless fermion
coupling to the Yang-Mills fields, the Yang-Mills
$\theta$-angle loses physical relevance. This is because a
chiral transformation that redefines the massless
Dirac field does not modify the classical action, but
owing to the chiral noninvariance of the functional
measure, [47], an anomaly term is induced in the
(effective) quantum action. The strength of this
induced term can be fixed so that it cancels the
$\theta$-term in [39]. Since field redefinition cannot affect
physics, the elimination of the $\theta$-term indicates that it
had no physical relevance in the first place. In
particular, energy eigenvalues no longer depend on $\theta$.
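
The cancellation can be made explicit. A sketch, assuming a constant chiral rotation angle $\alpha$, the measure factor in the form $\exp(iC\int d^4x\,\alpha\,\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu})$, and the $\theta$-term of the action in the form $(\theta/16\pi^2)\int d^4x\,\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}$; the symbol $\theta_{\text{eff}}$ is introduced here for the combined coefficient:

```latex
\begin{align*}
\exp\left[\,i\,\frac{\theta}{16\pi^2}\int d^4x\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}\right]
\;&\longrightarrow\;
\exp\left[\,i\left(\frac{\theta}{16\pi^2} + C\alpha\right)
  \int d^4x\;\mathrm{tr}\,{}^*F^{\mu\nu}F_{\mu\nu}\right], \\
\theta_{\text{eff}} \equiv \theta + 16\pi^2C\,\alpha = 0
\quad&\text{for}\quad
\alpha = -\frac{\theta}{16\pi^2C} .
\end{align*}
```

When the fermion has a mass, the same rotation changes the phase of the mass term, so $\theta$ is moved rather than removed; this is why $\theta$ remains physical when every fermion is massive.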

An alternate argument for the same conclusion is
based on the functional determinant that arises
when the functional integral is performed over the
massless Dirac field: $\det[\gamma^\mu(\partial_\mu + A_\mu)]$. The semiclassical
tunneling analysis of the $\theta$-angle is based on
instantons, but in the presence of instantons the
Dirac equation has a zero mode [46]. Consequently,
the determinant vanishes, tunneling is suppressed,
and so is the $\theta$-angle.

However, in the standard model for particle
physics, there are no massless fermions, so the
presence of the $\theta$-angle entails the following physical
consequences. The tunneling amplitude $\Gamma$ in leading
semiclassical approximation is determined by the
Euclidean action, namely the continuation of $iI_{YM}^{\text{quantum}}$ in
[39] to imaginary time. This results in the same
expression except that the topological $\theta$-term
acquires a factor of i. Only the 1-instanton and
anti-instanton give the dominant contribution,


$\Gamma \propto \cos\theta\;e^{-8\pi^2/g^2}$   [48]


where the coupling constant g has been reinserted;
the proportionality constant has not been computed,
owing to infrared divergences. (Higher-instanton-number
configurations contribute at an exponentially
subdominant order and have thus far played
no role in physics.) The tunneling leads to baryon
decay, but fortunately at an exponentially small
rate. More useful is the fact that instanton tunneling
gives semiclassical evidence for the removal of an
unwanted chiral U(1) Goldstone symmetry, which
would be present in the standard model if the chiral
anomaly did not interfere. Furthermore, the chiral
anomaly facilitates the decay of the neutral pion to
two photons, a process forbidden by other apparent
chiral symmetries of the standard model, which in
fact are modified by the chiral anomaly. Gauge
fields in four dimensions must interact with anomaly-free
currents. This necessitates a precise adjustment
of fermion content and charges so that the anomaly
coefficients (analogs of C in [41]) vanish for
currents coupled to gauge fields. Finally, $\theta \neq 0$
provides a tantalizing source of CP violation in the
strong-interaction sector of the standard model. But
no experimental signal (e.g., neutron electric dipole
moment) for this effect has been seen. At present, we
do not know what mechanism is responsible for
keeping $\theta$ vanishingly small.
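
To get a feeling for how small the exponential in [48] is, the factor $e^{-8\pi^2/g^2}$ can be tabulated. The sketch below is illustrative only: the function names are introduced here, the $\cos\theta$ factor and the uncomputed prefactor are omitted, and the sample values of $\alpha = g^2/4\pi$ are assumptions chosen to span weak to strong coupling rather than measured inputs.

```python
import math

def instanton_suppression(g_squared):
    """Semiclassical 1-instanton factor exp(-8 pi^2 / g^2); prefactor and cos(theta) omitted."""
    if g_squared <= 0:
        raise ValueError("g_squared must be positive")
    return math.exp(-8.0 * math.pi ** 2 / g_squared)

def log10_suppression(g_squared):
    """Base-10 logarithm of the same factor, usable even if the factor underflows."""
    if g_squared <= 0:
        raise ValueError("g_squared must be positive")
    return -8.0 * math.pi ** 2 / (g_squared * math.log(10.0))

# Illustrative couplings alpha = g^2 / (4 pi); the numerical values are assumptions
for label, alpha in [("alpha = 1/30", 1.0 / 30.0), ("alpha = 0.3", 0.3), ("alpha = 1", 1.0)]:
    g2 = 4.0 * math.pi * alpha
    print(f"{label}: log10(suppression) = {log10_suppression(g2):.1f}")
```

In terms of $\alpha$ the exponent is $-2\pi/\alpha$, so the factor is nonanalytic at $\alpha = 0$ and invisible at any finite order of perturbation theory; for weak coupling it is astronomically small, consistent with the remark that tunneling-induced decay proceeds at an exponentially small rate.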

These are the physical consequences of topological
effects in four-dimensional gauge theories.
Although they have provided experimentalists with
only a few numbers to measure (e.g., $\pi^0 \to 2\gamma$ decay
amplitude, prediction of anomaly-free arrangements
of quarks and leptons in families), they have added
enormously to our appreciation of the complexities
of quantized gauge theories.

That chiral anomalies are an obstruction to
consistent gauge interactions can be established
within perturbation theory. A similar, but nonperturbative,
effect is seen in an SU(2) gauge theory with
N Weyl fermion ($\gamma_5\psi = \pm\psi$) SU(2) doublets, which
lead upon functional integration to
$\det[\gamma^\mu(\partial_\mu + A_\mu)]^{N/2}$. But because $\Pi_4(SU(2)) = Z_2$, there exists a
single homotopy class of gauge transformations
which are not deformable to the identity. One
shows that the determinant changes sign when
such a gauge transformation is performed. Thus,
the theory is ill-defined for odd N. Consistent SU(2)
gauge theories must possess an even number of Weyl
fermion doublets, but such models have not found a
place in physical theory.


Adding Bosons 


Instantons are finite-action solutions to classical 
equations continued to imaginary time; they provide 
a semiclassical description of quantum-mechanical 
tunneling. A field theory may also possess finite- 
energy, time-independent (static) solutions to the 
real-time equations of motion. When these solutions 
are stable for topological reasons, they are called 
“solitons.” Solitons give semiclassical evidence for 
the existence in the quantum field theory of a 
particle sector disjoint from the particles obtained 
by quantizing field fluctuations around the vacuum 
state. The soliton particles are heavy for weak 
coupling g. (Their energy is $O(1/g^2)$; the field
profiles are O(1/g).) They do not decay owing to 
the conservation of “charges” that do not arise from 
Noether's theorem but are topological. 

Yang-Mills theory does not possess soliton solutions
(except in five-dimensional spacetime, where
the static solitons are just the four-dimensional
instantons discussed previously). However, when a
gauge theory based on a simple group is coupled to
a scalar field that undergoes symmetry breaking to
U(1), soliton solutions exist. These are the 't Hooft-Polyakov
magnetic monopoles, found in an SU(2)
gauge theory with scalar fields in the adjoint
representation, as well as various generalizations.
The topological consideration that arises here concerns
finite energy of the static scalar field multiplet
$\varphi$, which in the Weyl gauge is


$E(\varphi) = \int d^3r\,\left(\tfrac12\,\mathbf D\varphi\cdot\mathbf D\varphi + V(\varphi)\right)$   [49]


V is non-negative and possesses nontrivial symmetry-breaking
zeroes. On the sphere $S^2$ at spatial infinity,
$\varphi$ must tend to such a zero. Thus, the fields belong
to G/H, where G is the gauge group and H the
unbroken subgroup. For the 't Hooft-Polyakov
monopole these are SU(2) and U(1), respectively,
and the scalar field provides a mapping of the sphere
at infinity $S^2$ to $S^2 \approx SU(2)/U(1)$.

One now considers $\Pi_2(S^2) = \Pi_2(SU(2)/U(1)) = \Pi_1(U(1)) = Z$,
and one shows that the magnetic
flux is determined by the winding number. Hence,
the magnetic charge is quantized. Explicitly, the
electromagnetic U(1) gauge field is given by


$f_{\mu\nu} = \hat\varphi^aF^a_{\mu\nu} - \epsilon_{abc}\,\hat\varphi^a\,(D_\mu\hat\varphi)^b\,(D_\nu\hat\varphi)^c = \partial_\mu a_\nu - \partial_\nu a_\mu$
$a_\mu = \hat\varphi^aA^a_\mu + \cos\alpha\;\partial_\mu\beta$   [50]


where $\hat\varphi^a$ is the unit isovector, parametrized as
$\hat\varphi^a = (\sin\alpha\cos\beta, \sin\alpha\sin\beta, \cos\alpha)$. The manifestly
conserved magnetic current


$j_m^\mu = \partial_\nu\,{}^*f^{\nu\mu}$   [51a]

is rearranged to read

$j_m^\mu = \tfrac12\,\epsilon^{\mu\alpha\beta\gamma}\,\epsilon_{abc}\,\partial_\alpha\hat\varphi^a\,\partial_\beta\hat\varphi^b\,\partial_\gamma\hat\varphi^c$   [51b]


and is nonvanishing because $\varphi^a$ possesses zeroes,
where $\hat\varphi^a$ acquires localized singularities. The
magnetic charge


$m = \dfrac{1}{4\pi}\int d^3x\;j_m^0 = \dfrac{1}{4\pi}\int d^3r\;\nabla\cdot\mathbf b$   [52]


($b^i$ = U(1) magnetic field: $b^i = -\tfrac12\,\epsilon^{ijk}f_{jk} = {}^*f^{i0}$) is given by
the topological entity (Kronecker index of the
mapping)


$m = \dfrac{1}{8\pi}\int dS^i\;\epsilon^{ijk}\,\epsilon_{abc}\,\hat\varphi^a\,\partial_j\hat\varphi^b\,\partial_k\hat\varphi^c$
$\quad = -\dfrac{1}{4\pi}\int dS^i\;\epsilon^{ijk}\,\partial_j\cos\alpha\;\partial_k\beta$   [53]


which readily evaluates the integer winding number.
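
The Kronecker index can be evaluated numerically for a sample map. The sketch below assumes the normalization $(1/8\pi)\int dS^i\,\epsilon^{ijk}\epsilon_{abc}\,\hat\varphi^a\partial_j\hat\varphi^b\partial_k\hat\varphi^c$, which on a sphere with polar angles $(\vartheta, \phi)$ reduces to $(1/4\pi)\int d\vartheta\,d\phi\;\hat\varphi\cdot(\partial_\vartheta\hat\varphi\times\partial_\phi\hat\varphi)$. The test configuration $\alpha = \vartheta$, $\beta = n\phi$ and all function names are illustrative choices made here.

```python
import math

def phi_hat(theta, varphi, n):
    """Unit isovector with alpha = theta and beta = n * varphi (illustrative choice)."""
    return (math.sin(theta) * math.cos(n * varphi),
            math.sin(theta) * math.sin(n * varphi),
            math.cos(theta))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def kronecker_index(n, steps=200, h=1e-6):
    """(1/4 pi) * integral of phi_hat . (d_theta phi_hat x d_varphi phi_hat) over the sphere."""
    d_theta = math.pi / steps
    d_varphi = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta            # midpoint rule avoids the poles
        for j in range(steps):
            varphi = (j + 0.5) * d_varphi
            p = phi_hat(theta, varphi, n)
            # central finite differences for the two tangential derivatives
            dp_dtheta = tuple((a - b) / (2.0 * h) for a, b in
                              zip(phi_hat(theta + h, varphi, n), phi_hat(theta - h, varphi, n)))
            dp_dvarphi = tuple((a - b) / (2.0 * h) for a, b in
                               zip(phi_hat(theta, varphi + h, n), phi_hat(theta, varphi - h, n)))
            total += dot(p, cross(dp_dtheta, dp_dvarphi)) * d_theta * d_varphi
    return total / (4.0 * math.pi)

for n in (1, 2, -1):
    print(n, round(kronecker_index(n), 3))
```

For this family of maps the integrand is $n\sin\vartheta$, so the index equals n up to discretization error; the sign flips for an orientation-reversing map, and the value does not change under smooth deformations of $\hat\varphi$.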

The theory also supports charged magnetic monopole
solutions called “dyons.” Here the profiles
involve time-periodic gauge potentials, where the
time variation is just a gauge transformation,
$\partial_t\mathbf A = \mathbf D\lambda$. (Gauge-equivalent, static expressions
have slow large-distance fall-off, which is removed
by the time-dependent gauge function.) For dyons,
the integer-valued Chern-Pontryagin index, with the
integration taken over all space and in time over the
dyon period, reproduces the magnetic monopole
strength.

Regrettably, these fascinating structures are not
found in nature. Nor do they arise in the standard
model, whose structure group is not simple,
although speculative grand unified models, with
simple G and $H = SU(3)\times U(1)$, would support
magnetic monopoles and dyons. While challenged
physically, the magnetic monopole phenomena have
produced extensive and interesting mathematical
analysis.


Gauge Theories in Two Dimensions 


Two-dimensional gauge theories have only a few 
physical applications; edge states of the planar 
quantum Hall effect can be described by excitations 
moving on a line. However, the abelian model with 
fermions is useful in that it provides a very accurate 


reflection of topological behavior in the physically 
important four-dimensional theory. 


Abelian Gauge Theory 


Take the spatial interval to be [-L, L]. Homotopically
nontrivial gauge transformations satisfy
$\lambda(L) - \lambda(-L) = 2\pi n$ ($\Pi_1(U(1)) = Z$). States $\Psi(A)$ of the free
gauge theory that satisfy Gauss' law and respond
with a $\theta$-angle are


$\Psi(A) = \exp\left(\dfrac{i\theta}{2\pi}\int dx\,A(x)\right), \qquad \Psi(A + \partial_x\lambda) = e^{in\theta}\,\Psi(A)$   [54]


In this model, $\theta$ has the interpretation of a constant
background electric field $\mathcal E = -\theta/2\pi$,


$E\,\Psi(A) = \mathcal E\,\Psi(A), \qquad E = i\,\dfrac{\delta}{\delta A}$   [55]

This also gives the energy eigenvalue:

$\tfrac12\int dx\,E^2\;\Psi(A) = \tfrac12\int dx\,\mathcal E^2\;\Psi(A)$   [56]


The phase may be removed by adding to the
Lagrangian $-(\theta/2\pi)\int dx\,\partial_tA$; equivalently, the
action becomes


$I = \int d^2x\,\left(-\tfrac14F^{\mu\nu}F_{\mu\nu} + \dfrac{\theta}{4\pi}\,\epsilon^{\mu\nu}F_{\mu\nu}\right)$   [57a]


which apart from a constant is also given by a
formula with the background field:


$I = \tfrac12\int d^2x\,(E - \mathcal E)^2$   [57b]

Because of gauge invariance, there is only one state,
annihilated by E and carrying energy $\tfrac12\int dx\,\mathcal E^2$.
Distinct $\theta$ (different $\mathcal E$) correspond to distinct
theories.

We recognize in [57a] the two-dimensional
Chern-Pontryagin density, contributing a total
derivative to the action,


$P = \dfrac{1}{4\pi}\int d^2x\;\epsilon^{\mu\nu}F_{\mu\nu}$   [58]

the Chern-Simons current, whose divergence is the density of P,

$K^\mu = \dfrac{1}{2\pi}\,\epsilon^{\mu\nu}A_\nu$   [59]


and the Chern-Simons term, which carries the phase
of $\Psi$,

$\dfrac{\theta}{2\pi}\int dx\,A$   [60]

For Euclidean-space gauge potentials, which are
given at large distance by the pure gauge
$\partial_\mu\,n\tan^{-1}(y/x)$, P = n. All this is just as in the four-dimensional
theory, except there are no instantons
and no tunneling.
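
The passage from [57a] to [57b] is a completion of the square. A sketch, assuming the conventions $\epsilon^{01} = +1$ and $E \equiv F_{01}$ (these sign choices are not spelled out in the text) together with $\mathcal E = -\theta/2\pi$:

```latex
\begin{align*}
-\tfrac14F^{\mu\nu}F_{\mu\nu} &= \tfrac12E^2,
\qquad
\frac{\theta}{4\pi}\,\epsilon^{\mu\nu}F_{\mu\nu} = \frac{\theta}{2\pi}\,E = -\mathcal E\,E, \\
I &= \int d^2x\,\left(\tfrac12E^2 - \mathcal E\,E\right)
   = \tfrac12\int d^2x\,(E - \mathcal E)^2 \;-\; \tfrac12\int d^2x\;\mathcal E^2 .
\end{align*}
```

The last term is the field-independent constant referred to in the text; per unit time it equals minus the energy $\tfrac12\int dx\,\mathcal E^2$ carried by the single physical state.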


Adding Fermions 


The addition of massless fermions to the U(1) gauge
theory results in the Schwinger model of massless
quantum electrodynamics in two-dimensional spacetime.
The equation of motion becomes


$\partial_\mu F^{\mu\nu} = J^\nu$   [61]


with the vector current constructed from the Dirac
fields as $J^\mu = \bar\psi\gamma^\mu\psi$. This current remains conserved
in the quantized version because it couples to the
gauge field. But the axial vector current $j_5^\mu = \bar\psi\gamma^\mu\gamma_5\psi$
acquires an anomaly that involves the Chern-Pontryagin
density in [58],


$\partial_\mu j_5^\mu = -\dfrac{1}{2\pi}\,\epsilon^{\mu\nu}F_{\mu\nu}$   [62]

The model is readily solved, and shows no $\theta$-angle
(background field) dependence in physical quantities.
The solution is directly obtained by combining
[61] with [62] into a second-order differential
equation and using the matrix identity of two-dimensional
Dirac (= Pauli) matrices: $\epsilon^{\mu\nu}\gamma_\nu\gamma_5 = \gamma^\mu$.

It follows that


$\left(\Box + \dfrac{1}{\pi}\right)E = 0$   [63]


So the theory describes a free massive photon (mass
squared = $1/\pi$ in units of $\hbar$ and the coupling
constant, which have been scaled to unity), with no
sign of a $\theta$-angle (background field).
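
The combination of [61] and [62] into [63] takes only a few lines. The sketch below assumes the conventions $\epsilon^{01} = +1$ and $E \equiv F_{01}$, so that $F^{\mu\nu} = -\epsilon^{\mu\nu}E$, the matrix identity in the form $\epsilon^{\mu\nu}\gamma_\nu\gamma_5 = \gamma^\mu$, and the anomaly in the form $\partial_\mu j_5^\mu = -(1/2\pi)\,\epsilon^{\mu\nu}F_{\mu\nu}$; with other sign conventions individual steps change sign, but the mass term in the final equation does not.

```latex
\begin{align*}
J^\mu &= \bar\psi\gamma^\mu\psi
       = \epsilon^{\mu\nu}\,\bar\psi\gamma_\nu\gamma_5\psi
       = \epsilon^{\mu\nu}j_{5\nu}, \\
\partial_\mu F^{\mu\nu} = J^\nu
  &\;\Longrightarrow\; \epsilon^{\nu\mu}\,\partial_\mu E = \epsilon^{\nu\mu}j_{5\mu}
  \;\Longrightarrow\; \partial_\mu E = j_{5\mu}, \\
\Box E = \partial^\mu j_{5\mu}
  &= -\frac{1}{2\pi}\,\epsilon^{\mu\nu}F_{\mu\nu}
   = -\frac{1}{\pi}\,E
  \;\Longrightarrow\; \left(\Box + \frac{1}{\pi}\right)E = 0 .
\end{align*}
```

A constant nonzero E does not solve the final equation, in line with the absence of a background field in the massless model.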

However, in parallel with four-dimensional behavior,
the model with massive fermions regains a $\theta$
dependence in the particles' energy spectrum, a
result that is established perturbatively, because a
complete solution is not available.

Note that in the Schwinger model, the gauge 
particle (“photon”) acquires a mass, even though 
local gauge invariance is preserved. This happens 
essentially for topological/anomaly reasons. Such 
topological mass generation is met again in three 
dimensions. 


Adding Bosons 


Scalar electrodynamics with a negative mass-squared
term in (3 + 1)-dimensional spacetime leads to the
Higgs mechanism and short-range interactions due
to the massive photons. In (1 + 1) spacetime dimensions,
the model possesses instantons, that is, scalar and
gauge field profiles that solve the imaginary-time
equations of motion, labeled by $\Pi_1(U(1)) = Z$.
These disorder the Higgs condensate so that the
force between charged particles remains long-range,
as in the positive mass-squared case. This is a vivid
example of how excitations arising from nontrivial
topological issues significantly affect physical
content.


Gauge Theories in Three Dimensions 


Gauge theories on three-dimensional spacetime, that 
is, evolving on a plane, have physical application to 
planar phenomena, like the quantum Hall effect. 
Also, the high-temperature limit of four-dimensional 
field theories is governed by the corresponding field 
theory in three Euclidean dimensions. 

In three (more generally, odd) dimensions, there
are no Chern-Pontryagin quantities, no Chern-Simons
currents, no axial vector currents or anomalies
(there is no $\gamma_5$ matrix). These are replaced by
odd-dimensional entities that can modify Yang-Mills
dynamics.


Yang-Mills and Other Gauge Theories 


Using the three-index Levi-Civita tensor, one can
construct a gauge-covariant, covariantly conserved
vector, which can be added to the Yang-Mills
equation. Thus, [14] can be modified to


$D_\mu F^{\mu\nu} + \dfrac{m}{2}\,\epsilon^{\nu\alpha\beta}F_{\alpha\beta} = J^\nu$   [64a]


or, equivalently, in terms of the dual field strength
${}^*F^\mu \equiv \tfrac12\,\epsilon^{\mu\alpha\beta}F_{\alpha\beta}$,


$\epsilon^{\mu\nu\alpha}D_\mu\,{}^*F_\alpha + m\,{}^*F^\nu = J^\nu$   [64b]


For dimensional balance, m carries dimension of
mass. Indeed, in the source-free case [64] implies


$(D^\alpha D_\alpha + m^2)\,{}^*F^\mu = \epsilon^{\mu\alpha\beta}\,[{}^*F_\alpha, {}^*F_\beta]$   [65]


This shows that excitations are massive, even
though local gauge invariance is preserved. Otherwise,
as in the Dirac monopole case, the equations
of motion are unexceptional.
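
The massive wave equation [65] follows from the first-order equation by applying the curl a second time. A sketch, writing $G^\mu \equiv {}^*F^\mu$ (a shorthand introduced here) and assuming the source-free equation in the form $\epsilon^{\nu\mu\alpha}D_\mu G_\alpha = m\,G^\nu$, the inverse relation $F_{\alpha\beta} = \epsilon_{\alpha\beta\gamma}G^\gamma$, the Bianchi identity in the form $D_\alpha G^\alpha = 0$, and the commutator rule [9b]:

```latex
\begin{align*}
m^2\,G^\rho = m\,\epsilon^{\rho\sigma\nu}D_\sigma G_\nu
  &= \epsilon^{\rho\sigma\nu}\epsilon_{\nu\mu\alpha}\,D_\sigma D^\mu G^\alpha
   = D_\alpha D^\rho G^\alpha - D_\mu D^\mu G^\rho , \\
D_\alpha D^\rho G^\alpha
  &= [D_\alpha, D^\rho]\,G^\alpha + D^\rho D_\alpha G^\alpha
   = [F_\alpha{}^{\rho}, G^\alpha]
   = \epsilon^{\rho\beta\alpha}\,[G_\beta, G_\alpha] , \\
\Longrightarrow\quad
(D^\mu D_\mu + m^2)\,G^\rho &= \epsilon^{\rho\alpha\beta}\,[G_\alpha, G_\beta] .
\end{align*}
```

In the abelian theory the commutator on the right-hand side vanishes and the dual field strength obeys a free Klein-Gordon equation with mass m.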

However, for the quantum theory we need the
action, whose variation produces the mass term in
[64]. This is just the Chern-Simons term W(A) in
[37], multiplied by $-8\pi^2m$ and now defined on
(2 + 1)-dimensional spacetime:


$I_{CS} = 2m\int d^3x\;\epsilon^{\alpha\beta\gamma}\,\mathrm{tr}\,(\tfrac12A_\alpha\partial_\beta A_\gamma + \tfrac13A_\alpha A_\beta A_\gamma)$   [66]


Everything holds also in the abelian theory; the last
term in [66] is then absent.
In this model, the mass is generated by a
topological mechanism, since $I_{CS}$ possesses the usual
attributes of a topological entity: it is diffeomorphism
invariant without a metric tensor; when
the potentials are appropriately parametrized, it is
given by a surface term. (In the abelian case,
the appropriate parametrization is in terms of the
Clebsch decomposition, $A_\mu = \partial_\mu\theta + \alpha\,\partial_\mu\beta$.) Most
importantly, in the nonabelian theory [66] changes
by $8\pi^2mn$ under three-dimensional gauge transformations
carrying winding number n. Hence, for
consistency of the nonabelian quantum theory, m
must be quantized in integer multiples of $1/4\pi$ (in units of $\hbar$ and the
coupling constant, which have been scaled to unity).
All this is a clear field-theoretic analog to the
quantum mechanics of the Dirac monopole, and
just as for the magnetic monopole, a Hamiltonian
argument for quantizing m can be constructed, as an
alternative to the above action-based derivation.

The time component of [64] relates the electric 
and magnetic fields to the charge density: 


$$\mathbf{D}\cdot\mathbf{E} - mB = \rho \qquad [67]$$


In the abelian case, the first term involves a total 
derivative and its spatial integral vanishes, leaving a 
formula that identifies magnetic flux with a total 
charge. At low energy, the mass term dominates the 
conventional kinetic term in [64], and the flux-charge 
relation becomes a local field-current 
identity, 

$$m\, {}^*F^\nu \approx J^\nu \qquad [68]$$


These formulas have made Chern-Simons-modified 
gauge theories relevant to issues in condensed matter 
physics, for example, the quantum Hall effect. In the 
abelian case, m need not be quantized. 


Adding Fermions 


Three-dimensional Dirac matrices are minimally realized 
by 2 x 2 Pauli matrices. As a consequence, a mass 
term is not parity invariant; also, there is no $\gamma_5$ matrix, 
since the product of the three Dirac (= Pauli) matrices 
is proportional to I. While there are no chiral 
anomalies, there is the so-called parity anomaly: 
integrating a single doublet of massless SU(2) fermions 
one obtains $\Delta(A) = \det[\gamma^\mu (i\partial_\mu + A_\mu)]$, which should 
preserve parity and gauge invariance. 

Since there are no anomalies in current divergences, 
$\Delta(A)$ is certainly invariant against infinitesimal gauge 
transformations. But for finite gauge transformations 
(categorized by $\Pi_3(SU(2)) = \mathbb{Z}$) one finds that $\Delta(A)$ is 
not invariant: when the gauge transformation belongs 
to an odd-numbered homotopy class, $\Delta(A)$ changes 
sign. To regain gauge invariance, one must either work 
with an even number of fermion doublets or, if only 
one doublet (more generally, an odd number) is to be 
used, one must add to the gauge Lagrangian a parity-violating 
Chern-Simons term with half the correctly 
quantized coefficient, to neutralize the gauge 
noninvariance of $\Delta(A)$. 

Alternatively, $\Delta(A)$ can be regularized in a 
gauge-invariant manner. But this requires massive 
Pauli-Villars regulator fields, which produce a parity-violating 
expression for $\Delta(A)$. One cannot avoid the 
parity anomaly. 


Adding Bosons 


There are a variety of bosonic field models that one 
may consider: Abelian or nonabelian; with conven- 
tional kinetic term or supplemented by the Chern- 
Simons topological mass; or, for low energy, no kinetic 
term but only the Chern-Simons term, as in [68]. 
Abelian charged Bose fields in a Maxwell theory lead 
to vortex solitons, based on $\Pi_1(U(1)) = \mathbb{Z}$. These are 
just the instantons of the (1 + 1)-dimensional bosonic 
gauge theory discussed previously. With Maxwell 
kinematics there are no charged vortices, but these 
appear when the Chern-Simons mass is added; see [67]. 
Pure Chern-Simons kinematics, with no Maxwell 
term, can produce completely integrable soliton 
equations (Liouville, Toda) when the Bose field 
dynamics is appropriately chosen. 


Conclusion 


Topological effects in field theory are associated with 
the infinities and regularization that beset quantum 
field theories. These give rise to the chiral anomaly, 
parity anomaly (and scale symmetry anomalies, not 
discussed here). Yet the anomalies themselves are finite 
quantities that have topological significance (Atiyah- 
Singer, Chern-Pontryagin, Chern-Simons). This para- 
doxical pairing has not been understood. Nor can we 
explain why the anomalies interfere in a topological 
manner with symmetries associated with masslessness. 
Although the range of topological effects in gauge 
theory is large, and even larger in non-gauge theories 
(sigma models, Skyrme models), the relevance to actual 
fundamental physics is confined to the $\theta$-angle 
phenomenon, which is analyzed accurately and abstractly 
by reference to $\Pi_3(G)$ and to the interplay with 
fermions through the chiral anomaly. Instantons are 
relevant only to an approximate, semiclassical discussion. 
Although general instanton configurations are well 
understood after much mathematical work, only the 
1-instanton solution enjoys physical significance. 
Other topological entities that fascinate are either 
nonexistent in fundamental physics or are relevant to 
condensed matter physics (vortices, Chern-Simons 
effects). But here too, we note that the fundamental 
equation of condensed matter physics, the 
many-body Schrödinger equation, carries no evident 
topological structure. Only the phenomenological 
equations, which replace the fundamental one, give 
rise to topological intricacies. 
Introduction 


Quantum mechanics was born at the beginning of the 
twentieth century with the quantization rules for the 
harmonic oscillator and for the hydrogen atom. Such 
rules were almost immediately extended to more 
general systems by the so-called Bohr-Sommerfeld 
quantization rule: “the actions of the classical system 
can assume only those values which are integer 
multiples of $\hbar$.” However, the actions are defined 
only in some special situations and, moreover, at the 
present time the Schrödinger equation is the paradigm 
of quantum mechanics. A question naturally arises: is 
there any relation between the eigenvalues of the 
Schrödinger operator and the numbers obtained by 
the Bohr-Sommerfeld quantization rule (when available)? 

According to common wisdom, the “Bohr-Sommerfeld 
numbers” are a first approximation to the 
eigenvalues of the Schrödinger operator in the so-called 
semiclassical limit. However, precise mathematical 
results on the subject were obtained only in the 1980s 
and a good understanding of the problem has been 
achieved only recently. In particular it is now clear how 
to compute higher-order corrections to the eigenvalues: 
this is done through suitable normal form procedures. 

In the present article we will discuss the above 
questions for the case of perturbed harmonic 
oscillators, a case which, on the one hand, is 
physically relevant and, on the other, is well under- 
stood. We will only briefly discuss the quantization 
of perturbations of integrable systems. 


A Statement 


On $L^2(\mathbb{R}^n)$, consider the Schrödinger operator 

$$H = -\frac{\hbar^2}{2}\Delta + V \qquad [1]$$

where $\Delta$ is the n-dimensional Laplacian and V is a 
smooth real potential having an absolute nondegenerate 
minimum at the origin. We are interested in 


the eigenvalues of [1] close to zero. Introduce 
coordinates adapted to the normal modes, namely 
such that 

$$V(x) = \sum_{j=1}^{n} \frac{\omega_j^2 x_j^2}{2} + O(|x|^3)$$


Assume 


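(H1) Nonresonance: There exist $\gamma > 0$ and $\tau \in \mathbb{R}$ 
such that, for any $k \in \mathbb{Z}^n - \{0\}$ one has 

$$|\omega\cdot k| \ge \frac{\gamma}{|k|^{\tau}} \qquad [2]$$

(H2) $V(x) > 0$ for $x \neq 0$, and 

$$\liminf_{|x|\to\infty} V(x) > 0$$

(H3) $V \in C^\infty(\mathbb{R}^n)$ and for any multi-index $\alpha$ there exists $C_\alpha$ 
such that 

$$\left|\frac{\partial^{|\alpha|} V}{\partial x^{\alpha}}(x)\right| \le C_\alpha \langle x\rangle^{m}, \qquad \forall \alpha \in \mathbb{N}^n$$

where we used the notation $\langle x\rangle := (1 + \|x\|^2)^{1/2}$. 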
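
The nonresonance condition (H1) can be probed numerically on a finite range of integer vectors. The sketch below is only illustrative: the function name `diophantine_margin`, the choice $|k| = \sum_j |k_j|$, and the sample frequencies are assumptions of this example, and a finite scan can refute (H1) but can never prove it.

```python
from itertools import product
from math import sqrt


def diophantine_margin(omega, gamma, tau, kmax):
    """Smallest value of |omega.k| * |k|**tau / gamma over 0 < max|k_j| <= kmax.

    Here |k| = sum_j |k_j| (an assumed choice of norm). A result >= 1 means
    |omega.k| >= gamma / |k|**tau held for every k that was scanned.
    """
    worst = float("inf")
    for k in product(range(-kmax, kmax + 1), repeat=len(omega)):
        norm = sum(abs(kj) for kj in k)
        if norm == 0:
            continue  # k = 0 is excluded in (H1)
        small_divisor = abs(sum(w * kj for w, kj in zip(omega, k)))
        worst = min(worst, small_divisor * norm ** tau / gamma)
    return worst


golden = (1.0, (1.0 + sqrt(5.0)) / 2.0)   # hypothetical nonresonant frequencies
resonant = (1.0, 2.0)                      # 1:2 resonance, k = (2, -1) gives omega.k = 0
```

For the resonant pair the margin is exactly zero, so no choice of $\gamma$ and $\tau$ can work; for the golden-mean pair the scanned values stay bounded away from zero.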


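Theorem 1 Assume that (H1)-(H3) hold. Then, for 
any positive N, M there exist positive constants 
$\hbar_{N,M}$, $\epsilon_{N,M}$, $C_{N,M}$, $C'_{N,M}$ and a smooth function 
$Z_{N,M}(I_1,\ldots,I_n;\hbar)$ 
such that, $\forall\, 0 < \epsilon < \epsilon_{N,M}$ and $0 < \hbar < \hbar_{N,M}\,\epsilon$, the 
eigenvalues of [1] in $[0, \epsilon)$ have the representation 

$$\lambda_k = (k + \tfrac{1}{2})\cdot\omega\,\hbar + Z_{N,M}((k + \tfrac{1}{2})\hbar; \hbar) + R_{N,M}(k, \hbar), \qquad k \in \mathbb{N}^n,\ k_j \ge 0 \qquad [3]$$

where 

$$|R_{N,M}(k, \hbar)| \le C_{N,M}\,\epsilon^{N} + C'_{N,M}\left(\frac{\hbar}{\epsilon}\right)^{M}$$

More precisely, for any $k \in \mathbb{N}^n$ such that 

$$(k + \tfrac{1}{2})\cdot\omega\,\hbar + Z_{N,M}((k + \tfrac{1}{2})\hbar; \hbar) \in [0, \epsilon) \qquad [4]$$

there exists an eigenvalue $\lambda_k \in [0, \epsilon)$ for which [3] 
holds, and vice versa, for any eigenvalue in $[0, \epsilon)$ 
there exists a k satisfying [3] and [4]. The function 
$Z_{N,M}(I_1,\ldots,I_n; 0)$ coincides with the classical 
Birkhoff normal form of the system computed up 
to order N. 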


The proof of the theorem is constructive, in the 
sense that it provides an algorithm for constructing 
explicitly, by elementary operations, the 
function $Z_{N,M}$. One could choose $\epsilon = \epsilon(\hbar) = \hbar^{\delta}$ with 
some positive $\delta < 1$, obtaining a simpler statement 
valid for the eigenvalues in $[0, \hbar^{\delta})$. It is also possible 
to weaken the nonresonance condition (H1) to the 
condition $\omega\cdot k \neq 0$ for $k \in \mathbb{Z}^n - \{0\}$. 
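
To see what the leading term of the quantization formula [3] looks like in practice, the following sketch lists the values $(k + \tfrac{1}{2})\cdot\omega\,\hbar$ that fall in $[0, \epsilon)$. It deliberately drops the normal-form correction $Z_{N,M}$ and the remainder, so it reproduces only the harmonic part; the function name `leading_levels` and the sample parameters are assumptions of this example.

```python
from itertools import product


def leading_levels(omega, hbar, eps):
    """Sorted list of (value, k) with value = hbar * (k + 1/2).omega < eps.

    Only the harmonic (leading) term of the quantization formula is used;
    the correction Z and the remainder R are ignored in this sketch.
    """
    zero_point = 0.5 * hbar * sum(omega)
    if zero_point >= eps:
        return []
    # Upper bound on each quantum number so the search box is finite.
    bounds = [int((eps - zero_point) / (hbar * w)) + 1 for w in omega]
    levels = []
    for k in product(*(range(b + 1) for b in bounds)):
        value = hbar * sum((kj + 0.5) * w for kj, w in zip(k, omega))
        if value < eps:
            levels.append((value, k))
    return sorted(levels)


levels = leading_levels(omega=(1.0, 2.0), hbar=0.1, eps=0.5)
```

With these sample values the admissible quantum numbers satisfy $k_1 + 2k_2 \le 3$, and the lowest entry is the zero-point energy with $k = (0, 0)$.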

A theorem very close to Theorem 1 was proved by 
Sjöstrand (1992) by a method different from the 
one that will be presented here (see also Graffi and 
Paul (1987)). In the analytic or Gevrey case (recall 
that a $C^\infty$ function f(x) is Gevrey in some domain if 
there exist constants C, c such that, for all 
multi-indexes $\alpha \in \mathbb{N}^n$ one has 

$$\left|\frac{\partial^{|\alpha|} f}{\partial x^{\alpha}}\right| \le C^{|\alpha|}\,(\alpha!)^{c}$$

in the whole domain), the error can be reduced to be 
exponentially small with the parameters (Bambusi 
et al. 1999). Previous results dealing with compact 
perturbations of the harmonic oscillator were 
obtained by Bellissard and Vittot (1990). It is 
possible to deal also with the resonant case in 
which (H1) is violated. In this case the spectrum of 
the complete system is qualitatively different from 
the spectrum of the harmonic one. As discussed 
later, the normal form allows one to compute the 
main qualitative differences. 


Birkhoff Normal Form 


In this section we recall the procedure leading to 
classical Birkhoff normal form, whose quantization 
leads to the proof of Theorem 5. 


Birkhoff's Theorem 


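The operator [1] is the quantization of the classical 
Hamiltonian 

$$H(\xi, x) = \sum_{j=1}^{n} \frac{\xi_j^2}{2} + V(x) \qquad [5]$$

Denote 

$$H_0(\xi, x) := \sum_{j=1}^{n} \omega_j I_j, \qquad I_j := \frac{\xi_j^2 + \omega_j^2 x_j^2}{2\omega_j} \qquad [6]$$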

then we have 


Theorem 2 For any positive integer N > 2 there 
exist a neighborhood $U_N$ of the origin and a 
canonical transformation $T_N: \mathbb{R}^{2n} \supset U_N \to \mathbb{R}^{2n}$ 
which puts the system [5] in Birkhoff normal form 
up to order N, namely such that 

$$H\circ T_N = H_0 + Z_N + R_N \qquad [7]$$

where $Z_N$ Poisson-commutes with $H_0$, namely 
$\{H_0; Z_N\} = 0$, and $R_N$ is small, that is, 

$$|R_N(\xi, x)| \le C_N \|(\xi, x)\|^{N+1} \qquad [8]$$

Moreover, if the frequencies are nonresonant, namely 

$$\omega\cdot k \neq 0, \qquad \forall k \in \mathbb{Z}^n - \{0\} \qquad [9]$$


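the function $Z_N$ depends on the actions $I_j$ only. We 
recall that the Poisson bracket of two functions f 
and g is defined by 

$$\{f; g\} := \sum_{j=1}^{n}\left(\frac{\partial f}{\partial \xi_j}\frac{\partial g}{\partial x_j} - \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial \xi_j}\right) = -\{g; f\}$$

and coincides with the Lie derivative of g with 
respect to the Hamiltonian vector field of f. 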


Remark 1 In the case where the frequencies fulfill 
(H1) and the potential V is analytic (or of Gevrey 
class) the remainder can be reduced to be exponentially 
small with $\|(\xi, x)\|$. 


Scheme of the Proof 


Make the rescaling $\xi = \epsilon\xi'$, $x = \epsilon x'$. In terms of the 
primed variables, the Hamiltonian of the system [5] 
takes the form 

$$H_\epsilon(\xi', x') = H_0(\xi', x') + \epsilon W(x') \qquad [10]$$

with 

$$W(x') := \frac{1}{\epsilon^3}\left[V(\epsilon x') - \epsilon^2\sum_{j=1}^{n}\frac{\omega_j^2 (x'_j)^2}{2}\right] = W_3(x') + \epsilon W_4(x') + \cdots \qquad [11]$$

and $W_l$ is the Taylor polynomial of order l of V. In 
what follows we will omit primes from the scaled 
variables. 

Given an auxiliary Hamiltonian $\chi_3$, denote by $\phi^\epsilon$ 
the flow of the corresponding Hamiltonian vector 
field. We construct $\chi_3$ so that $H_\epsilon\circ\phi^\epsilon$ is in normal 
form up to order $\epsilon^2$. 


Remark 2 Given a $C^\infty$ function g one has 
$g\circ\phi^\epsilon \sim \sum_{l\ge 0}\epsilon^l g_l$ with 

$$g_0 := g, \qquad g_l := \frac{1}{l}\{\chi_3; g_{l-1}\}, \quad l \ge 1 \qquad [12]$$

where ~ denotes the fact that the left-hand side is 
asymptotic to the right-hand side (a precise definition 
appears later in the article). If both g and $\chi_3$ are 
analytic then the series of $g\circ\phi^\epsilon$ can be shown to 
converge in a neighborhood of the origin. Using [12] 
to compute $H_\epsilon\circ\phi^\epsilon$, we get 

$$H_\epsilon\circ\phi^\epsilon = H_0 + \epsilon\,[W_3 + \{\chi_3; H_0\}] + O(\epsilon^2)$$

So $H_\epsilon\circ\phi^\epsilon$ is in normal form up to $O(\epsilon^2)$ provided 
$\chi_3$ fulfills the so-called homological equation: 

$$W_3 + \{\chi_3; H_0\} = Z_3 \qquad [13]$$

where the unknown function $Z_3$ has to be in normal 
form. Note that, since the operator 

$$\chi \mapsto \{\chi; H_0\}$$

maps linearly polynomials of degree l into polynomials 
of degree l, eqn [13] can be interpreted 
as a linear equation in the finite-dimensional space 
of polynomials of degree 3 in the phase-space 
variables. 


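Lemma 1 The homological equation [13] admits a 
solution $(\chi_3, Z_3)$. 


Proof Introduce the canonical coordinates $(\zeta, \eta)$ by 

$$\zeta_j = \frac{\xi_j + i\omega_j x_j}{\sqrt{2\omega_j}}, \qquad \eta_j = \frac{\xi_j - i\omega_j x_j}{i\sqrt{2\omega_j}} \qquad [14]$$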


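In these variables the unperturbed Hamiltonian $H_0$ 
reads $H_0 = \sum_{j=1}^{n} i\omega_j\zeta_j\eta_j$ and $W_3$ is transformed into a 
different polynomial, again of third order. 
The important fact is that in these coordinates the 
eigenvectors of the linear operator $\{H_0; \cdot\}$ are the 
monomials 

$$\zeta^k\eta^l := \zeta_1^{k_1}\cdots\zeta_n^{k_n}\,\eta_1^{l_1}\cdots\eta_n^{l_n}$$

Indeed, one has $\{H_0; \zeta^k\eta^l\} = i\omega\cdot(k - l)\,\zeta^k\eta^l$. As a 
consequence, writing 

$$W_3 = \sum_{k,l} W_{kl}\,\zeta^k\eta^l$$

one can define the resonant set 

$$R := \{(k, l) : \omega\cdot(k - l) = 0\}$$

and 

$$Z_3 := \sum_{(k,l)\in R} W_{kl}\,\zeta^k\eta^l, \qquad \chi_3 := \sum_{(k,l)\notin R} \frac{W_{kl}}{i\omega\cdot(k - l)}\,\zeta^k\eta^l \qquad [15]$$

Going back to the original variables, one has the 
solution of the homological equation. □ 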
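
The proof is effectively an algorithm acting on monomial coefficients. The sketch below assumes that a polynomial is stored as a dictionary mapping exponent pairs `(k, l)` of $\zeta^k\eta^l$ to coefficients, and it uses only the eigenvalue relation $\{H_0; \zeta^k\eta^l\} = i\omega\cdot(k-l)\,\zeta^k\eta^l$ quoted in the proof; the function names, the tolerance `tol` and the sample polynomial are assumptions of this example.

```python
def solve_homological(W, omega, tol=1e-12):
    """Split W into (chi, Z) such that W + {chi; H0} = Z.

    W maps (k, l) -> coefficient of zeta^k eta^l, with k and l tuples of
    nonnegative integers. Monomials with omega.(k - l) = 0 (up to tol) are
    resonant and go to Z; the others are divided by i*omega.(k - l).
    """
    chi, Z = {}, {}
    for (k, l), coeff in W.items():
        divisor = 1j * sum(w * (kj - lj) for w, kj, lj in zip(omega, k, l))
        if abs(divisor) < tol:
            Z[(k, l)] = coeff
        else:
            chi[(k, l)] = coeff / divisor
    return chi, Z


def bracket_with_h0(chi, omega):
    """{chi; H0} = -{H0; chi}, which is diagonal on the monomials."""
    return {
        (k, l): -1j * sum(w * (kj - lj) for w, kj, lj in zip(omega, k, l)) * c
        for (k, l), c in chi.items()
    }


# Hypothetical cubic polynomial for frequencies in 1:2 resonance.
omega = (1.0, 2.0)
W3 = {((2, 0), (0, 1)): 0.5,   # omega.(k - l) = 2 - 2 = 0, resonant
      ((3, 0), (0, 0)): 1.0}   # omega.(k - l) = 3, nonresonant
chi3, Z3 = solve_homological(W3, omega)
correction = bracket_with_h0(chi3, omega)
```

Adding `correction` to `W3` cancels the nonresonant monomial and leaves exactly the resonant part `Z3`, which is the content of [13].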


Definition 1 The function $Z_3$ solving [13] will be 
called the resonant part of $W_3$ and will be denoted 
by $\langle W_3\rangle$. 


Using the function $\chi_3$, one can transform the 
Hamiltonian to the form 

$$H_0 + \epsilon Z_3 + R_3$$


Remark 3 Equation [12] allows one to construct 
directly the Taylor expansion of $R_3$ in terms of the 
Taylor expansion of W and of its Poisson brackets 
with $\chi_3$. 


Iterating the construction (which however slightly 
changes due to the presence of Z3), one gets the 
proof of Theorem 2. 


Remark 4 In the nonresonant case $\omega\cdot(k - l) = 0$ 
implies that k = l; therefore, the resonant part of a 
polynomial is the sum of monomials of the form 

$$\zeta^k\eta^k = (\zeta_1\eta_1)^{k_1}\cdots(\zeta_n\eta_n)^{k_n} = (-i)^{|k|}\, I_1^{k_1}\cdots I_n^{k_n}$$

that is, it is a function of the actions only. Moreover, 
in this case one has $Z_3 = 0$, while in general $Z_4 \neq 0$. 


Some Symbolic Calculus 


To understand how to quantize the procedure of 
Birkhoff normal form, we consider the classical- 
quantum correspondence. It is well known that 
there are different procedures in order to associate 
an operator with a classical observable. Here we 
concentrate on the Weyl quantization rule. 

To a function $f \in S(\mathbb{R}^{2n})$ (Schwartz class), we 
associate an operator $\hat f$ acting on functions 
$\psi \in S(\mathbb{R}^n)$, which is defined by 

$$(\hat f\psi)(x) := \frac{1}{(2\pi\hbar)^n}\int\!\!\int f\!\left(\frac{x + y}{2}, \xi\right) e^{i(x - y)\cdot\xi/\hbar}\,\psi(y)\, dy\, d\xi \qquad [16]$$


Definition 2 The operator [16] is called the Weyl 
quantization of f and in turn f is called the symbol of $\hat f$. 


Using the method of oscillatory integrals, the 
Weyl quantization rule can be extended to much 
more general observables f. We recall that, roughly 
speaking, the method of oscillatory integrals consists 
in giving meaning to a formal expression of the form 
[16] by using successive integration by parts (see, 
e.g., Martinez (2001)). 


Definition 3 A function $f \in C^\infty(\mathbb{R}^{2n})$ will be called 
a smooth symbol of class $S(\langle z\rangle^m)$ if, for any multi-index $\alpha$, 
there exists $C_\alpha$ such that 

$$\left|\frac{\partial^{|\alpha|} f}{\partial z^{\alpha}}(z)\right| \le C_\alpha \langle z\rangle^{m}, \qquad \forall \alpha \in \mathbb{N}^{2n}$$

where $\langle z\rangle$ is as defined earlier. 
It is useful to extend such a definition to functions 
explicitly depending also on $\hbar$. This can be done in a 
straightforward way by asking the constants $C_\alpha$ to 
be independent of $\hbar$ in a neighborhood of the origin. 
Different classes of symbols can also be defined, but 
for our purpose this class is enough. 


Theorem 3 Let $f \in S(\langle z\rangle^m)$, $m \in \mathbb{R}$, and $\psi \in S(\mathbb{R}^n)$; 
then the formal expression [16] is a well-defined 
oscillatory integral. 


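Example 1 Under the Weyl quantization rule, one has 

$$\xi_j \mapsto -i\hbar\,\partial_{x_j}, \qquad x_j \mapsto x_j \ \text{(multiplication operator)}, \qquad \xi_j x_j \mapsto \tfrac{1}{2}\,(\hat\xi_j \hat x_j + \hat x_j \hat\xi_j)$$


Definition 4 A sequence $(f_j)_{j\ge 0}$ with $f_j \in S(\langle z\rangle^m)$ 
will be called the asymptotic expansion of 
$f \in S(\langle z\rangle^m)$ if, for any integer N, there exist two positive 
constants $C_N$, $\hbar_N$ such that 

$$f = \sum_{j=0}^{N} \hbar^j f_j + R_N$$

with $|R_N(z, \hbar)| \le C_N\,\hbar^{N+1}\langle z\rangle^m$, and $\hbar \in (0, \hbar_N)$. 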


The key point for the quantization of the normal 
form procedure is the following. 


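Theorem 4 Let $f \in S(\langle z\rangle^{m_1})$ and $g \in S(\langle z\rangle^{m_2})$; then 
there exists a unique $F \in S(\langle z\rangle^{m_1 + m_2})$ such that 

$$\hat F = \hat f\,\hat g \quad \text{(operator product!)}$$

moreover, one has 

$$F = \exp\left[\frac{i\hbar}{2}\left(\partial_x\cdot\partial_\eta - \partial_y\cdot\partial_\xi\right)\right]\big(f(x, \xi)\,g(y, \eta)\big)\Big|_{y = x,\ \eta = \xi} \qquad [17]$$

Finally, F admits an asymptotic expansion in $\hbar$ 
which coincides with the formal expansion of [17]. 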


The proof is obtained by using eqn [16] to 
write down an expression for $\hat f\hat g$ and obtain a 
formula for F. Then, one shows that the formula 
is well defined and therefore the result is not 
formal. 
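
For polynomial symbols the expansion of the operator-product formula terminates, so it can be evaluated exactly. The sketch below does this for one degree of freedom, under the sign convention in which $\xi$ is quantized as $-i\hbar\,\partial_x$, that is, the exponent $\frac{i\hbar}{2}(\partial_x\partial_\eta - \partial_y\partial_\xi)$. The dictionary representation `{(a, b): coeff}` for $x^a\xi^b$ and all function names are assumptions of this example.

```python
from math import comb, factorial


def deriv(p, nx, nxi):
    """Apply d^nx/dx^nx d^nxi/dxi^nxi to p = {(a, b): c}, meaning sum c x^a xi^b."""
    out = {}
    for (a, b), c in p.items():
        if a >= nx and b >= nxi:
            fac = (factorial(a) // factorial(a - nx)) * (factorial(b) // factorial(b - nxi))
            key = (a - nx, b - nxi)
            out[key] = out.get(key, 0) + c * fac
    return out


def multiply(p, q):
    """Ordinary (commutative) product of two polynomials."""
    out = {}
    for (a, b), c in p.items():
        for (d, e), s in q.items():
            key = (a + d, b + e)
            out[key] = out.get(key, 0) + c * s
    return out


def moyal_product(f, g, hbar):
    """Symbol of the operator product of the Weyl quantizations of f and g."""
    degree = lambda p: max((a + b for a, b in p), default=0)
    result = {}
    for n in range(min(degree(f), degree(g)) + 1):
        prefactor = (1j * hbar / 2) ** n / factorial(n)
        for j in range(n + 1):
            # (d_x d_eta)^(n-j) (-d_y d_xi)^j acting on f(x, xi) g(y, eta)
            term = multiply(deriv(f, n - j, j), deriv(g, j, n - j))
            for key, c in term.items():
                result[key] = result.get(key, 0) + prefactor * comb(n, j) * (-1) ** j * c
    return result


hbar = 0.5
X, XI = {(1, 0): 1.0}, {(0, 1): 1.0}
ACTION = {(2, 0): 0.5, (0, 2): 0.5}          # I = (x^2 + xi^2)/2 with omega = 1
x_xi = moyal_product(X, XI, hbar)
xi_x = moyal_product(XI, X, hbar)
action_squared = moyal_product(ACTION, ACTION, hbar)
```

The two orderings of x and $\xi$ differ by $i\hbar$, as expected from the canonical commutation relation, and the symbol of the squared action operator is $I^2 - \hbar^2/4$ rather than $I^2$, which is why the quantization of a function of the actions needs the correction described later in the text.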


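Definition 5 In the above context, the symbol G of 

$$\frac{i}{\hbar}\,[\hat f; \hat g]$$

will be called the “Moyal bracket” of f and g and 
will be denoted by $\{f; g\}_M$. 


By formula [17], one has in particular 

$$\{f; g\}_M = \{f; g\} - \hbar^2\,\mathcal{A}(f, g) + O(\hbar^4) \qquad [18]$$

where 

$$\mathcal{A}(f, g) = \frac{1}{24}\left[\frac{\partial^3 f}{\partial\xi^3}\frac{\partial^3 g}{\partial x^3} - 3\,\frac{\partial^3 f}{\partial\xi^2\partial x}\frac{\partial^3 g}{\partial x^2\partial\xi} + 3\,\frac{\partial^3 f}{\partial\xi\,\partial x^2}\frac{\partial^3 g}{\partial x\,\partial\xi^2} - \frac{\partial^3 f}{\partial x^3}\frac{\partial^3 g}{\partial\xi^3}\right]$$

where we used a vector notation for the derivatives. 
If either f or g is a polynomial of degree $\le 2$, then 

$$\{f; g\}_M = \{f; g\} \qquad [19]$$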


Given a self-adjoint operator A and a smooth 
function $G: \mathbb{R} \to \mathbb{R}$, it is well known how to 
define by the spectral theorem the operator G(A). 
Suppose now that $A = \hat f$ for some symbol f. In 
general, one has $G(\hat f) \neq \widehat{G\circ f}$. However, by symbolic 
calculus (i.e., using eqn [17]), one has: 


Lemma 2 Denote $I_j(x, \xi) = (\omega_j^2 x_j^2 + \xi_j^2)/2\omega_j$. Then, 
for any positive integer k there exists a function 
$F_k(I_j, \hbar)$ such that 

$$\widehat{(I_j)^k} = F_k(\hat I_j, \hbar)$$

where the right-hand side is defined by spectral 
calculus. Moreover, $F_k$ can be computed explicitly 
by the recursion formula 
$F_{k+1} = I_j F_k + F_{k-1}\,\hbar^2 (k^2 - k + 1)/4$. 


As a consequence of this fact and of the fact that 
$[\hat I_j, \hat I_i] = 0$, one has that the Weyl quantization of a 
polynomial function of the actions is a function of 
the action operators. 


Semiclassical Normal Form 


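Let $\chi$ be a smooth symbol such that $\hat\chi$ is self-adjoint, 
and consider the group of unitary operators 
$X_\epsilon := \exp((i\epsilon/\hbar)\hat\chi)$. Let g be a smooth symbol; 
apply the unitary transformation $X_\epsilon$ to $\hat g$, namely 
compute $X_\epsilon\,\hat g\,X_\epsilon^{-1}$. Noting that (on a suitable domain) 

$$\frac{d}{d\epsilon}\left(X_\epsilon\,\hat g\,X_\epsilon^{-1}\right) = X_\epsilon\,\frac{i}{\hbar}[\hat\chi; \hat g]\,X_\epsilon^{-1}$$

one has (formally!) the expansion 

$$X_\epsilon\,\hat g\,X_\epsilon^{-1} = \sum_{l\ge 0} \epsilon^l\,\hat g_l$$

where 

$$\hat g_0 := \hat g, \qquad \hat g_l := \frac{1}{l}\,\frac{i}{\hbar}\,[\hat\chi; \hat g_{l-1}], \quad l \ge 1 \qquad [20]$$

(Such a series can be interpreted as an asymptotic 
expansion provided one restricts the domain at each 
step of the approximation.) Equivalently, the symbol 
of $X_\epsilon\,\hat g\,X_\epsilon^{-1}$ is formally given by $\sum_{l\ge 0}\epsilon^l g_l$, with 

$$g_0 := g, \qquad g_l := \frac{1}{l}\,\{\chi; g_{l-1}\}_M, \quad l \ge 1 \qquad [21]$$

from which one sees a remarkable similarity with 
the classical equation. Moreover, [21] converges to 
[12] when $\hbar \to 0$. 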

Applying the unitary transformation generated by 
$\hat\chi_3$ to the Hamiltonian operator $\hat H_\epsilon$ (cf. eqn [10]), one 
has $X_\epsilon\,\hat H_\epsilon\,X_\epsilon^{-1} = \hat H^1$ with 

$$H^1 = H_0 + \epsilon\,[W_3 + \{\chi_3; H_0\}_M] + O(\epsilon^2) \qquad [22]$$

$$\phantom{H^1} = H_0 + \epsilon\,[W_3 + \{\chi_3; H_0\}] + O(\epsilon^2) \qquad [23]$$

where we used the fact that $H_0$ is a quadratic 
polynomial, so that [19] holds. It is thus clear that 
Lemma 1 also allows one to solve the quantum homological 
equation appearing in this context and to 
determine the symbol of the operator generating the 
unitary transformation putting the Hamiltonian operator 
in normal form up to corrections of order $\epsilon^2$. 
Moreover, one can compute in terms of Moyal 
brackets (of polynomials!) the expansion of the symbol 
of the new remainder and of the normal form. Iterating 
the construction, one generates a well-defined 
semiclassical normal form of the quantum system. 


Example 2 Denote by $Z_{q,l}$, l = 1, 2, ..., the term 
added to the semiclassical normal form at the lth 
step of the iterative construction. Explicitly, the first 
terms are given by 

$$Z_{q,1} = \langle W_3\rangle = Z_3 \qquad [24]$$

$$Z_{q,2} = \langle W_4\rangle + \tfrac{1}{2}\langle\{\chi_3; W_3\}_M\rangle + \tfrac{1}{2}\langle\{\chi_3; Z_3\}_M\rangle \qquad [25]$$

$$Z_{q,3} = \langle W_5\rangle + \langle\{\chi_4; Z_3\}_M\rangle + \tfrac{1}{3}\langle\{\chi_3; H_3\}_M\rangle + \tfrac{1}{3}\langle\{\chi_3; W_{3,3}\}_M\rangle + \langle\{\chi_3; W_4\}_M\rangle \qquad [26]$$

where, according to Definition 1, $\langle\cdot\rangle$ is the resonant 
part of its argument, $\chi_j$ is (formally) the symbol of 
the operator generating the jth unitary transformation, 
and 

$$H_3 := \tfrac{1}{2}\{\chi_3; Z_3\}_M, \qquad W_{3,3} := \{\chi_3; W_3\}_M$$

Note that all the Moyal brackets involved contain 
polynomials of degree at most 4, so that they can be 
computed exactly using formula [18], which in this 
case does not contain corrections of order $\hbar^4$. 


The problem in making the previous construction 
rigorous is that all the series involved are in general 


divergent. Moreover, it is not possible to show that 
the remainders appearing when truncating such 
series are small in a reasonable sense. Nevertheless, 
it is possible, using the tools of microlocal analysis, 
to show that the semiclassical normal form contains 
essentially all the information on the part of the 
spectrum close to zero. 

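The precise relation between the spectrum of 
the original Hamiltonian and the spectrum of the 
semiclassical normal form is captured by the 
following definition. 

Let $\hat H_1(\epsilon, \hbar)$, $\hat H_2(\epsilon, \hbar)$ be two families of self-adjoint 
operators; set $\mathrm{Spec}_\epsilon(\hat H_{1,2}) := \mathrm{Spec}(\hat H_{1,2}) \cap [0, \epsilon)$. 


Definition 6 We say that 

$$\mathrm{Spec}_\epsilon(\hat H_1) = \mathrm{Spec}_\epsilon(\hat H_2) \mod\left(\epsilon^\infty + (\hbar/\epsilon)^\infty\right)$$

if for any N, M > 0 there exist $C_{N,M}$ and $C'_{N,M}$ such 
that for any $\lambda_1 \in \mathrm{Spec}_\epsilon(\hat H_1)$ there exists 
$\lambda_2 \in \mathrm{Spec}_\epsilon(\hat H_2)$ such that $\lambda_1 = \lambda_2 + R_{N,M}$, with 

$$|R_{N,M}| \le C_{N,M}\,\epsilon^{N} + C'_{N,M}\left(\frac{\hbar}{\epsilon}\right)^{M} \qquad [27]$$

and conversely. Equation [27] has to hold for any 
couple $(\hbar, \epsilon)$ with $\epsilon$ and $(\hbar/\epsilon)$ small enough. 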


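Theorem 5 Assume (H2) and (H3); assume also: 
(H1') There exist $\gamma > 0$ and $\tau \in \mathbb{R}$ such that, for any 
$k \in \mathbb{Z}^n$, one has 

$$\text{either } \omega\cdot k = 0 \quad \text{or} \quad |\omega\cdot k| \ge \frac{\gamma}{|k|^{\tau}} \qquad [28]$$

Then there exists a polynomial function $Z_q$ such 
that one has 

$$\mathrm{Spec}_\epsilon(\hat H) = \mathrm{Spec}_\epsilon(\hat H_0 + \hat Z_q) \mod\left(\epsilon^\infty + (\hbar/\epsilon)^\infty\right) \qquad [29]$$

The polynomial $Z_q$ coincides with the semiclassical 
normal form defined at the beginning of the 
section. 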


Scheme of the proof It consists of six steps. 
(1) Make the unitary transformation $(U\psi)(x) := \epsilon^{n/4}\psi(\epsilon^{1/2}x)$, 
which transforms the Hamiltonian operator [1] into the Weyl quantization of 
$\epsilon H_\epsilon := \epsilon(H_0 + \epsilon^{1/2} W)$, but a Weyl quantization 
where $\hbar$ is substituted by $\hbar' := \hbar/\epsilon$. 
(2) Make a cutoff of $H_\epsilon$, namely, fix R and consider a smooth function 
$\tau$ such that $\tau(s) = 1$ for $|s| \le R$, $\tau(s) = 0$ for $|s| > 2R$, and 
define $a(x, \xi) := W(x)\,\tau(\|(\xi, x)\|)$. 
(3) Compare the spectrum of the Hamiltonian $\hat H_\epsilon$ with the spectrum 
of $\hat H' := \hat H_0 + \epsilon^{1/2}\hat a$. By microlocal analysis, one has 
that, in any fixed bounded interval, such spectra 
coincide modulo $\hbar'^{\infty}$ (see, e.g., Martinez (2001)). 
(4) Rescale back the variables, namely apply the 
transformation $U^{-1}$ to $\hat H'$. 
(5) Apply the normal form algorithm to the so-obtained Hamiltonian, 
showing that all the series involved are convergent 
in suitable norms. 
(6) Use again microlocal analysis to 
show that the spectrum of the semiclassical normal 
form coincides with the spectrum of the normalized 
operator with compactly supported symbol. □ 


Remark 5 Fix an arbitrary $1 > \delta > 0$ and link $\epsilon$ to 
$\hbar$ by $\epsilon := \hbar^{\delta}$. Then one obtains a simplified statement 
according to which the spectrum of [1] in $[0, \hbar^{\delta}]$ 
coincides modulo $\hbar^{\infty}$ with the spectrum of $\hat H_0 + \hat Z_q$ 
in the same interval. 


Remark 6 In the case where the frequencies are 
nonresonant one has that the symbol of the normal 
form depends on the actions only. By Lemma 2 one 
has that also the quantization of the normal form is 
a function of the action operators only (explicitly 
computable), and therefore the spectrum of the 
normal form is given by a quantization formula as 
claimed in Theorem 1. 


The Resonant Case 


In the case where the frequencies are nonresonant, 
due to the particular structure of the normal form, 
one obtains very precise information on the 
spectrum. In the case where there are some 
resonances, the situation is more difficult. In order 
to illustrate what happens we concentrate on the 
completely resonant case, that is, the case where all 
the frequencies are integer multiples of a single 
fundamental frequency $\nu$. 

In this case, the eigenvalues of $\hat H_0$ form a subset of 
$\mathbb{N}\hbar\nu + (1/2)|\omega|\hbar$ and are degenerate. One expects the 
nonlinear part to break such a degeneracy and to 
transform each eigenvalue into a small band. One can use 
the normal form to study the structure of the so-obtained 
band. To this end, the most relevant contribution 
is due to the first nonvanishing term of the normal 
form. For the sake of definiteness, we assume that this is 
the term of order 4, namely $Z_4$. Denote 

$$\mathcal{N} := Z_4\big|_{H_0 = 1}, \qquad B(E) := \left[E - \tfrac{1}{4}\nu\hbar,\ E + \tfrac{1}{4}\nu\hbar\right]$$


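Theorem 6 Fix $1 > \gamma > 1/2$, then, provided $\hbar$ is 
small enough, one has 

$$\mathrm{Spec}(\hat H) \cap (\hbar, \hbar^{\gamma}) \subset \bigcup_{E \in \mathrm{Spec}(\hat H_0)} B(E) \qquad [30]$$

Moreover, denote by 

$$E + \lambda^{1}(E, \hbar) \le \cdots \le E + \lambda^{d}(E, \hbar) \qquad [31]$$

the eigenvalues of $\hat H$ in B(E) counted with multiplicity, 
then 

$$\lambda^{1}(E, \hbar) = E^2\,\mathrm{Min}\,\mathcal{N} + E^2\left(O(\hbar/E) + O(E^{1/2})\right) \qquad [32]$$

and similarly 

$$\lambda^{d}(E, \hbar) = E^2\,\mathrm{Max}\,\mathcal{N} + E^2\left(O(\hbar/E) + O(E^{1/2})\right) \qquad [33]$$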


This statement is due to Bambusi, Charles, and 
Tagliaferro (see Bambusi 2004); for previous results, 
see Vũ Ngọc (1998). 

Equation [30] shows that the spectrum has a band 
structure, while eqns [32] and [33] allow one to 
compute the minimum and the maximum of each band. 
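
The degeneracy that the nonlinear terms split into bands can be counted directly. In the completely resonant case one may write $\omega_j = m_j\nu$ with positive integers $m_j$, so the harmonic eigenvalues are $\hbar\nu\,(M + \tfrac{1}{2}\sum_j m_j)$ with $M = \sum_j m_j k_j$. The sketch below counts how many $k \in \mathbb{N}^n$ give the same M; the function name `h0_degeneracies` and the sample multiples are assumptions of this example.

```python
from collections import Counter
from itertools import product


def h0_degeneracies(multiples, max_level):
    """Multiplicity of each level M = sum_j m_j k_j with M <= max_level.

    `multiples` holds the integers m_j with omega_j = m_j * nu; the level M
    corresponds to the harmonic eigenvalue hbar * nu * (M + sum(m_j) / 2).
    """
    counts = Counter()
    ranges = [range(max_level // m + 1) for m in multiples]
    for k in product(*ranges):
        level = sum(m * kj for m, kj in zip(multiples, k))
        if level <= max_level:
            counts[level] += 1
    return counts


isotropic = h0_degeneracies((1, 1), max_level=6)   # 1:1 resonance
one_to_two = h0_degeneracies((1, 2), max_level=6)  # 1:2 resonance
```

For the 1:1 resonance the level M has multiplicity M + 1, so each band in [30] contains M + 1 eigenvalues counted with multiplicity.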

The idea of the proof is as follows. First forget high- 
order terms of the normal form, whose effect is included 
in the error terms. Then, due to the commutation 
property of the normal form with Ho, one has that Z4 
restricts to an operator acting on the eigenspaces of Ho. 
On the classical side, one has that by Marsden- 
Weinstein procedure Z4 defines a classical Hamiltonian 
system on the manifold obtained by symplectic reduc- 
tion of the original phase space. By the methods of 
geometric quantization, it turns out that the quantum 
operator acting on an eigenspace of Ho is a Toeplitz 
operator whose principal symbol is exactly the above 
reduced classical Hamiltonian. Then, the proof follows 
by classical properties of Toeplitz operators. 

We point out that results of this kind are useful in 
the computation of the molecular spectra (Michel 
and Zhilinskii 2001, Zhilinskii 2001). 


Quantization of KAM Tori 


In this section we present a result on the quantiza- 
tion of KAM tori. It allows one to construct part of 
the spectrum of a close-to-integrable system. 

We recall that a classical Hamiltonian system with n 
degrees of freedom is said to be integrable if it has n 
integrals of motion independent and in involution. If the 
energy surface is compact, then, by the Arnol'd-Liouville 
theorem there exists a canonical transformation 
$T_0: \mathbb{R}^n \times \mathbb{T}^n \supset D \times \mathbb{T}^n \to \mathbb{R}^{2n}$ introducing action-angle 
variables, namely such that, denoting by $K_0$ the original 
integrable Hamiltonian, $K_0\circ T_0$ is independent of the 
angles $\phi \in \mathbb{T}^n$. Here, D is an open bounded domain. 

Consider now a close-to-integrable analytic 
Hamiltonian system, namely a Hamiltonian system 
with Hamiltonian 

$$K = K_0 + \epsilon K_1$$

where $\epsilon$ is a small parameter. We assume that, 
denoting again by $T_0$ the canonical transformation 
introducing action-angle variables for the system $K_0$, 
one has that both $K_0\circ T_0$ and $K_1\circ T_0$ are real 
analytic on $D \times \mathbb{T}^n$. Then, the KAM theory applies. 
To state the corresponding result, denote by $D_0 \subset D$ 
a domain whose closure is contained in D. 


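Theorem 7 Assume that $\forall I \in D$ one has 

$$\det\left(\frac{\partial^2 (K_0\circ T_0)}{\partial I^2}(I)\right) \neq 0 \qquad [34]$$

then there exists a positive constant $\epsilon_*$ and, for any $\epsilon$ 
with $|\epsilon| < \epsilon_*$ there exists a Gevrey canonical 
transformation $T_\epsilon: D_0 \times \mathbb{T}^n \to \mathbb{R}^{2n}$ and a Cantor 
set $D_\epsilon \subset D_0$ with the following properties: 

$$K\circ T_\epsilon = Z(I) + R(I, \phi, \epsilon) \qquad [35]$$

where $R(I, \phi, \epsilon)$ vanishes at infinite order on $D_\epsilon$, that is, 
for any multi-index $\alpha$ there exists $C_\alpha$ such that one has 

$$\left|\frac{\partial^{|\alpha|} R}{\partial (I, \phi)^{\alpha}}\right| \le C_\alpha \exp\left(-\frac{c}{\|I - D_\epsilon\|^{\rho}}\right) \qquad [36]$$

with a suitable $\rho > 0$ and $\|I - D_\epsilon\|$ denoting the 
distance from $D_\epsilon$. Moreover, as $\epsilon$ tends to zero, the 
measure of $D_\epsilon$ tends to the measure of $D_0$. 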


A particular consequence is that the set $D_\epsilon$ is 
foliated in invariant tori. From the proof, it also 
turns out that the motion on each torus is 
quasiperiodic with frequencies fulfilling the assumption 
(H1) stated earlier. Moreover, the tori are 
linearly stable and even more: they are stable in an 
exponential sense (namely, a solution starting $O(\mu)$ 
close to a torus takes at least a time $O(\exp(c/\mu^{\rho}))$ to 
double its distance from the torus). 

Quantizing the normalizing transformation $T_\epsilon$ by 
using the theory of Fourier integral operators, one 
can also put the quantum Hamiltonian in a suitable 
normal form which allows one to deduce some spectral 
information on the system. 

To fix ideas we restrict to the case where K is a 
natural system, namely it has the form [5], and is 
close to integrable in the above sense. Fix two 
parameters $E_1 < E_2$; assume (1) that $K^{-1}([-\infty, E_2 + \delta])$ 
is compact for some positive $\delta$ and (2) that the 
domain $D_0$ can be constructed in such a way that 
$T_0: D_0 \times \mathbb{T}^n \to K_0^{-1}([E_1, E_2])$ is a bijection and, 
moreover, the KAM condition [34] holds. Denote 
by $\vartheta \in \mathbb{Z}^n$ the Maslov class of the tori of $K_0$ (see, 
e.g., Lazutkin (1993)) and, having fixed some 
0 < c < 1, define the set of indexes 

$$\mathcal{T} := \{k \in \mathbb{Z}^n : |D_\epsilon - \hbar(k + \vartheta/4)| < \hbar\} \qquad [37]$$


Theorem 8 There exist positive constants $\hbar_*$, c, C, 
and $\sigma < 1$, and a function $Z_q: D_0 \times (0, \hbar_*) \to \mathbb{R}$ 
with the following property: for any $k \in \mathcal{T}$ there 
exists at least one eigenvalue of $\hat K$ in the interval 

$$\left[Z_q(\hbar(k + \vartheta/4), \hbar) - C e^{-c/\hbar^{\sigma}},\ Z_q(\hbar(k + \vartheta/4), \hbar) + C e^{-c/\hbar^{\sigma}}\right] \qquad [38]$$


One can also show that a large part of the 
spectrum is constructed in this way. This is obtained 
by comparing the semiclassical estimate of the 
number of eigenvalues in $[E_1, E_2]$ to the number of 
eigenvalues thus constructed. 

Theorem 8 is due to Popov (2000); the quantization 
of KAM tori was initiated by Lazutkin and 
widely developed by Colin de Verdière, who obtained 
a result similar to Theorem 8 for the case where K is 
$C^\infty$ and describes the geodesic flow on a compact 
Riemannian manifold (Colin de Verdière 1977). 


See also: Central Manifolds, Normal Forms; 
h-Pseudodifferential Operators and Applications; Optical 
Caustics; Quantum Mechanics: Foundations; 
Schrödinger Operators; Stationary Phase Approximation. 
Introduction 


The present article relies heavily on Quantum 
Mechanical Scattering Theory in this Encyclopedia 
and can be considered as its continuation. We use 
here freely the notation and results discussed in this 
article. 

An important problem of scattering theory concerns 
the Schrödinger operator H of N, $N \ge 3$, 
interacting particles. Since the potential energy of 
pair interactions between particles depends on their 
relative positions only, it does not tend to zero at 
infinity in the configuration space of a system, even 
if the center-of-mass motion is removed. This is 
qualitatively different from the two-particle case. 
It turns out that asymptotically (for large times 
$t \to +\infty$ or $t \to -\infty$) an N-particle system splits up 
into clusters, 

$$C_1 \cup \cdots \cup C_n = \{1, \ldots, N\}, \qquad C_k \cap C_l = \emptyset \ \text{ if } k \neq l \qquad [1]$$
Particles from the same cluster $C_k$, k = 1, ..., n, form 
a bound state, and different clusters do not interact 
with each other. In particular, if n = 1 and 
$C_1 = \{1, 2, \ldots, N\}$, then we have a bound state of 
the system. In another extreme case n = N, all 
particles are free. The asymptotic evolution determined 
by clusters $C_1, \ldots, C_n$ where $n \ge 2$, and bound 
states of all these clusters is called a scattering 
channel. Physically it is natural to expect that the list 
of all such channels is exhaustive, that is, no other 
scattering process is possible. This statement 
is called asymptotic completeness. 
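
The combinatorial part of this classification, the possible splittings of $\{1, \ldots, N\}$ into disjoint clusters, is easy to enumerate. The sketch below lists only the cluster decompositions with $n \ge 2$; a scattering channel additionally requires a choice of bound state for every cluster, which depends on the spectrum of the cluster Hamiltonians and is not modeled here. The function names are assumptions of this example.

```python
def cluster_decompositions(particles):
    """Yield every partition of the list `particles` into nonempty disjoint clusters."""
    if not particles:
        yield []
        return
    first, rest = particles[0], particles[1:]
    for partition in cluster_decompositions(rest):
        # Put `first` into each existing cluster in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or let it form a cluster of its own.
        yield [[first]] + partition


def scattering_decompositions(n_particles):
    """Cluster decompositions with at least two clusters (n >= 2)."""
    labels = list(range(1, n_particles + 1))
    return [p for p in cluster_decompositions(labels) if len(p) >= 2]


three_body = scattering_decompositions(3)
four_body = scattering_decompositions(4)
```

For three particles this gives the three splittings into a pair plus a free particle and the splitting into three free particles, which are the asymptotic configurations discussed in the next paragraph.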

We emphasize that an N-particle system may be in 
different scattering states as $t \to +\infty$ and $t \to -\infty$ and 
different rearrangement processes are possible. For 
example, a three-particle system may asymptotically 
consist of free particles or a pair of particles may be in 
a bound state, whereas the third particle may be 
asymptotically free. If particles are free at both $-\infty$ 
and $+\infty$, then one speaks about elastic scattering; we 
have a capture if particles free at $-\infty$ form a bound 
state of a couple after the interaction; an opposite 
process, when a bound state at $-\infty$ gives three free 
particles, is known as a breakup. It is also possible that 
a bound state of one couple yields a bound state of 
another pair (a rearrangement) or a bound state of a 
couple transforms into another bound state of the 
same couple (an excitation). All these processes are 
described by the scattering operator. On the contrary, 
if the whole system forms a bound state at $-\infty$ (i.e., 
n = 1), then it remains in the same state for all t. 

As far as monographic literature on N-particle 
scattering is concerned, we mention Derezinski and 
Gérard (1997), Faddeev (1965), Reed and Simon 
(1979), and Yafaev (2000). 


Setting the Scattering Problem 


Let us recall the definition of the N-particle Schrödinger operator
(Hamiltonian)

$\mathbf{H} = \mathbf{H}_0 + V$    [2]

If the configuration space of each particle is $\mathbb{R}^d$, then
the operator $\mathbf{H}$ acts in the space $L_2(\mathbb{R}^{Nd})$.
The operator of kinetic energy (the "unperturbed" Hamiltonian) is

$\mathbf{H}_0 = -\sum_{j=1}^{N} (2m_j)^{-1} \Delta_{x_j}$    [3]

where $x_j$ and $m_j$ are the position and mass of the particle
labeled by j. The operator of potential energy of pair interactions of
particles (the perturbation) V is the operator of multiplication by
the function

$V(x) = \sum_{i<j} V^{ij}(x_i - x_j), \qquad i, j = 1, \ldots, N$    [4]

Set $\alpha = (ij)$, $x^\alpha = x_i - x_j$. It is assumed that the
functions $V^\alpha(x^\alpha)$ tend to zero sufficiently rapidly as
$|x^\alpha| \to \infty$ in $\mathbb{R}^d$. However, the function V(x)
does not tend to 0 as $|x| \to \infty$ in $\mathbb{R}^{Nd}$ if at
least one of the distances $|x_i - x_j|$ between particles remains
bounded. This difficulty is manifest even for two particles (N = 2),
but in this case it disappears if the motion of the center of mass of
the system is removed.

This means the following. Let the subspace $X^{cm}$ of
$\mathbb{R}^{Nd}$ be distinguished by the condition

$\sum_{j=1}^{N} m_j x_j = 0$    [5]

and let $X_{cm}$ be the orthogonal complement to $X^{cm}$ in the space
$\mathbb{R}^{Nd}$ endowed with the scalar product

$\langle x, y \rangle = 2 \sum_{j=1}^{N} m_j \langle x_j, y_j \rangle$    [6]

Then

$L_2(\mathbb{R}^{Nd}) = L_2(X_{cm}) \otimes L_2(X^{cm})$

Denote by $x_{cm}$, $x^{cm}$ the orthogonal projections of
$x \in \mathbb{R}^{Nd}$ on the subspaces $X_{cm}$, $X^{cm}$,
respectively, so that $x = (x_{cm}, x^{cm})$. Clearly, the vector
$x_{cm}$ has components

$M^{-1} \sum_{j=1}^{N} m_j x_j, \qquad M = \sum_{j=1}^{N} m_j$

Let $T(\rho)$,
$(T(\rho) f)(x_1, \ldots, x_N) = f(x_1 + \rho, \ldots, x_N + \rho)$,
be the operator of common translations of particles.


The full Hamiltonian [2], written here as $\mathbf{H}$, commutes with
$T(\rho)$, that is, $T(\rho)\mathbf{H} = \mathbf{H}T(\rho)$, for all
$\rho \in \mathbb{R}^d$. It follows that

$\mathbf{H} = K \otimes I + I \otimes H, \qquad K = -(2M)^{-1} \Delta_{x_{cm}}$    [7]

where K is the kinetic energy operator of the center-of-mass motion.
The operator

$H = H_0 + V$    [8]

acts in the space $\mathcal{H} = L_2(X^{cm})$. Here V is again the
operator of multiplication by function [4]. The precise form of the
differential operator $H_0$ depends on the choice of coordinates in
$X^{cm}$. For example, if N = 2 and $x = x_2 - x_1$, then
$H_0 = -(2m)^{-1}\Delta$, where $m = m_1 m_2 (m_1 + m_2)^{-1}$. In the
case N = 3, a natural choice of coordinates in $X^{cm}$ is given by
one of the three sets of Jacobi variables:


$x^{12} = x_2 - x_1, \qquad x_{12} = x_3 - (m_1 + m_2)^{-1}(m_1 x_1 + m_2 x_2)$

and similarly for $x^{13}, x_{13}$ and $x^{23}, x_{23}$. In
coordinates $x^\alpha, x_\alpha$, the operator of kinetic energy is
determined by the formula

$H_0 = -(2m_\alpha)^{-1} \Delta_{x_\alpha} - (2m^\alpha)^{-1} \Delta_{x^\alpha}$

where, for example,

$m_{12}^{-1} = (m_1 + m_2)^{-1} + m_3^{-1}, \qquad (m^{12})^{-1} = m_1^{-1} + m_2^{-1}$

If N = 2, then $V(x) \to 0$ as $|x| \to \infty$, $x \in X^{cm}$, but
this is no longer true for N ≥ 3. According to eqn [7] the spectral
and scattering theories for the full operator $\mathbf{H}$ reduce to
those for the operator H. However, for N ≥ 3, this reduction is not
really helpful.
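The role of the two reduced masses in the Jacobi variables of the pair (12) can be checked numerically on the classical counterpart of the kinetic energy. The sketch below is only an illustration under that assumption (velocities instead of differential operators, arbitrary test masses); the function names are not from the article.

```python
import numpy as np


def direct_kinetic_energy(m, v):
    """Sum over particles of m_j |v_j|^2 / 2."""
    return 0.5 * sum(mj * (vj @ vj) for mj, vj in zip(m, v))


def jacobi_kinetic_energy(m, v):
    """Same energy split into center-of-mass, internal (12) and external parts."""
    m1, m2, m3 = m
    v1, v2, v3 = v
    total_mass = m1 + m2 + m3
    m_int = 1.0 / (1.0 / m1 + 1.0 / m2)          # reduced mass m^{12}
    m_ext = 1.0 / (1.0 / (m1 + m2) + 1.0 / m3)   # reduced mass m_{12}
    v_cm = (m1 * v1 + m2 * v2 + m3 * v3) / total_mass
    v_int = v2 - v1                               # velocity of x^{12}
    v_ext = v3 - (m1 * v1 + m2 * v2) / (m1 + m2)  # velocity of x_{12}
    return 0.5 * (total_mass * (v_cm @ v_cm)
                  + m_int * (v_int @ v_int)
                  + m_ext * (v_ext @ v_ext))


rng = np.random.default_rng(0)
masses = np.array([1.0, 2.5, 0.7])        # arbitrary test masses
velocities = rng.normal(size=(3, 3))      # three particles in d = 3

if __name__ == "__main__":
    print(direct_kinetic_energy(masses, velocities))
    print(jacobi_kinetic_energy(masses, velocities))
```

The two printed numbers agree up to rounding, which is the classical statement that the kinetic energy separates into a center-of-mass part and the two Jacobi parts with masses $m^{12}$ and $m_{12}$.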

Let us now consider a breakup $a = [C_1, \ldots, C_n]$ of an
N-particle system into clusters $C_1, \ldots, C_n$,
$1 \le n =: \#(a) \le N$, satisfying conditions [1]. If interactions
between different clusters are neglected, we obtain the operator

$H_a = H_0 + V^a, \qquad V^a = \sum_{l=1}^{n} \sum_{\alpha \subset C_l} V^\alpha$    [9]


In particular, $H_a = H_0$ if #(a) = N and $H_a = H$ if #(a) = 1. Let
the operator of common translations of particles from the same cluster
be defined by the equation

$(T_a(\rho_1, \ldots, \rho_n) f)(x_1, \ldots, x_N) = f(x_1', \ldots, x_N')$

where $x_j' = x_j + \rho_l$ if $j \in C_l$. The operator $H_a$
commutes with the operators $T_a(\rho_1, \ldots, \rho_n)$ for all
vectors $\rho_1, \ldots, \rho_n \in \mathbb{R}^d$. Let the subspace
$X^a$ be determined by the condition

$\sum_{j \in C_l} m_j x_j = 0, \qquad l = 1, \ldots, n$

and let $X_a$ be the orthogonal complement to $X^a$ in $X^{cm}$ with
respect to scalar product [6]. Clearly,
$\dim X^a = (N - \#(a))d$, $\dim X_a = (\#(a) - 1)d$. Then the space
$\mathcal{H}$ splits into the tensor product

$L_2(X^{cm}) = L_2(X_a) \otimes L_2(X^a)$    [10]


In what follows, $x_a$ and $x^a$ are the orthogonal projections of
$x \in X^{cm}$ on the subspaces $X_a$ and $X^a$, respectively. The
“external” variable $x_a = (X_1, X_2, \ldots, X_n)$, where

$X_l = M_l^{-1} \sum_{j \in C_l} m_j x_j, \qquad M_l = \sum_{j \in C_l} m_j$

describes positions of centers of masses of the clusters. The
“internal” variable $x^a$ is the set of vectors $x_j - X_l$ for all
$j \in C_l$ and all l = 1, ..., n. Of course, for each l only
$|C_l| - 1$ ($|C_l|$ is the number of particles in a cluster $C_l$) of
the variables $x_j - X_l$ are independent. Set

$K_a = -\Delta_{x_a} = -\sum_{l=1}^{n} (2M_l)^{-1} \Delta_{X_l}$

and

$H^a = -\Delta_{x^a} + V^a$

Then

$H_a = K_a \otimes I + I \otimes H^a$


Note that eigenvalues $\lambda^{a,n}$ of the operator $H^a$ are sums
over l = 1, ..., n of eigenvalues of the operators

$H(C_l) = H_0(C_l) + \sum_{\alpha \subset C_l} V^\alpha$

describing each cluster. Similarly, eigenfunctions $\psi^{a,n}$ of
$H^a$ are products of eigenfunctions of these operators. We usually
write a instead of a couple {a, n}. In the following, the index a
labels all cluster decompositions with #(a) ≥ 2. The eigenvalues
$\lambda^a$ of the operators $H^a$ ($\lambda^a = 0$ if #(a) = N) are
called thresholds of the Schrödinger operator [8]. If all functions
$V^\alpha(x^\alpha) \to 0$ as $|x^\alpha| \to \infty$, then the
essential spectrum of the operator H consists of the interval
$[\lambda_0, \infty)$, where

$\lambda_0 = \min_a \lambda^a$

(the Hunziker-Van Winter-Zhislin theorem). Moreover, the eigenvalues
of the operator H may accumulate at its thresholds only.

The fundamental result of scattering theory for the N-particle
Schrödinger operator can be formulated as follows. Let $P^a$ be the
orthogonal projection in $L_2(X^a)$ on the subspace spanned by all
eigenvectors $\psi^{a,n}$ of $H^a$, and let $P_a = I \otimes P^a$,
where the tensor product is defined by eqn [10]. Then $P_a$ commutes
with the operator $H_a$. Set also $K_0 = H_0$, $P_0 = I$. Suppose that
for all $\alpha$

$|V^\alpha(x^\alpha)| \le C(1 + |x^\alpha|)^{-\rho}, \qquad \rho > 1$    [11]

(the short-range assumption). Then, for all a, the wave operators

$W_a^{\pm} = W^{\pm}(H, H_a; P_a) = \operatorname{s-lim}_{t \to \pm\infty} e^{iHt} e^{-iH_a t} P_a$

exist and are isometric on the ranges $\operatorname{Ran} P_a$ of
projections $P_a$. The subspaces $\operatorname{Ran} W_a^{\pm}$ are
mutually orthogonal, and scattering is asymptotically complete:

$\bigoplus_a \operatorname{Ran} W_a^{\pm} = \mathcal{H}^{(ac)}$

The singular continuous spectrum of H is empty, so the absolutely
continuous subspace $\mathcal{H}^{(ac)}$ of the operator H can be
replaced by $\mathcal{H} \ominus \mathcal{H}^{(pp)}$, where
$\mathcal{H}^{(pp)}$ is spanned by all eigenvectors of H.

These results can be reformulated in terms of scattering theory in a
couple of spaces. Suppose that, for every a, eigenvectors
$\psi^{a,n}$ are normalized and orthogonal if the corresponding
eigenvalues $\lambda^{a,n}$ coincide. Let us introduce an auxiliary
space

$\hat{\mathcal{H}} = \bigoplus_a \hat{\mathcal{H}}_a, \qquad \hat{\mathcal{H}}_a = L_2(X_a)$    [12]

and an auxiliary operator

$\hat{H} = \bigoplus_a \hat{K}_a, \qquad \hat{K}_a = K_a + \lambda^a$    [13]

in this space. Here and below, the sums are taken over all a. We
define an identification $J: \hat{\mathcal{H}} \to \mathcal{H}$ by
the relations

$J = \sum_a J^a, \qquad J^a f_a = f_a \otimes \psi^a$    [14]

where the tensor product is the same as in [10]. In particular,
$J^0 = I$. Since $H_a J^a = J^a \hat{K}_a$, the wave operators
$W^{\pm}(H, \hat{H}; J)$ exist and are isometric and complete, that
is,

$\operatorname{Ran} W^{\pm}(H, \hat{H}; J) = \mathcal{H}^{(ac)}$


Thus, for states orthogonal to eigenvectors of H, evolution of an
N-particle system decomposes asymptotically into a sum of evolutions
which are "free" in external variables $x_a$ and are determined by
eigenvalues and eigenfunctions of the Hamiltonians $H^a$ in internal
variables $x^a$. To be more precise, we have that, for all
$f \in \mathcal{H}^{(ac)}$ and $t \to \pm\infty$,

$\exp(-iHt) f = \sum_a \exp(-i\hat{K}_a t) f_a^{\pm} \otimes \psi^a + o(1)$    [15]

where

$f_a^{\pm} = W^{\pm}(H, \hat{K}_a; J^a)^{*} f$

and the term o(1) tends to zero in $\mathcal{H}$. The wave operator
$W^{\pm}(H, \hat{K}_a; J^a)$ describes the scattering channel where a
system of N interacting particles splits up asymptotically (for
$t \to \pm\infty$) into noninteracting clusters $C_1, \ldots, C_n$,
n ≥ 2, and particles from the same cluster $C_l$ are in the bound
state (if there is more than one particle in $C_l$) given by the
function $\psi^a(x^a)$. Somewhat loosely speaking, this implies that
the continuous spectrum of the operator H consists of branches
starting from all its thresholds.

Note that the scattering problem can equivalently 
be formulated without the separation of center- 
of-mass motion. In this case, a trivial decomposition 
with #(a)=1 should be added, and the set of 
thresholds of the operator H includes eigenvalues of 
the operator H. 

The existence of the wave operators and their 
isometricity can be obtained by the Cook method. 
Only the asymptotic completeness is a difficult 
mathematical problem. It can be solved within the 
framework of the smooth method, which requires a 
study of boundary values of resolvents as the 
spectral parameter z approaches the continuous 
spectrum or, equivalently, a study of a large-time 
behavior of evolution operators. 

The scattering operator

$S = W^{+}(H, \hat{H}; J)^{*}\, W^{-}(H, \hat{H}; J)$

is unitary on the space $\hat{\mathcal{H}}$ and commutes with the
operator $\hat{H}$. Its component
$S_{ab}: \hat{\mathcal{H}}_b \to \hat{\mathcal{H}}_a$ describes a
process where a system in a state b as t → −∞ goes over into a state a
as t → +∞. Diagonalizing the operator $\hat{H}$ by a unitary operator
F, $(F\hat{H}f)(\lambda) = \lambda (Ff)(\lambda)$,
$\lambda \ge \lambda_0$, we obtain the scattering matrix $S(\lambda)$
defined by the equation $(FSf)(\lambda) = S(\lambda)(Ff)(\lambda)$. In
its turn, the scattering matrix is also a matrix operator with
components $S_{ab}(\lambda)$. For N ≥ 3, the structure of the
scattering matrix is essentially more complicated than for N = 2. This
is discussed in some detail in the next section.


Resolvent Equations for Three-Particle 
Systems 


Let the Hamiltonian H be defined by eqns [2]-[4], where N = 3, and let
the configuration space of each particle be $\mathbb{R}^d$, d ≥ 3. The
operator H acts in the space $\mathcal{H} = L_2(X^{cm})$, where the
subspace $X^{cm} \subset \mathbb{R}^{3d}$ is distinguished by
condition [5]. Let $R_0(z) = (H_0 - z)^{-1}$, $R(z) = (H - z)^{-1}$.
Since V(x) does not tend to 0 as $|x| \to \infty$, $x \in X^{cm}$, in
the three-particle case, the resolvent equation

$R(z) = R_0(z) - R_0(z) V R(z)$    [16]

is not Fredholm even for Im z ≠ 0.

To overcome this difficulty, Faddeev (1965) derived a system of
equations for components of the resolvent. The entries of this system
are constructed in terms of three Hamiltonians

$H_\alpha = H_0 + V^\alpha$

$\alpha = (12), (13), (23)$, containing only one pair interaction
each, and their resolvents $R_\alpha(z) = (H_\alpha - z)^{-1}$. Let us
write down the resolvent equation for each pair $H_\alpha$, H:

$R(z) = R_\alpha(z) - R_\alpha(z) \sum_{\beta \neq \alpha} V^\beta R(z)$

We multiply it by $|V^\alpha|^{1/2}$ and set

$r_\alpha(z) = |V^\alpha|^{1/2} R(z)$
$t_{\alpha,\beta}(z) = |V^\alpha|^{1/2} R_\alpha(z) (V^\beta)^{1/2}$

where $(V^\beta)^{1/2} = |V^\beta|^{1/2} \operatorname{sgn} V^\beta$.
This yields a system of equations

$r_\alpha(z) = r_\alpha^0(z) - \sum_{\beta \neq \alpha} t_{\alpha,\beta}(z) r_\beta(z)$    [17]

with $r_\alpha^0(z) = |V^\alpha|^{1/2} R_\alpha(z)$ and
$t_{\alpha,\alpha}(z) = 0$, for the operators $r_\alpha(z)$. Note that
the resolvent R(z) can be recovered from its components
$r_\alpha(z)$ by the formula

$R(z) = R_0(z) - R_0(z) \sum_\beta (V^\beta)^{1/2} r_\beta(z)$

It is convenient to rewrite eqn [17] in the matrix notation

$r(z) = r^0(z) - t(z) r(z)$    [18]

where $r^0(z) = \{r_\alpha^0(z)\}$, $r(z) = \{r_\alpha(z)\}$ are the
“vector” operators with values in the three-component space
$\mathcal{H} \oplus \mathcal{H} \oplus \mathcal{H}$ and
$t(z) = \{t_{\alpha,\beta}(z)\}$ is the “matrix” operator in this
space.
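The algebra of system [17] and of the recovery formula for R(z) can be tested in a finite-dimensional toy model. The sketch below is an illustration only: the "kinetic" operator is a random real symmetric matrix, the three "pair potentials" are random real diagonal matrices, and the dimension n and the point z are arbitrary choices. None of the analytic questions (compactness, boundary values) is visible in such a model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                        # dimension of the toy Hilbert space (assumption)
z = 0.3 + 1.0j               # spectral parameter with Im z != 0
eye = np.eye(n)

a_mat = rng.normal(size=(n, n))
h0 = (a_mat + a_mat.T) / 2                              # stand-in for H_0
v = [np.diag(rng.normal(size=n)) for _ in range(3)]     # stand-ins for V^alpha
h = h0 + v[0] + v[1] + v[2]


def resolvent(k):
    """(k - z)^{-1} for a square matrix k."""
    return np.linalg.inv(k - z * eye)


abs_half = [np.sqrt(np.abs(va)) for va in v]                 # |V^alpha|^{1/2}
sgn_half = [np.sqrt(np.abs(va)) * np.sign(va) for va in v]   # (V^alpha)^{1/2}

r_pair = [resolvent(h0 + va) for va in v]                    # R_alpha(z)
r0 = np.vstack([abs_half[a] @ r_pair[a] for a in range(3)])  # stacked r^0_alpha

# Block matrix t(z) with zero diagonal blocks, as in eqn [17].
t = np.zeros((3 * n, 3 * n), dtype=complex)
for a in range(3):
    for b in range(3):
        if a != b:
            t[a * n:(a + 1) * n, b * n:(b + 1) * n] = (
                abs_half[a] @ r_pair[a] @ sgn_half[b])

# Solve (I + t) r = r^0 for the stacked components r_alpha(z).
r = np.linalg.solve(np.eye(3 * n) + t, r0)

# Recover R(z) from its components and compare with the direct inverse.
r_free = resolvent(h0)
correction = sum(sgn_half[b] @ r[b * n:(b + 1) * n] for b in range(3))
r_recovered = r_free - r_free @ correction
r_direct = resolvent(h)

if __name__ == "__main__":
    print(np.max(np.abs(r_recovered - r_direct)))
```

In this model the recovered resolvent agrees with the directly computed one up to rounding, and each block of the solution coincides with $|V^\alpha|^{1/2} R(z)$.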

The advantage of eqn [17] compared to [16] is that the operators
$t_{\alpha,\beta}(z)$ are compact for Im z ≠ 0. This can be deduced
from the fact that the product $V^\alpha(x^\alpha)V^\beta(x^\beta)$,
where $\alpha \neq \beta$, tends to 0 as $|x| \to \infty$,
$x \in X^{cm}$, provided that $V^\alpha(x^\alpha) \to 0$ as
$|x^\alpha| \to \infty$ for all $\alpha$. Moreover, the homogeneous
equation [17] has only a trivial solution. Indeed, if for some z with
Im z ≠ 0

$f_\alpha = -\sum_{\beta \neq \alpha} t_{\alpha,\beta}(z) f_\beta$    [19]

then the function

$u = R_0(z) \sum_\alpha (V^\alpha)^{1/2} f_\alpha$

satisfies the equation $u = -R_0(z) V u$ and
$f_\alpha = -|V^\alpha|^{1/2} u$. Since the operator H is
self-adjoint, this implies that u = 0 and hence $f_\alpha = 0$ for all
$\alpha$. According to the Fredholm alternative, eqns [17] for
$r_\alpha(z)$ or [18] for r(z) can be solved if Im z ≠ 0, that is,

$r(z) = (I + t(z))^{-1} r^0(z)$    [20]


This equation allows one to deduce the existence of necessary boundary
values of the “sandwiched” resolvent R(z) from similar results for the
resolvents $R_\alpha(z)$ of the “two-particle” operators $H_\alpha$.
In its turn, $R_\alpha(z)$ can be expressed in terms of the resolvent
$R^\alpha(z)$ of the operator $H^\alpha$ acting in the space
$L_2(\mathbb{R}^d)$. Indeed, in the “mixed” representation
$(\xi_\alpha, x^\alpha)$, where the Fourier transform in the variable
$x_\alpha$ is performed and the variable $\xi_\alpha$ is dual to
$x_\alpha$, we have

$(R_\alpha(z) f)(\xi_\alpha, x^\alpha) = (R^\alpha(z - (2m_\alpha)^{-1}|\xi_\alpha|^2) f(\xi_\alpha, \cdot))(x^\alpha)$    [21]

The passage to the limit Im z → 0 requires that assumption [11] be
satisfied for $\rho > 2$. Moreover, we have to suppose that the
operators $H^\alpha$ do not have the so-called zero-energy resonances
as well as eigenvalues embedded in the continuous spectrum. Then the
operator functions
$\langle x^\alpha \rangle^{-l} R^\alpha(z) \langle x^\alpha \rangle^{-l}$,
$\langle x^\alpha \rangle = (1 + |x^\alpha|^2)^{1/2}$, l > 1, are
analytic in the complex plane cut along $[0, \infty)$, they have poles
only at the points $\lambda^{\alpha,n}$, and are continuous up to the
cut, the point z = 0 included. In particular, it follows from eqn [21]
that, if the operators $H^\alpha$ do not have negative eigenvalues,
then the operator functions
$\langle x^\alpha \rangle^{-l} R_\alpha(z) \langle x^\alpha \rangle^{-l}$,
l > 1, are also analytic in the complex plane cut along
$[0, \infty)$ and are continuous up to the cut.

The next result is of genuinely three-particle nature and is crucial
for the study of the operator t(z). The operator functions
$\langle x^\alpha \rangle^{-l} R_0(z) \langle x^\beta \rangle^{-l}$,
$\alpha \neq \beta$, l > 1, are continuous in norm up to the cut along
$[0, \infty)$.

Now it follows from eqn [20] that the operator-valued functions
$r_\alpha(z)|V^\beta|^{1/2}$ are continuous up to the cut
$(0, \infty)$ except points $\lambda \in (0, \infty)$ where the
homogeneous equation [19] for $z = \lambda \pm i0$ has a nontrivial
solution. The set $\mathcal{N} = \mathcal{N}_+ \cup \mathcal{N}_-$ of
such points $\lambda \in (0, \infty)$ is closed and has Lebesgue
measure zero. In particular, the operators
$\langle x^\alpha \rangle^{-l}$, l > 1, are H-smooth on any compact
subinterval of $\Lambda = (0, \infty) \setminus \mathcal{N}$.
Therefore, the smooth method of scattering theory can be directly
applied. It yields the existence and completeness of the wave
operators $W^{\pm}(H, H_0)$. In this case, three particles are
necessarily asymptotically free.

“Two-particle” channels of scattering arise if the operators
$H^\alpha$ have negative eigenvalues. To simplify notation, we assume
that every $H^\alpha$ has exactly one eigenvalue $\lambda^\alpha < 0$.
Moreover, it is supposed that the corresponding eigenfunction
$\psi^\alpha(x^\alpha)$ tends to zero sufficiently rapidly as
$|x^\alpha| \to \infty$. Analytically, the appearance of new channels
is due to new singularities of the resolvents. Indeed, in this case

$R^\alpha(z) = (\lambda^\alpha - z)^{-1} P^\alpha + \tilde{R}^\alpha(z)$

where the function $\tilde{R}^\alpha(z)$ is analytic and continuous up
to the cut in the complex plane cut along $[0, \infty)$. It follows
from eqn [21] that in this case the resolvent $R_\alpha(z)$ contains
the additional term

$((2m_\alpha)^{-1}|\xi_\alpha|^2 + \lambda^\alpha - z)^{-1} P^\alpha$

which is analytic only in the complex plane cut along
$[\lambda^\alpha, \infty)$. To take these terms into account, system
[17] should be further rearranged. This yields the following result.
Let us set
the following result. Let us set 


Goo = (x?) -Ph Gai (x4) Q2) M V^ pa 
Bxa 


d a. AP =f) e P? 


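Then, for all $\alpha, \beta$, i, j = 0, 1, a suitable l > 1 and
$\lambda_0 = \min_\alpha \{\lambda^\alpha\}$, the operator functions
$G_{\alpha,i} R(z) G_{\beta,j}^{*}$ are norm continuous as z
approaches the cut $(\lambda_0, \infty)$ at the points of
$\Lambda = (\lambda_0, \infty) \setminus \mathcal{N}$, where
$\mathcal{N}$ is again a closed set of measure zero. In particular,
the operators $G_{\alpha,0}$ and $G_{\alpha,1}$ are H-smooth on any
compact subinterval of $\Lambda$.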

In the multichannel case, to fit scattering for the Hamiltonian H into
the framework of smooth theory, it is convenient to reformulate the
result in terms of scattering theory in a couple of spaces. Let the
space $\hat{\mathcal{H}}$, the operator $\hat{H}$, and the
identification J be defined by eqns [12], [13], and [14],
respectively, where the index a takes four values a = 0, $\alpha$ and
$\alpha = (12), (13), (23)$. One, further, needs to introduce


auxiliary identifications 
Pol) P. 
Q 


and 


J=Pe@r
The H- (and $\hat{H}$-) smoothness of operators [22] implies
that the wave operators


W^(H.HiJ) and W*(H,H; J") 


exist. 
The operators $W^{\pm}(H, \hat{H}; J)$ are isometric because

$\operatorname{s-lim}_{|t| \to \infty} P_\alpha \exp(-iH_0 t) = 0$    [23]

and the operators $P_\alpha P_\beta$ are compact for
$\alpha \neq \beta$.
Using that the operator 


J)-1- PaP 


ax 


is compact (whereas Ir — ] is not), we see that the 
operators W*(H,H;J') are also isometric. Finally, 
we remark that, by eqn [23], 


W*(H,H:;J) = W*(H,H,]) 


This implies the asymptotic completeness. 

Let us discuss properties of the scattering matrix in the one-channel
case where the pair operators $H^\alpha$ do not have negative
eigenvalues. The scattering matrix
$S(\lambda): L_2(S^{2d-1}) \to L_2(S^{2d-1})$, $\lambda > 0$, is of
course a unitary operator, but in contrast to the two-particle case
the operator $S(\lambda) - I$ is not compact because its kernel
contains the Dirac functions $\delta(\xi_\alpha - \xi_\alpha')$.
Nevertheless, the structure of its singularities can be explicitly
described. Actually, let $S_\alpha(\lambda)$ be the “two-particle”
scattering matrix for the pair $H_0$, $H_\alpha$. Then

$S(\lambda) = S_{12}(\lambda) S_{23}(\lambda) S_{13}(\lambda) \tilde{S}(\lambda)$

where the operator $\tilde{S}(\lambda) - I$ is compact.

The approach described briefly in this section 
relies on a kind of an advanced perturbation theory 
where the free problem is determined by the set of 
all sub-Hamiltonians. Its generalization to the case 
of an arbitrary number of particles meets with 
numerous difficulties. A different, nonperturbative, 
approach which works well for any number of 
particles will be discussed in the next section. 

A purely time-dependent method in three-particle
scattering is presented in Enss (1983).


Nonperturbative Approach 


Now N and d are arbitrary. In the nonperturbative approach (see Graf
(1990), Sigal and Soffer (1989), and Yafaev (1993)) the operators H
and $H_0$ as well as the Hamiltonians of all subsystems are treated on
an equal basis. It is supposed that all pair potentials satisfy
condition [11]. No assumptions on subsystems are required.


The starting point of this approach is the limiting-absorption
principle, which claims that the operator $\langle x \rangle^{-l}$,
$x \in X^{cm}$, for l > 1/2 is H-smooth on any compact interval
$\Lambda$ not containing the thresholds and eigenvalues of H. Its
proof relies on the Mourre commutator method (see Cycon et al.
(1987)). To be more precise, it is deduced from the following
estimate:

$i([H, A] f, f) \ge c \|f\|^2, \qquad c = c(\lambda) > 0, \quad f \in E(\Lambda_\lambda)\mathcal{H}$    [24]

for the commutator of H with the generator of dilations

$A = -i \sum_j (x_j \partial_j + \partial_j x_j)$

Here $x_j$ are coordinates of $x \in X^{cm}$ in some orthonormal (with
respect to scalar product [6]) basis in $X^{cm}$, $\lambda$ is neither
a threshold nor an eigenvalue of the operator H and
$\Lambda_\lambda$ is a sufficiently small interval around $\lambda$.
Very roughly speaking, the Mourre estimate [24] means that, similarly
to the two-particle case, the observable

$(A e^{-iHt} f, e^{-iHt} f)$

is a strictly increasing function of t for all
$f \in \mathcal{H}^{(ac)}$.
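For orientation, the estimate can be verified by hand in the free case. The following worked computation is a sketch under explicit assumptions: V = 0, the operator $H_0 = -\Delta$ acts on $L_2(\mathbb{R}^n)$ with n the dimension of the configuration space, A is normalized as $A = -i\sum_j (x_j\partial_j + \partial_j x_j)$, and $E_0$ denotes the spectral projection of $H_0$.

```latex
% Free case: H_0 = -\Delta on L_2(\mathbb{R}^n), A = -i \sum_j (x_j \partial_j + \partial_j x_j).
\begin{align*}
A &= -i\,(2\,x\cdot\nabla + n)
   && \text{since } \partial_j x_j = x_j \partial_j + 1, \\
[\Delta,\; x\cdot\nabla] &= 2\Delta
   && \text{since } \Delta(x\cdot\nabla u) = x\cdot\nabla\Delta u + 2\Delta u, \\
i[H_0, A] &= i\cdot(-i)\cdot 2\,[-\Delta,\; x\cdot\nabla]
            = 2\cdot(-2\Delta) = 4H_0, \\
i([H_0, A] f, f) &= 4 (H_0 f, f) \;\ge\; 4(\lambda - \delta)\,\|f\|^2,
   && f \in E_0(\lambda-\delta, \lambda+\delta)\mathcal{H},\quad \lambda > \delta > 0 .
\end{align*}
```

So for the free operator an estimate of the form [24] holds with $c = 4(\lambda - \delta)$ at every point $\lambda > 0$, and the only excluded point is the threshold 0.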

The limiting-absorption principle implies that the singular continuous
spectrum of the operator H is empty, but it is not sufficient for
scattering theory. If the limiting-absorption principle were true for
the critical value l = 1/2, then it would imply asymptotic
completeness. Unfortunately, the operator
$\langle x \rangle^{-1/2}$ is definitely not smooth even with respect
to the free operator $H_0$. However, by introducing an auxiliary
differential operator we can fix this problem. This leads to the
radiation estimates. These estimates look different in different
regions of the configuration space. Choose any cluster decomposition
$a = (C_1, \ldots, C_n)$. The radiation estimate morally implies that
the motion of a system is asymptotically free in the variable $x_a$
(describing the relative motion of clusters) in the region where
particles from each cluster $C_l$, l = 1, ..., n, are close to each
other compared to distances between different clusters. On the
contrary, this motion is very complicated in the variable $x^a$
pertaining to bound states of different clusters. In particular, the
radiation estimate is the same as for the two-particle case in the
“free” region where all particles are far from each other.

To be more precise, let $\nabla_a = \nabla_{x_a}$ be the gradient in
the variable $x_a$ and let $\nabla_a^{\perp}$,

$(\nabla_a^{\perp} u)(x) = (\nabla_a u)(x) - |x_a|^{-2} \langle (\nabla_a u)(x), x_a \rangle x_a$

be its orthogonal projection in $X_a$ on the subspace orthogonal to
the vector $x_a$. Let $\chi_a$ be the characteristic function of a
closed cone $Y_a \subset X^{cm}$ satisfying the condition
$Y_a \cap X_b = \{0\}$ for all b such that $X_a \not\subset X_b$. Then
the operator

$G_a = \chi_a \langle x \rangle^{-1/2} \nabla_a^{\perp}$

is H-smooth on $\Lambda$.

A proof of the radiation estimates is based on the consideration of
the commutator of H with some differential operator
$M = -i \sum_j (m^{(j)} \partial_j + \partial_j m^{(j)})$, where
$m^{(j)} = \partial m / \partial x_j$. Here m (it depends on a) is a
specially constructed function satisfying the following properties:


1. m(x) is homogeneous (for |x| ≥ 1) of order 1;

2. for any b it does not depend on $x^b$ in some conical neighborhood
of the subspace $X_b$;

3. m(x) is convex; and

4. $m(x) = \mu_a |x_a|$, $\mu_a \ge 1$, on the support of the function
$\chi_a$.


Note that we can set m(x) = |x| in the case of the operator $H_0$.

Due to properties (1) and (2) the commutator [V, M] is a short-range
function (estimated by $\langle x \rangle^{-1-\varepsilon}$ for
$\varepsilon > 0$). Due to properties (3) and (4) the commutator
satisfies $i[H_0, M] \ge c\, G_a^{*} G_a$, c > 0, up to short-range
terms. The estimate

$i[H, M] \ge c\, G_a^{*} G_a - C \langle x \rangle^{-1-\varepsilon}$

implies that the operator $G_a$ is H-smooth on $\Lambda$.

The main difficulty in the N-particle problem is that pair potentials
$V^\alpha(x^\alpha)$ do not tend to zero as $|x| \to \infty$. The idea
of the proof of asymptotic completeness is to introduce auxiliary wave
operators such that “effective” perturbations are decaying functions.
This requires a suitable smooth partition of unity. Moreover, it is
convenient to choose auxiliary identifications as first-order
differential operators rather than operators of multiplication.
Unfortunately, although such identifications allow one to “kill”
directions where the potentials $V^\alpha(x^\alpha)$ do not tend to
zero, their commutators with the operator $H_0$ have coefficients
decaying at infinity only as $|x|^{-1}$.

Thus, we introduce differential operators

$M_a = -i \sum_j (m_a^{(j)} \partial_j + \partial_j m_a^{(j)})$

with coefficients $m_a^{(j)} = \partial m_a / \partial x_j$. The
functions $m_a$ satisfy properties (1), (2) formulated above and


5. $m_a(x) = 0$ in some conical neighborhoods of the subspaces $X_b$
such that $X_a \not\subset X_b$. To put it differently, $m_a(x) = 0$
in some conical neighborhood of the subspace where $x_i = x_j$ for
some i, j belonging to different clusters $C_1, \ldots, C_n$.

Let the operator $H_a$ be defined by eqn [9]. Given the
limiting-absorption principle and the radiation estimates, we first
check the existence of auxiliary wave operators

$W^{\pm}(H, H_a; M_a E_a(\Lambda))$ and $W^{\pm}(H_a, H; M_a E(\Lambda))$    [25]


Here we use that according to (5) coefficients of the differential
operator $(V - V^a) M_a$ are, under assumption [11], short-range (in
the configuration space $X^{cm}$). By property (2), the function
$[V^a, M_a]$ is also short-range. Thus, the operator
$V M_a - M_a V^a$ can be taken into account by the limiting-absorption
principle. The commutator $[H_0, M_a]$ factorizes into a product of
$H_a$- and H-smooth operators according to the radiation estimates.

Similar arguments show that, for $m = \sum_a m_a$ and
$M = \sum_a M_a$ (the sums here are taken over all possible breakups
of the N-particle system), the wave operator (observable)

$W^{\pm}(H, H; \pm M E(\Lambda))$    [26]


also exists. Moreover, it can be easily achieved that m(x) ≥ 1. Then
it follows from the Mourre estimate that operator [26] is positive
definite on the subspace $E(\Lambda)\mathcal{H}$ and hence its range
coincides with this subspace. It means that for all
$f \in E(\Lambda)\mathcal{H}$

$\lim_{t \to \pm\infty} \| \exp(-iHt) f - M \exp(-iHt) g^{\pm} \| = 0$    [27]

if $f = W^{\pm}(H, H; M E(\Lambda)) g^{\pm}$. The existence of wave
operators [25] implies that for any
$g^{\pm} = E(\Lambda) g^{\pm}$ and
$g_a^{\pm} = W^{\pm}(H_a, H; M_a E(\Lambda)) g^{\pm}$

$\lim_{t \to \pm\infty} \| M \exp(-iHt) g^{\pm} - \sum_a \exp(-iH_a t) g_a^{\pm} \| = 0$    [28]

Combining eqns [27] and [28], we see that $\exp(-iHt) f$ decomposes
asymptotically into simpler evolutions $\exp(-iH_a t) g_a^{\pm}$. This
is one of the equivalent formulations of asymptotic completeness and
leads to eqn [15].

Finally, we note that eqn [15] can be rewritten as

$\exp(-iHt) f = \sum_a \exp(i\Phi_a(x_a, t)) (2it)^{-d_a/2} \hat{f}_a^{\pm}(x_a/(2t))\, \psi^a(x^a) + o(1)$    [29]

where $t \to \pm\infty$, $d_a = \dim X_a$, $\hat{f}_a^{\pm}$ is the
Fourier transform of $f_a^{\pm}$ and

$\Phi_a(x_a, t) = x_a^2 (4t)^{-1} - \lambda^a t$    [30]


Long-Range Interactions: New Channels 


The multiparticle problem acquires a long-range character if pair
potentials decay as Coulomb potentials or slower. Similarly to the
two-particle problem, for long-range potentials the definition of wave
operators should be naturally modified. As in the short-range case,
only the asymptotic completeness is a really difficult mathematical
problem. Assume that pair potentials satisfy the condition

$|\partial^{\kappa} V^\alpha(x^\alpha)| \le C(1 + |x^\alpha|)^{-\rho - |\kappa|}, \qquad \rho > \sqrt{3} - 1$

for all $|\kappa| \le \kappa_0$ and sufficiently large $\kappa_0$.
Then only phase factors in eqn [29] should be modified. Actually,
instead of eqn [30] we should set

$\Phi_a(x_a, t) = x_a^2 (4t)^{-1} - \lambda^a t - t \int_0^1 V_a(s x_a, 0)\, ds$

where $V_a(x) = V(x) - V^a(x)$ and as usual $x = (x_a, x^a)$. As shown
in Dereziński (1993), with this definition of wave operators, the
asymptotic completeness holds.
On the contrary, if pair potentials decay slower than $|x|^{-1/2}$,
then the traditional picture of scattering breaks down (see Yafaev
(1996)). Actually, a three-particle system might have additional
scattering channels intermediary between the channel where three
particles are asymptotically free and the channels where a couple of
particles form a bound state. In these additional channels, the bound
state of a couple of particles depends on the position of the third
particle, and it is destroyed asymptotically.


See also: Quantum Mechanical Scattering Theory;
Schrödinger Operators.


Introduction


The existence of nuclear spin and its associated magnetism was first
suggested by Wolfgang Pauli in 1924, a conjecture based on the fine
details of atomic spectra, the so-called hyperfine structure. The
interaction of this nuclear magnetism with an external magnetic field
was predicted to result in a finite number of discrete energy levels
known as the Zeeman structure. However, the first direct excitation of
transitions between nuclear Zeeman levels was by Isidor Rabi in 1938,
using radio-frequency (RF) waves in an atomic beam apparatus. In 1945,
Felix Bloch and co-workers at Stanford, and Edward Purcell and
co-workers at Harvard, performed the first nuclear magnetic resonance
(NMR) experiments in condensed matter, with the RF response of the
hydrogen nucleus (proton) being directly detected.

The early prospects for this new technique were limited to precise
measurements of magnetic fields and nuclear magnetic moments. However,
three transformational discoveries intervened to set NMR on a course
that would result in initially unimaginable contributions to physics,
chemistry, engineering, medicine, geology, food science, and
biochemistry. In 1950, it was found that atomic nuclei at different
sites of a molecular orbital had slightly different resonant
frequencies, a phenomenon known as “chemical shift.” In the same year,
Erwin Hahn discovered the spin echo, thus opening the possibility that
multiple RF pulse trains could be used to remove unwanted nuclear spin
interactions while being used to manipulate spin coherences with
exquisite resolution. In addition, in 1951, using this spin echo,
Herbert Gutowsky and Charles Slichter revealed a hitherto unobserved
scalar spin-spin interaction between nuclei, mediated by the molecular
orbital electrons.

The discovery of the chemical shift and the scalar 
coupling would immediately revolutionize chemis- 
try. Further discoveries of nuclear quadrupole 
interactions and through-space dipolar interactions 
would add to the capacity of NMR to provide 
insight regarding structure and order in the solid and 
liquid crystalline state. But the spin echo would 
provide a platform for new advances in science in 
every one of the six decades following the discovery 
of NMR in 1945. These were successively diffusion 
and flow NMR, multidimensional NMR, magnetic 
resonance imaging, protein structure NMR, ex situ 
NMR, and quantum computing NMR. 


Resonant Excitation and Detection 


In quantum-mechanical language, the Zeeman Hamiltonian H for a nuclear
spin experiencing a magnetic field $B_0$ along the laboratory z-axis
may be written as

$H = -\gamma \hbar B_0 I_z$    [1]

$\gamma$ being the (nuclear) gyromagnetic ratio while $\hbar I_z$ is
the operator for the z-component of angular momentum, with eigenvalues
$m\hbar$, m lying in the range $-I, -I + 1, \ldots, I$. I is the
angular momentum quantum number, being either integer or half-integer.
From the Schrödinger equation, it can be seen that the eigenkets of H
precess about the z-axis at a rate $\gamma B_0$, the frequency
corresponding to the energy difference between the 2I + 1 Zeeman
levels. For convenience, we shall take the eigenvalues of $I_z$ to be
simply m, dropping the factor $\hbar$, and leading to a Hamiltonian
$H = -\gamma B_0 I_z$ expressed in frequency rather than in energy
units.
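A short numerical sketch of the level structure, in the frequency units just adopted. The proton gyromagnetic ratio is the rounded CODATA value; the function names and the field value of 1 T are illustrative choices, not taken from the article.

```python
import math
from fractions import Fraction

GAMMA_PROTON = 2.6752218744e8   # rad s^-1 T^-1, rounded CODATA value


def zeeman_levels(spin, gamma, b0):
    """Return [(m, energy in rad/s)] for H = -gamma * B0 * I_z with hbar dropped."""
    n_levels = int(2 * spin + 1)
    ms = [-spin + k for k in range(n_levels)]
    return [(m, -gamma * b0 * float(m)) for m in ms]


def larmor_frequency_hz(gamma, b0):
    """Precession frequency gamma * B0 converted from rad/s to Hz."""
    return gamma * b0 / (2 * math.pi)


if __name__ == "__main__":
    for m, energy in zeeman_levels(Fraction(1, 2), GAMMA_PROTON, 1.0):
        print(m, energy)
    print(larmor_frequency_hz(GAMMA_PROTON, 1.0) / 1e6, "MHz")
```

Adjacent levels are separated by $\gamma B_0$ whatever the spin, so a single resonance frequency serves all 2I + 1 levels; for protons at 1 T it is about 42.58 MHz.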

Resonant excitation between the Zeeman levels is achieved by the
application of an RF magnetic field of frequency $\omega$ and
amplitude $2B_1$, linearly polarized normal to $B_0$, such that the
total Hamiltonian becomes

$H = -\gamma B_0 I_z - 2\gamma B_1 \cos(\omega t)\, I_x$    [2]

This excitation is easily applied by means of a transversely oriented
antenna coil, the same coil generally being used to detect the nuclear
spin response. In the frame of reference rotating about $B_0$ at
$\omega$, the Hamiltonian transforms to

$H = -\gamma (B_0 - \omega/\gamma) I_z - \gamma B_1 I_x - \gamma B_1 \exp(i 2\omega t I_z)\, I_x \exp(-i 2\omega t I_z)$    [3]

At resonance, $\omega = \omega_0 = \gamma B_0$. The last term in
eqn [3] averages to zero and may be neglected (the Heisenberg
condition) provided $\omega \gg \gamma B_1$, that is,
$B_0 \gg B_1$. Given $B_0$ of the order of tesla and $B_1$ of the
order of millitesla, this condition is easily satisfied. Hence, from
the perspective of the rotating frame, the spins at resonance see only
the static magnetic interaction $\gamma B_1 I_x$, so that application
of this resonant RF field causes spins to nutate about the rotating
frame x-axis at a rate $\gamma B_1$. Thus, by application of RF pulses
of different duration and phase, one may produce arbitrary
reorientation of the spins about various axes in the rotating frame.
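The nutation picture can be reproduced for a single spin-1/2 with a few lines of linear algebra. This is a minimal sketch under stated assumptions: exact resonance, the counter-rotating term of eqn [3] dropped, the convention $\rho \mapsto U \rho U^{\dagger}$ with $U = \exp(-iHt)$, and illustrative values of $\gamma$ (proton) and $B_1$ (1 mT). The helper names are not from the article.

```python
import numpy as np

# Spin-1/2 operators in the I_z eigenbasis (hbar dropped).
IX = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
IY = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
IZ = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)


def propagator(h, t):
    """U = exp(-i h t) for a Hermitian matrix h, via its eigendecomposition."""
    w, vecs = np.linalg.eigh(h)
    return (vecs * np.exp(-1j * w * t)) @ vecs.conj().T


def apply_pulse(rho, gamma, b1, duration):
    """On-resonance pulse about the rotating-frame x-axis."""
    h_rot = -gamma * b1 * IX
    u = propagator(h_rot, duration)
    return u @ rho @ u.conj().T


gamma = 2.6752218744e8                  # proton, rad s^-1 T^-1
b1 = 1e-3                               # 1 mT RF amplitude (assumption)
t90 = (np.pi / 2) / (gamma * b1)        # duration of a 90 degree pulse

rho_after_90 = apply_pulse(IZ, gamma, b1, t90)
rho_after_180 = apply_pulse(IZ, gamma, b1, 2 * t90)

mz_90 = np.trace(rho_after_90 @ IZ).real
mxy_90 = np.trace(rho_after_90 @ (IX + 1j * IY))
mz_180 = np.trace(rho_after_180 @ IZ).real

if __name__ == "__main__":
    print("t90 [s]:", t90)
    print("after 90:", mz_90, abs(mxy_90))
    print("after 180:", mz_180)
```

Starting from a state proportional to $I_z$, the pulse of duration $\pi/(2\gamma B_1)$ (a few microseconds for these values) moves all of the polarization into the transverse plane, and doubling the duration inverts it.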

With the spin system disturbed from equilibrium, the NMR “signal” is
detected via the subsequent free precession, usually via the same
antenna coil used for resonant excitation. Semiclassically, the
phenomenon may be pictured as follows. RF excitation nutates an
initial z-magnetization into the transverse plane of the rotating
frame. Such transverse magnetization corresponds in the laboratory
frame to a magnetization precessing at the Larmor frequency, thus
inducing an oscillating emf in the receiver coil. In the next section,
we see how to describe this phenomenon in the language of quantum
mechanics.

Typically, NMR is performed using the nuclei of common atoms in
organic molecules ($^{1}$H, $^{2}$H, $^{13}$C, $^{15}$N, $^{19}$F,
$^{31}$P), although for inorganic matter a wider class of nuclei are
available. Of all these, the proton is most abundant and most
sensitive, having the highest gyromagnetic ratio, $\gamma$, of all
stable nuclei.


The Quantum Statistics of the 
Spin Ensemble 


The nuclear Zeeman energy in typically available laboratory magnetic
fields, $\gamma \hbar B_0$, is many orders of magnitude smaller than
the Boltzmann energy, $k_B T$, except at millikelvin temperatures. At
room temperature in thermal equilibrium, the fractional difference in
populations between the Zeeman levels is normally very small, for
example, for protons, about $10^{-5}$. Of course, the total number of
spins
available may be very large, for example, on the 
order of 10%. 
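The size of the population difference follows directly from the Boltzmann factor. The sketch below evaluates it for spin-1/2, where the fractional difference is $\tanh(\gamma\hbar B_0 / 2k_B T)$; the constants are rounded CODATA values, and the field (3 T) and temperature (300 K) are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34          # J s
K_B = 1.380649e-23              # J K^-1
GAMMA_PROTON = 2.6752218744e8   # rad s^-1 T^-1


def fractional_population_difference(gamma, b0, temperature):
    """(n_lower - n_upper) / (n_lower + n_upper) for spin 1/2 in equilibrium."""
    x = gamma * HBAR * b0 / (K_B * temperature)   # Zeeman energy over k_B T
    return math.tanh(x / 2)


if __name__ == "__main__":
    print(fractional_population_difference(GAMMA_PROTON, 3.0, 300.0))
```

For protons at a few tesla and room temperature the result is of order $10^{-5}$, which is why the signal relies on the very large number of spins in the sample.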

The signal in magnetic resonance is detected as a
collective effect of the large ensemble of nuclear
spins. The natural language of quantum statistics is
that of the density matrix, $\rho$; the time-dependent
expectation value for any observable represented
by an operator $O$ is then $\mathrm{tr}(O\rho(t))$, the diagonal
sum of the product of $O$ and $\rho$. The time evolution
of the density matrix is given by the Liouville
equation


    i \frac{\partial \rho}{\partial t} = [H, \rho]    [4]


where [ , ] is a commutator and $H$ is expressed in
frequency units. For a constant Hamiltonian, this equation gives


    \rho(t) = \exp(-iHt)\,\rho(0)\,\exp(iHt)    [5]


Physical solutions to the density matrix (Liouville
space) are $(2I+1) \times (2I+1)$ (square) matrices formed in
the $(2I+1)$-dimensional angular momentum
eigenbasis. Generally, we may write the density
matrix in a representation of irreducible tensor
operators. One very convenient representation is
the set formed by taking products of spin
operators. For example, in the case of spin-1/2
where Liouville space is $2^2$-dimensional, we may
write


    \rho(t) = \tfrac{1}{2}\mathbf{1} + a_x I_x + a_y I_y + a_z I_z    [6]


where $\mathbf{1}$ is the identity operator. The operators $I_x$
and $I_y$ provide the off-diagonal elements of $\rho$ and
define the degree of phase coherence in the
ensemble, while the operator $I_z$ defines the degree
to which the diagonal elements differ, thus defining
the polarization. $a_x$ and $a_y$ give the amount of "one-quantum
coherence" in the ensemble while $a_z$ gives
the polarization. In thermal equilibrium $a_x = a_y = 0$,
and the spin ensemble exists in a state of
pure longitudinal polarization given, in the
high-temperature approximation, $\gamma B_0 \hbar \ll k_B T$, by


    \rho_{eqbm}(0) = \frac{1}{2I+1}\mathbf{1} + \frac{\gamma \hbar B_0}{(2I+1) k_B T} I_z    [7]

This is the starting point for all NMR experiments 
(Figure 1). 

Figure 1 Schematic Zeeman levels for the case $I = 2$ and
$I = 1/2$. The bold lines indicate the relative population in each
state in thermal equilibrium.


Consider then the detection of precession via
Faraday induction. The size of the signal observed
will be proportional to the size of the transverse
magnetization $M = \mathrm{tr}[(I_x + iI_y)\rho(t)]$ present in the
rotating frame, this magnetization producing an
induced emf with real and imaginary components
because of the capacity of heterodyne receivers to
detect quadrature phase. In the laboratory frame,
the detected signal has a prefactor of $\gamma B_0$ reflecting the
Faraday induction, which, taken together with the
dependence of the initial equilibrium magnetization on
$B_0$, gives an overall NMR sensitivity proportional to
$(\gamma B_0)^2$, helping to explain in part why high magnetic
fields are advantageous. Take the simple example for $I = 1/2$,
where a single 90° resonant RF pulse is applied to the
spin system, subsequent free precession occurring
under the Zeeman Hamiltonian. The density matrix
at detection is


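    \rho(t) = \exp(i\omega_0 t I_z) \exp(i\tfrac{\pi}{2} I_x)\, \rho_{eqbm}(0)\, \exp(-i\tfrac{\pi}{2} I_x) \exp(-i\omega_0 t I_z)
            \sim \exp(i\omega_0 t I_z) \exp(i\tfrac{\pi}{2} I_x)\, a_{eqbm} I_z\, \exp(-i\tfrac{\pi}{2} I_x) \exp(-i\omega_0 t I_z)
            = \exp(i\omega_0 t I_z)\, a_{eqbm} I_y\, \exp(-i\omega_0 t I_z)
            = a_{eqbm} I_y \cos(\omega_0 t) + a_{eqbm} I_x \sin(\omega_0 t)    [8]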


Noting $\mathrm{tr}(I_x^2) = \mathrm{tr}(I_y^2) = \mathrm{tr}(I_z^2) = (1/3)(2I+1)I(I+1)$
and $\mathrm{tr}(I_\alpha I_\beta) = 0$ for $\alpha \neq \beta$, the signal may easily be calculated
as $S(t) \sim a_{eqbm} \exp(i\omega_0 t)$, corresponding, upon Fourier
transformation, to a unique frequency at $\omega_0$. Note
that a basis consisting of products of angular
momentum operators is easy to handle since all
evolution properties follow from the usual angular
momentum commutation algebra.
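
The operator algebra leading to eqn [8] can be checked numerically. The following is a minimal sketch for spin-1/2 only, using explicit 2x2 matrices; the helper names (`rotation`, `conjugate`, `density_after_pulse`, `closed_form`) are illustrative and not part of the original text, and only the traceless part $a_{eqbm} I_z$ of the equilibrium density matrix is followed.

```python
import math

# Spin-1/2 operators as 2x2 matrices (in units of hbar).
IX = [[0, 0.5], [0.5, 0]]
IY = [[0, -0.5j], [0.5j, 0]]
IZ = [[0.5, 0], [0, -0.5]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(op, angle):
    """exp(i*angle*op) for a spin-1/2 operator, using (2*op)^2 = identity."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c * (i == j) + 2j * s * op[i][j] for j in range(2)]
            for i in range(2)]


def conjugate(op, angle, rho):
    """Return exp(i*angle*op) rho exp(-i*angle*op)."""
    return matmul(matmul(rotation(op, angle), rho), rotation(op, -angle))


def density_after_pulse(omega0, t, a_eqbm=1.0):
    """Traceless part of rho(t): 90-degree x pulse, then Zeeman precession."""
    rho = [[a_eqbm * IZ[i][j] for j in range(2)] for i in range(2)]
    rho = conjugate(IX, math.pi / 2, rho)   # RF pulse takes Iz into Iy
    return conjugate(IZ, omega0 * t, rho)   # free precession for time t


def closed_form(omega0, t, a_eqbm=1.0):
    """Last line of eqn [8]: a*(Iy cos(w0 t) + Ix sin(w0 t))."""
    c, s = math.cos(omega0 * t), math.sin(omega0 * t)
    return [[a_eqbm * (IY[i][j] * c + IX[i][j] * s) for j in range(2)]
            for i in range(2)]


def max_abs_diff(a, b):
    """Largest element-wise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Comparing `density_after_pulse(omega0, t)` with `closed_form(omega0, t)` for arbitrary values of `omega0` and `t` should give agreement to rounding error.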

Figure 2 Spin echo pulse scheme showing the evolution of the density matrix.
The rotating-frame states at each stage are:

    before the $(\pi/2)_x$ pulse:   \rho_{rot}(0) = a_{eqbm} I_z
    after the $(\pi/2)_x$ pulse:    \rho_{rot}(0) = a_{eqbm} I_y
    during the first interval:      \rho_{rot}(t) = a_{eqbm} I_y \cos(\Delta\omega_0 t) + a_{eqbm} I_x \sin(\Delta\omega_0 t)
    after the $\pi_y$ pulse:        \rho_{rot}(\tau) = a_{eqbm} I_y \cos(\Delta\omega_0 \tau) - a_{eqbm} I_x \sin(\Delta\omega_0 \tau)
    during the second interval:     \rho_{rot}(\tau + t) = a_{eqbm} I_y \cos(\Delta\omega_0 \tau)\cos(\Delta\omega_0 t) + a_{eqbm} I_x \cos(\Delta\omega_0 \tau)\sin(\Delta\omega_0 t)
                                                         - a_{eqbm} I_x \sin(\Delta\omega_0 \tau)\cos(\Delta\omega_0 t) + a_{eqbm} I_y \sin(\Delta\omega_0 \tau)\sin(\Delta\omega_0 t)
    at the echo:                    \rho_{rot}(2\tau) = a_{eqbm} I_y


The spin echo pulse scheme of Figure 2 is one of
the most important in NMR. It allows one to
refocus dephasing effects caused by inhomogeneous
broadening, for example, due to the heterogeneity
of the magnetic field across the sample.
Rewriting the density matrix equation in the
rotating frame, replacing the Zeeman precession
by its residual offset, and accounting for both RF
pulses,


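    \rho_{rot}(2\tau) = \exp(i\Delta\omega_0 \tau I_z) \exp(i\pi I_y) \exp(i\Delta\omega_0 \tau I_z)
                        \times \exp(i\tfrac{\pi}{2} I_x)\, \rho_{eqbm}(0)\, \exp(-i\tfrac{\pi}{2} I_x)
                        \times \exp(-i\Delta\omega_0 \tau I_z) \exp(-i\pi I_y)
                        \times \exp(-i\Delta\omega_0 \tau I_z)
                      = a_{eqbm} I_y    [9]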


Details of the density matrix evolution are given in 
Figure 2. The inversion pulse has the effect of 
completely reversing all the phase shifts that occur 
during the first interval, resulting in an echo signal 
when the two time periods are equal. Note the use 
of nested operators representing the successive 
influences of RF pulses (assumed to be ideal 
rotations) and Hamiltonian evolutions. The overall 
influence of the RF pulses is to render the effective 
Hamiltonian zero in this case. 
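
The refocusing can be illustrated with a small ensemble calculation. The sketch below tracks only the coefficients of $I_x$ and $I_y$ for each offset frequency, using the rotation rules that appear in eqn [8] and Figure 2; the spread of offsets (`OFFSETS`) and the pulse spacing (`TAU`) are arbitrary illustrative values, and the pulses are assumed ideal.

```python
import math


def precess(ix, iy, angle):
    """Free precession about z: Iy -> Iy cos + Ix sin, Ix -> Ix cos - Iy sin.

    ix and iy are the coefficients of Ix and Iy in the density matrix.
    """
    return (ix * math.cos(angle) + iy * math.sin(angle),
            iy * math.cos(angle) - ix * math.sin(angle))


def echo_amplitude(offsets, tau, t_after):
    """Ensemble-averaged Iy coefficient at time tau + t_after."""
    total = 0.0
    for dw in offsets:
        ix, iy = 0.0, 1.0                       # state after the 90-degree x pulse
        ix, iy = precess(ix, iy, dw * tau)      # dephasing during the first interval
        ix = -ix                                # ideal 180-degree y pulse: Ix -> -Ix
        ix, iy = precess(ix, iy, dw * t_after)  # evolution after the inversion
        total += iy
    return total / len(offsets)


# Assumed inhomogeneous spread: offsets from -50 Hz to +50 Hz in 1 Hz steps.
OFFSETS = [2 * math.pi * (k - 50) for k in range(101)]
TAU = 0.05  # seconds
```

With these values `echo_amplitude(OFFSETS, TAU, 0.0)` is close to zero (the ensemble has dephased just before the inversion pulse), whereas `echo_amplitude(OFFSETS, TAU, TAU)` returns to unity, whatever the distribution of offsets.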

This echo sequence (and its equivalent multiple RF 
train, the Carr-Purcell-Meiboom-Gill sequence) allows 
one to remove the effect of magnetic field inhomo- 
geneities so as to investigate the underlying homoge- 
neous broadening and associated signal damping. 


Spin Relaxation 


The free precession of nuclear spins does not
continue indefinitely. Ultimately the off-diagonal
elements of the density matrix lose phase coherence
while the diagonal elements gradually return to their
thermal equilibrium state, two processes known,
respectively, as $T_2$ (spin-spin) and $T_1$ (spin-lattice)
relaxation. The rate of relaxation depends on
interactions between the spins themselves and
between the spins and their thermal environment.
The process of $T_1$ relaxation requires fluctuations
that induce transitions between the Zeeman levels.
Clearly the relevant quantum-mechanical operators
must possess a nonzero matrix element
coupling the Zeeman levels, and the frequency of
those fluctuations must match the energy gap
spacing. Predominant in causing such relaxation
in diamagnetic environments are the internuclear
dipolar interactions, while in paramagnetic
environments, dipolar interactions between nuclear and
electronic spins are effective. One simple way of
representing these processes is by the spectral
density function, the Fourier power transform of
their fluctuations, dipolar interactions causing
spin-lattice relaxation due to fluctuations at $\omega_0$
and $2\omega_0$. For a fluctuating interaction with
correlation time $\tau_c$, that spectral density may
approximate a Lorentzian of the form


    J(\omega) = \frac{\tau_c}{1 + \omega^2 \tau_c^2}    [10]


Thus, as the rate of molecular motions varies, due to
the influence of temperature on $\tau_c$, the $T_1$ relaxation
rate will be a maximum when $\omega_0 \tau_c = 1$. Both solids
($\omega_0 \tau_c \gg 1$) and liquids ($\omega_0 \tau_c \ll 1$) have long $T_1$
relaxation times while soft solids or complex liquids
may have faster relaxation. $T_1$ relaxation manifests
as an exponential return to equilibrium values of
longitudinal magnetization. Typical values range
from hundreds of milliseconds to hours, and the
need to re-establish equilibrium between repetitions
of the experiment can severely limit signal averaging
and hence available signal-to-noise ratios. Note that
$T_1$ relaxation occurs by stimulated emission.
Spontaneous emission is effectively absent from
nuclear spin systems owing to the long radiation
wavelength.
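
The position of the relaxation maximum follows directly from eqn [10]. As a minimal sketch, the code below scans the correlation time on a logarithmic grid and locates the value that maximizes the Lorentzian at a fixed Larmor frequency; the 400 MHz value of `OMEGA0` and the grid limits are assumptions chosen for illustration, and only the single term at $\omega_0$ is considered.

```python
import math


def spectral_density(omega, tau_c):
    """Lorentzian spectral density of eqn [10]: tau_c / (1 + omega^2 tau_c^2)."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


# Assumed Larmor frequency in rad/s (protons at 400 MHz).
OMEGA0 = 2 * math.pi * 400e6

# Correlation times from 1e-13 s to 1e-5 s, 100 points per decade.
TAU_GRID = [10 ** (-13 + 0.01 * k) for k in range(801)]

# Correlation time on the grid giving the largest spectral density at OMEGA0.
TAU_BEST = max(TAU_GRID, key=lambda tau: spectral_density(OMEGA0, tau))
```

The product `OMEGA0 * TAU_BEST` comes out equal to 1 to within the grid spacing, and the spectral density falls away on either side, which is the origin of the long $T_1$ values in both the rigid and the rapidly tumbling limits.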

The case of $T_2$ (spin-spin) relaxation is inherently
more complex. First, the definition of "loss of phase
coherence" depends on the particular RF pulse
sequence employed. Second, the simple perturbation
theory description applied to $T_1$ relaxation only
works in the fast motion limit, where the $T_2$
relaxation rate may be shown to depend on spectral
density terms not only at $\omega_0$ and $2\omega_0$ but also at $\omega = 0$.
In consequence, $T_2 \le T_1$. $T_2$ relaxation is sensitive
to static components. These static components may
dominate in soft solids and solids. Indeed, any term
in the Hamiltonian which spreads spin phases, and
which cannot be recovered by means of a judicious
RF pulse train, will contribute to $T_2$ relaxation.
Suppose the effective frequency distribution causing
dephasing is described by an ensemble second
moment $\langle \Delta\omega^2 \rangle$, and exhibits fluctuations about a
mean of zero with correlation time $\tau_c$. Then we may
identify two limiting cases: in the slow motion limit
$\langle \Delta\omega^2 \rangle^{1/2} \tau_c \gg 1$, the decay of the detected
magnetization is Gaussian, and given by a factor
$\exp(-\tfrac{1}{2} \langle \Delta\omega^2 \rangle t^2)$. In solids, the proton $T_2$
relaxation may take place in a few tens of microseconds.
In the fast motion limit $\langle \Delta\omega^2 \rangle^{1/2} \tau_c \ll 1$,
the decay of the detected magnetization is exponential,
and given by a factor $\exp(-\langle \Delta\omega^2 \rangle \tau_c t)$. Liquid state
$T_2$ values approach $T_1$ under extreme narrowing
conditions.


The Details of the Nuclear 
Spin Hamiltonian 


Atomic nuclei interact with their environment, with 
surrounding electrons, and with other nuclear spins. 
It is precisely this feature that provides such a 
sensitive probe of material structure and dynamics. 
For a material immersed in a steady magnetic field 
Bo along the laboratory z-axis, the Hamiltonian for 
the ith nuclear spin can be written 


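    H = -\gamma B_0 I_{iz} - \mathbf{I}_i \cdot \mathbf{S} \cdot \mathbf{B}_0 + \sum_j J\, \mathbf{I}_i \cdot \mathbf{I}_j
        + \sum_j \mathbf{I}_i \cdot \mathbf{D} \cdot \mathbf{I}_j + \mathbf{I}_i \cdot \mathbf{Q} \cdot \mathbf{I}_i    [11]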

It is the variety of the terms in the nuclear spin 
Hamiltonian that imparts power to NMR. The 
first is the nuclear Zeeman interaction with the 
applied magnetic field. In modern laboratory 


superconducting magnets, this interaction can be 
as large as 1000 MHz, although in earth field 
applications it can be as small as 2.5 kHz. Given that 
the sensitivity and resolution of NMR generally 
improve with increasing magnetic field, the range of 
100-1000 MHz is typically the operating regime of 
choice. All other terms in the nuclear spin Hamiltonian 
are smaller and thus act as first-order perturbations 
only, projecting their quantum operators into the 
zeroth-order Zeeman eigenbasis, the quantum frame 
of the operator $I_z$. Because several of the terms in
H depend on the orientation of the local nuclear 
environment (e.g., the molecular orbital) with respect 
to the magnetic field, these terms will fluctuate in the 
presence of reorientational motions. By the Heisenberg 
uncertainty principle, fluctuations faster in frequency 
than the size of the Hamiltonian contribution, 
expressed in frequency units, will result in an averaging 
to the mean, a phenomenon known as “motional 
averaging." 

The term $-\mathbf{I}_i \cdot \mathbf{S} \cdot \mathbf{B}_0$ is the chemical shift that occurs
for nuclei of atoms in molecules, or the Knight shift for
nuclei in metals. It is typically a few ppm to several
hundred ppm (i.e., hundreds of hertz to 10 kHz), depending
on the nucleus. $\mathbf{S} = \gamma \boldsymbol{\sigma}$ is a tensor whose principal
axes (1, 2, 3) are associated with the local symmetry
axis of the molecular orbital (bond) in the vicinity
of the nucleus. For a liquid state molecule tumbling
rapidly and isotropically, only the averaged trace
of $\boldsymbol{\sigma}$, $\sigma_i = (1/3)(\sigma_{11} + \sigma_{22} + \sigma_{33})$, survives under
motional averaging, giving a fixed frequency shift
$-\sigma_i \gamma B_0 I_{iz}$. However, in a solid-state environment,
the remaining terms also contribute to the anisotropic
chemical shift


    H_{CS} = -\sigma_i \gamma B_0 I_{iz} - \tfrac{1}{2}(3\cos^2\beta - 1)(\sigma_{33} - \sigma_i)\gamma B_0 I_{iz}    [12]


where $\beta$ is the polar angle between the magnetic
field and the principal axis (the axis "3").

The scalar coupling term, $\sum_j J\, \mathbf{I}_i \cdot \mathbf{I}_j$, causes each
(ith spin) energy level to be sensitive to the quantum
states of the neighboring j-spins, the coupling
constant $J$ being typically tens to hundreds of hertz
for nearby spins, but reducing rapidly with greater
distance in the molecular orbital. Note that the
operator $\sum_j J\, \mathbf{I}_i \cdot \mathbf{I}_j$ is nondiagonal in the zeroth-order
representation, but provided that the chemical shift
between the i and j spins is larger than the coupling
frequency (known in chemistry as an AX spin
system), the operator reduces to $\sum_j J I_{iz} I_{jz}$, the effect
being to split the i-spin resonance into a multiplet,
depending on the state of the nearby j-spin. For m
identical nearby j-spins, the multiplet bears a simple 


binomial relationship to m, allowing one to “read” 
this number directly. The combination of chemical 
shift and scalar coupling information is of profound 
importance in identifying molecular structure in 
chemistry. 
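
The binomial rule can be made concrete with a few lines of code. This is a minimal sketch, assuming that the m neighbouring spins are all spin-1/2 and share the same coupling constant; the function name `multiplet` and the 7 Hz coupling used in the usage note are illustrative choices.

```python
from math import comb


def multiplet(m, coupling_hz):
    """Line offsets (Hz) and relative intensities for m equivalent spin-1/2 neighbours.

    Each neighbour shifts the line by +J/2 or -J/2, so the multiplet has
    m + 1 lines spaced by J with binomial intensities.
    """
    return [((k - m / 2) * coupling_hz, comb(m, k)) for k in range(m + 1)]
```

For example, `multiplet(2, 7.0)` gives a 1:2:1 triplet at offsets of -7, 0, and +7 Hz, and `multiplet(3, 7.0)` gives a 1:3:3:1 quartet; counting the lines therefore returns m directly.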

The terms $\sum_j \mathbf{I}_i \cdot \mathbf{D} \cdot \mathbf{I}_j$ and $\mathbf{I}_i \cdot \mathbf{Q} \cdot \mathbf{I}_i$ are, respectively,
the through-space dipolar interaction, $H_D$, and the
nuclear quadrupole interaction, $H_Q$, the latter being
nonzero only for nuclear spin quantum numbers $I > 1/2$,
for example, $^2$H. These interactions, projected
into the zeroth-order Zeeman frame, are, for the
dipole-dipole interaction,


    H_D = \frac{\mu_0 \gamma^2 \hbar}{4\pi} \sum_j \frac{1}{r_{ij}^3}\, \frac{1}{2}(1 - 3\cos^2\theta_{ij})
          \times (3 I_{iz} I_{jz} - \mathbf{I}_i \cdot \mathbf{I}_j)    [13]


where $r_{ij}$ is the internuclear distance and $\theta_{ij}$ is the
angle made by the internuclear vector with the
magnetic field direction; while, for the quadrupole
interaction,


    H_Q = \frac{3 e V_{zz} Q}{4I(2I - 1)\hbar}\, \frac{1}{2}(1 - 3\cos^2\theta_{zz})
          \times (3 I_z^2 - I(I + 1))    [14]


where $Q$ is the nuclear quadrupole moment, $V_{zz}$ is
the electric field gradient (assuming axial symmetry)
and $\theta_{zz}$ is the angle made by the principal axis of
that gradient with the magnetic field direction. For
protons in organic matter, the internuclear dipole 
interaction strength is on the order of 100 kHz, a 
similar strength being found for the quadrupole 
interaction of deuterons. However, in the liquid 
state, these orientation-dependent interactions fluc- 
tuate so rapidly that they are typically motionally 
averaged to zero. Nonetheless, their fluctuations do 
contribute to the relaxation process. 

Liquid-state NMR can result in exceptionally 
high-resolution (sub-Hz) spectra, if care is taken to 
adjust the magnetic field harmonics (shims) to 
produce a highly uniform Zeeman field across the 
sample. The last contribution of residual inhomo- 
geneities to line broadening can often be removed by 
gently spinning the sample about its axis at a rate of 
a few tens of hertz. 


The Evolution Domain, Multiple RF 
Pulses, and Multidimensional NMR 


Having seen the complexity of the spin Hamilto- 
nian, one may envisage experiments where the spin 
coherences evolve in a much more complicated 
manner. To this end, consider the case of a
molecular liquid two-spin (AX) system coupled via
the scalar spin-spin interaction. In first-order
perturbation theory, we may represent the simple
two-spin Hamiltonian (in the rotating frame of the
averaged Larmor frequency) as


    H_{rot} = -\sigma_1 \gamma B_0 I_{1z} - \sigma_2 \gamma B_0 I_{2z} + J\, \mathbf{I}_1 \cdot \mathbf{I}_2
            = -\omega_1 I_{1z} - \omega_2 I_{2z} + J I_{1z} I_{2z}    [15]


We now write down the density matrix in
the rotating frame following a single 90° RF
pulse ($I_x = I_{1x} + I_{2x}$),


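    \rho(t) = \exp(i\omega_1 t I_{1z} + i\omega_2 t I_{2z} + iJ I_{1z} I_{2z} t)
              \times \exp(i\tfrac{\pi}{2} I_x)\, a_{eqbm}(I_{1z} + I_{2z})\, \exp(-i\tfrac{\pi}{2} I_x)
              \times \exp(-i\omega_1 t I_{1z} - i\omega_2 t I_{2z} - iJ I_{1z} I_{2z} t)
            = \exp(i\omega_1 t I_{1z} + i\omega_2 t I_{2z} + iJ I_{1z} I_{2z} t)\, a_{eqbm}(I_{1y} + I_{2y})
              \times \exp(-i\omega_1 t I_{1z} - i\omega_2 t I_{2z} - iJ I_{1z} I_{2z} t)
            = \exp(i\omega_1 t I_{1z} + i\omega_2 t I_{2z})\, a_{eqbm}
              \times ((I_{1y} + I_{2y})\cos(\tfrac{1}{2}Jt) + 2(I_{1z} I_{2x} + I_{1x} I_{2z})\sin(\tfrac{1}{2}Jt))
              \times \exp(-i\omega_1 t I_{1z} - i\omega_2 t I_{2z})
            = a_{eqbm} [ (I_{1y}\cos\omega_1 t + I_{2y}\cos\omega_2 t + I_{1x}\sin\omega_1 t + I_{2x}\sin\omega_2 t)\cos(\tfrac{1}{2}Jt)
              + 2(I_{1z} I_{2x}\cos\omega_2 t - I_{1z} I_{2y}\sin\omega_2 t
              + I_{1x} I_{2z}\cos\omega_1 t - I_{1y} I_{2z}\sin\omega_1 t)\sin(\tfrac{1}{2}Jt) ]    [16]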


Detection in the rotating frame with $I_x + iI_y$ gives a
signal


    S(t) \sim a_{eqbm}(\exp(i\omega_1 t) + \exp(i\omega_2 t))\cos(\tfrac{1}{2}Jt)    [17]


Fourier transformation with respect to $t$ yields a
spectrum corresponding to two spectral lines at $\omega_1$
and $\omega_2$, each split into a doublet of two sidebands
separated by $J$.
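
The doublet structure can be seen by transforming eqn [17] numerically. The sketch below evaluates a discrete Fourier sum at integer frequencies; the shifts, coupling, sampling parameters, and in particular the exponential damping `T2` (added only so that the lines have finite width) are assumptions for illustration, with frequencies quoted in hertz so that $\tfrac{1}{2}Jt$ becomes $\pi J t$.

```python
import cmath
import math

NU1, NU2, J = 100.0, 200.0, 10.0   # Hz; illustrative values, not from the text
T2 = 0.5                           # s; assumed damping time
DT, N = 1.0 / 1024, 2048           # sampling interval (s) and number of samples


def signal(t):
    """Eqn [17] with a_eqbm = 1, multiplied by an assumed exp(-t/T2) decay."""
    osc = cmath.exp(2j * math.pi * NU1 * t) + cmath.exp(2j * math.pi * NU2 * t)
    return osc * math.cos(math.pi * J * t) * math.exp(-t / T2)


SAMPLES = [signal(n * DT) for n in range(N)]


def spectrum(freq_hz):
    """Magnitude of the discrete Fourier sum of the samples at one frequency."""
    return abs(sum(s * cmath.exp(-2j * math.pi * freq_hz * n * DT)
                   for n, s in enumerate(SAMPLES)))


# Evaluate the magnitude spectrum on a 1 Hz grid and pick the strong local maxima.
FREQS = [float(f) for f in range(80, 221)]
MAGS = [spectrum(f) for f in FREQS]
PEAKS = [FREQS[i] for i in range(1, len(FREQS) - 1)
         if MAGS[i] > MAGS[i - 1] and MAGS[i] > MAGS[i + 1]
         and MAGS[i] > 0.5 * max(MAGS)]
```

The list `PEAKS` contains four lines, at `NU1 - J/2`, `NU1 + J/2`, `NU2 - J/2`, and `NU2 + J/2`, that is, one doublet of splitting $J$ centred on each chemical shift.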

Notice that it is easier to follow the evolution of 
the density matrix by simply writing down a time 
sequence of behaviors under the influence of the 
successive Hamiltonians. Where simultaneous terms 
in the Hamiltonians commute, the order of their 
operation may be set at will. Thus, the above 
example becomes 


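    I_{1z} + I_{2z} \xrightarrow{(\pi/2) I_x} I_{1y} + I_{2y}
      \xrightarrow{J I_{1z} I_{2z} t} (I_{1y} + I_{2y})\cos(\tfrac{1}{2}Jt) + 2(I_{1z} I_{2x} + I_{1x} I_{2z})\sin(\tfrac{1}{2}Jt)
      \xrightarrow{\omega_1 t I_{1z} + \omega_2 t I_{2z}}
        (I_{1y}\cos\omega_1 t + I_{2y}\cos\omega_2 t + I_{1x}\sin\omega_1 t + I_{2x}\sin\omega_2 t)\cos(\tfrac{1}{2}Jt)
        + 2(I_{1z} I_{2x}\cos\omega_2 t - I_{1z} I_{2y}\sin\omega_2 t
        + I_{1x} I_{2z}\cos\omega_1 t - I_{1y} I_{2z}\sin\omega_1 t)\sin(\tfrac{1}{2}Jt)    [18]


Figure 3 The proton NMR spectrum of ethanol showing three major peaks, separated by chemical shift, each split into multiplets
arising from nearby protons via the scalar coupling.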


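Now consider a two RF pulse scheme as shown in
Figure 4, each RF pulse being 90°. The evolution is


    I_{1z} + I_{2z} \xrightarrow{(\pi/2) I_x} I_{1y} + I_{2y}
      \xrightarrow{\omega_1 t_1 I_{1z} + \omega_2 t_1 I_{2z} + J I_{1z} I_{2z} t_1}
        (I_{1y}\cos\omega_1 t_1 + I_{2y}\cos\omega_2 t_1 + I_{1x}\sin\omega_1 t_1
        + I_{2x}\sin\omega_2 t_1)\cos(\tfrac{1}{2}Jt_1) + 2(I_{1z} I_{2x}\cos\omega_2 t_1
        - I_{1z} I_{2y}\sin\omega_2 t_1 + I_{1x} I_{2z}\cos\omega_1 t_1
        - I_{1y} I_{2z}\sin\omega_1 t_1)\sin(\tfrac{1}{2}Jt_1)
      \xrightarrow{(\pi/2) I_x}
        (-I_{1z}\cos\omega_1 t_1 - I_{2z}\cos\omega_2 t_1 + I_{1x}\sin\omega_1 t_1
        + I_{2x}\sin\omega_2 t_1)\cos(\tfrac{1}{2}Jt_1)
        + 2(I_{1y} I_{2x}\cos\omega_2 t_1 + I_{1y} I_{2z}\sin\omega_2 t_1
        + I_{1x} I_{2y}\cos\omega_1 t_1 + I_{1z} I_{2y}\sin\omega_1 t_1)\sin(\tfrac{1}{2}Jt_1)
      \xrightarrow{\omega_1 t_2 I_{1z} + \omega_2 t_2 I_{2z} + J I_{1z} I_{2z} t_2}

Keeping only observable magnetization

        (I_{1x}\sin\omega_1 t_1\cos\omega_1 t_2 + I_{2x}\sin\omega_2 t_1\cos\omega_2 t_2)
        \times \cos(\tfrac{1}{2}Jt_1)\cos(\tfrac{1}{2}Jt_2) + (I_{1y}\sin\omega_2 t_1\sin\omega_1 t_2
        + I_{2y}\sin\omega_1 t_1\sin\omega_2 t_2) \times \sin(\tfrac{1}{2}Jt_1)\sin(\tfrac{1}{2}Jt_2)    [19]
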
If the idealized experiment is performed with two
independent time dimensions $t_1$ and $t_2$, then detection
in the rotating frame over the $t_2$ period with
$I_x + iI_y$ gives a signal (restricting our attention to the
quadrant of positive frequencies)


    S(t_1, t_2) \sim a_{eqbm}(\exp(i\omega_1 t_1)\exp(i\omega_1 t_2) + \exp(i\omega_2 t_1)\exp(i\omega_2 t_2))
                  \times \cos(\tfrac{1}{2}Jt_1)\cos(\tfrac{1}{2}Jt_2)
                  + a_{eqbm}(\exp(i\omega_2 t_1)\exp(i\omega_1 t_2) + \exp(i\omega_1 t_1)\exp(i\omega_2 t_2))
                  \times \sin(\tfrac{1}{2}Jt_1)\sin(\tfrac{1}{2}Jt_2)    [20]


When Fourier transformed in two dimensions with
respect to $t_1$ and $t_2$, the pattern shown in Figure 5
results. Remarkably, while the diagonal spectrum 
is the same pair of doublets seen in the figure, 
this two-dimensional spectrum contains off-diagonal 
antiphase peaks for scalar-coupled sites where magnet- 
ization transfer has occurred. 
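
The appearance of off-diagonal peaks can be checked by transforming eqn [20] directly. Because each term of [20] is a product of a function of $t_1$ and a function of $t_2$, the two-dimensional transform is a sum of products of one-dimensional transforms, which is what the following sketch evaluates. The frequencies, coupling, sampling, and the damping time `T2` applied in both dimensions are illustrative assumptions, and the modulus is taken as in Figure 5.

```python
import cmath
import math

NU1, NU2, J = 100.0, 200.0, 10.0   # Hz; illustrative values, not from the text
T2 = 0.5                           # s; assumed damping in both time dimensions
DT, N = 1.0 / 1024, 2048           # sampling interval (s) and number of samples


def ft(nu, modulation, freq_hz):
    """Fourier sum of exp(2*pi*i*nu*t - t/T2) * modulation(pi*J*t) at freq_hz.

    `modulation` is math.cos for the in-phase terms of eqn [20] and
    math.sin for the terms arising from magnetization transfer.
    """
    total = 0j
    for n in range(N):
        t = n * DT
        total += (cmath.exp((2j * math.pi * (nu - freq_hz) - 1.0 / T2) * t)
                  * modulation(math.pi * J * t))
    return total * DT


def cosy(f1, f2):
    """Modulus of the 2D transform of eqn [20] (a_eqbm = 1) at (f1, f2) in Hz."""
    diagonal = (ft(NU1, math.cos, f1) * ft(NU1, math.cos, f2)
                + ft(NU2, math.cos, f1) * ft(NU2, math.cos, f2))
    cross = (ft(NU2, math.sin, f1) * ft(NU1, math.sin, f2)
             + ft(NU1, math.sin, f1) * ft(NU2, math.sin, f2))
    return abs(diagonal + cross)


DIAG = cosy(NU1 + J / 2, NU1 + J / 2)    # a diagonal multiplet component
CROSS = cosy(NU2 + J / 2, NU1 + J / 2)   # an off-diagonal (cross-peak) component
EMPTY = cosy(150.0, 150.0)               # a point away from any resonance
```

In this sketch `CROSS` is comparable in size to `DIAG`, while `EMPTY` is smaller by orders of magnitude; setting `J` to zero removes the sine-modulated terms and with them the cross peaks.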

The idea of performing NMR in two or more 
dimensions was first proposed by Jean Jeener in 
1971. The example outlined above, correlation 
spectroscopy (COSY), is just one of an array of 
coherence transfer experiments using multiple RF 
pulse trains and time domain evolution of the spin 
ensemble. Notice that in the COSY experiment, $t_1$ is
an evolution dimension during which no detection of
NMR signal occurs, while $t_2$ is the detection domain.


Figure 4 RF pulse scheme used for COSY experiment. The figure is
labeled with the Hamiltonian $H_{rot} = -\omega_1 I_{1z} - \omega_2 I_{2z} + J I_{1z} I_{2z}$.


Figure 5 Schematic COSY (modulus) spectrum for an AX spin
system. Note that the (antiphase) off-diagonal peaks indicate
J-couplings between chemical-shift-separated spins.


The effect of the evolution is indelibly imprinted in 
the spin system density matrix allowing later recall of 
vital information concerning the interactions present 
in the spin Hamiltonian. The COSY experiment 
allows one to determine which spins are coupled via 
their molecular orbital electrons. Other multidimen- 
sional methods that rely on dipole-dipole relaxation 
effects, such as NOESY, determine which spin sites 
have “through-space” proximity. 

The use of two- and higher-dimensional methods 
has allowed the NMR spectra of biological macro- 
molecules to be unraveled, with COSY methods used 
for spectral assignment of amino acid units, and 
NOESY methods used to determine any close proxi- 
mities of amino acids otherwise well separated in the 
sequence. Such distance information has allowed the 
reconstruction of protein conformations by NMR. 

The second RF pulse of Figure 4 also generates a
state of the density matrix, $I_{1x} I_{2y}$, known as a double
quantum coherence, which, in the simple COSY
experiment, is lost to observable magnetization. Other
RF pulse schemes can take advantage of this state,
converting it via suitable "coherence pathways" into
an observable. For a detailed summary of these
various NMR phenomena, readers are referred to the
book by Ernst et al. (1987).


Solid-State NMR


As with $J$ couplings, dipolar interactions and
quadrupole interactions ($I > 1/2$) are bilinear in
the spin operators and can be used to generate
various higher-order coherence pathways in NMR
experiments. Unlike the simple spin-spin coupling,
they have an angular dependence. In solids, these
interactions may broaden the NMR resonance line
by tens to hundreds of kilohertz. In the case where a
probe nucleus is located at a known site in the
material (often achieved by deuteron labeling), these
Hamiltonian terms may contribute important
information about structure, and especially orientational
anisotropy. For example, the quadrupole interaction
for the spin-1 deuteron (see eqns [11] and [14])
depends as $P_2(\cos\theta_{zz}) = (1/2)(3\cos^2\theta_{zz} - 1)$ on the
angle between the external magnetic field and
the electric field gradient (generally associated with
the local molecular orbital or bond direction, and taken
here to be axially symmetric). Note that the first-order
contribution of the quadrupole interaction leads to an
unequal separation of the $m = 1, 0, -1$ Zeeman
energy levels, resulting in a doublet NMR spectrum
for any particular orientation, $\theta_{zz}$. Such a unique
orientation might be found in a single crystal, or in a
nematic liquid crystalline state. For a polycrystalline
material, however, the NMR spectrum has a
contribution from all orientations, leading to a
characteristic powder pattern. The details of $^2$H spectral
distributions may be used to characterize the degree
of orientational order in solids and soft, anisotropic
matter.
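
The shape of the powder pattern follows from the distribution of $P_2(\cos\theta_{zz})$ over orientations. The sketch below is a simplified illustration: it histograms $P_2(\cos\theta)$ for directions spread uniformly over the sphere (uniform in $\cos\theta$), which gives one branch of the doublet in units of its maximum splitting; the number of orientations and of bins are arbitrary choices, and no line broadening is included.

```python
def p2(x):
    """Second Legendre polynomial, P2(x) = (3x^2 - 1)/2."""
    return 0.5 * (3.0 * x * x - 1.0)


N_ORIENT = 100_000   # number of sampled orientations (assumed)
N_BINS = 60          # histogram bins spanning P2 from -0.5 to 1


def powder_histogram():
    """Histogram of P2(cos theta) for orientations uniform on the sphere."""
    counts = [0] * N_BINS
    for k in range(N_ORIENT):
        cos_theta = (k + 0.5) / N_ORIENT   # uniform grid in cos(theta) on (0, 1)
        value = p2(cos_theta)              # lies between -0.5 and 1
        index = min(int((value + 0.5) / 1.5 * N_BINS), N_BINS - 1)
        counts[index] += 1
    return counts


COUNTS = powder_histogram()
```

The histogram is strongly peaked in its first bin, at $P_2 = -1/2$, corresponding to the many orientations near $\theta_{zz} = 90°$, and falls to a weak shoulder at $P_2 = 1$ ($\theta_{zz} = 0°$); adding the mirror-image branch gives the familiar doublet powder pattern.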

For $^1$H, $^{13}$C, and other spin-1/2 nuclei, dipolar
interactions (with a wide distribution of spin
spacings and internuclear vector orientations) may
severely broaden the NMR spectrum in the solid
state (see eqns [11] and [13]). Such interactions,
along with quadrupole interactions for nuclei with
$I > 1/2$, may be significantly reduced by modulating
the effective dipolar Hamiltonian at a rate faster
than its strength in frequency units. Two methods
are available, one (magic angle spinning or MAS)
relying on the angular terms in eqns [13] and [14],
and the other (multiple pulse line narrowing) on the
spin terms. The MAS technique relies on spinning
the sample rapidly about an axis oriented at 54.7°
to the magnetic field, such that the average value of
$P_2(\cos\theta_{ij})$ becomes its projection along this spinning
axis, while the residual projection of the spinning axis
onto the field is $P_2(\cos 54.7°) \approx 0$. Multiple pulse
methods rely on a successive reorientation of the spin
system such that the effective dipolar Hamiltonian
that results from the application of the nested
evolution operators is rendered close to zero.

In practice, MAS techniques work best with $^{13}$C
NMR where the moderate $^1$H-$^{13}$C dipolar interactions
may be removed with achievable spinning speeds (a
few tens of kilohertz). Furthermore, the larger proton
magnetization ($\gamma_{^1H}/\gamma_{^{13}C} \approx 4$) can be transferred to the
$^{13}$C nuclei via Hartmann-Hahn cross-polarization, thus
significantly enhancing sensitivity. Such methodology
is referred to as CPMAS NMR.
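
The averaging produced by sample spinning can be verified numerically. In the sketch below, a vector inclined at an angle `beta` to the spinning axis is carried around that axis, and $P_2(\cos\theta)$ with respect to the field is averaged over one rotor period; the function and variable names are illustrative, and the spinning is assumed fast enough that only the average matters.

```python
import math


def p2(x):
    """Second Legendre polynomial, P2(x) = (3x^2 - 1)/2."""
    return 0.5 * (3.0 * x * x - 1.0)


# Angle at which P2(cos theta) vanishes: cos^2(theta) = 1/3.
MAGIC_ANGLE = math.acos(1.0 / math.sqrt(3.0))   # radians, about 54.74 degrees


def rotor_average(theta_rotor, beta, steps=360):
    """Average of P2(cos theta) over one rotor period.

    theta_rotor: angle between the spinning axis and the magnetic field.
    beta: angle between the internuclear vector and the spinning axis.
    """
    total = 0.0
    for k in range(steps):
        phi = 2.0 * math.pi * k / steps
        cos_theta = (math.cos(theta_rotor) * math.cos(beta)
                     + math.sin(theta_rotor) * math.sin(beta) * math.cos(phi))
        total += p2(cos_theta)
    return total / steps
```

For any `beta`, `rotor_average(theta_rotor, beta)` equals `p2(cos(theta_rotor)) * p2(cos(beta))`, so the average vanishes for every internuclear vector when `theta_rotor` is set to `MAGIC_ANGLE`.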

The real art of solid-state NMR is in removing the 
unwanted dipolar or quadrupolar interactions, but 
leaving specific interactions of interest. This may be 
achieved by including in the MAS experiment, 
specific combinations of pulses which recouple 
selected spins. Some of the most sophisticated 
experiments in modern NMR are to be found in 
this domain of application. 


Conclusion 


NMR provides exceptional structural information
concerning molecules, biomolecules as well as
molecular assemblies, liquid crystals, soft solids,
and solids. In addition, the method provides unique
information concerning molecular dynamics,
through both relaxation methods and the direct
measurement of diffusion or flow. One spectacular
application of NMR concerns its use in imaging,
achieved by giving the Larmor frequency a spatial
tag through the use of deliberately inhomogeneous
magnetic fields. This topic is covered in the article
on Magnetic Resonance Imaging.


See also: Magnetic Resonance Imaging.


Several fields of mathematics have been closely
associated with physics: this has always been the case
for the theory of differential equations. In the early
twentieth century, with the advent of general
relativity and quantum mechanics, topics such as
differential and Riemannian geometry, operator
algebras, and functional analysis, or group theory
also developed a close relation to physics. In the
1990s, mostly through the influence of string theory,
algebraic geometry also began to play a major role
in this interaction. Recent years have seen an
increasing number of results suggesting that number
theory also is beginning to play an essential part on
the scene of contemporary theoretical and
mathematical physics. Conversely, ideas from physics,
mostly from quantum field theory and string theory,
have started to influence work in number theory.

In describing significant occurrences of number 
theory in physics, we will, on the one hand, restrict 
our attention to quantum physics, while, on the other, 
we will assume a somewhat extensive definition of 
number theory that will allow us to include arithmetic 
algebraic geometry. The territory is vast and an 
extensive treatment would go beyond the size limits 
imposed by the encyclopedia. The choice of topics 
represented here inevitably reflects the limited knowl- 
edge, particular interests, and bias of the author. Very 
useful references, collecting a lot of material on number 
theory and physics, are the proceedings of the Les 
Houches conference in 2003 (Beilinson and Manin 
1986), as well as the two volumes of a previous Les 
Houches conference on number theory and physics, 
which took place in 1989, published by Springer in 
1990 and 1992. A number theory and physics database 
is presently maintained online by M R Watkins. 


In the following, we have organized the material 
by topics in number theory that have so far made an 
appearance in physics, and for each we briefly 
describe the relevant context and results. This 
singles out many themes. We first discuss a class of 
functions that occur in physics and their special 
values that are of great number-theoretic impor- 
tance. This includes the dilogarithm, the polyloga- 
rithms and multiple polylogarithms, and the 
multiple zeta values. We also discuss the most 
important symmetry groups of number theory, the 
Galois groups, and occurrences in physics of some 
forms of Galois theory. We then discuss how 
techniques from the arithmetic geometry of alge- 
braic varieties, especially Arakelov geometry, play a 
role in string theory. Finally, we discuss briefly the 
theory of motives and outline its possible relation to 
quantum physics. From the physics point of view, it 
seems that the most promising directions in which 
number-theoretic tools have come to play a crucial 
role are to be found mostly in the realm of rational 
conformal field theories and of noncommutative 
geometry, as well as in certain aspects of string 
theory. 

Among the topics that are very relevant to this 
theme, but that will not be touched upon in this 
article, there are important subjects such as the 
theory of “arithmetic quantum chaos,” the use of 
methods of random matrix theory applied to the 
study of zeros of zeta functions, or mirror symmetry 
and its connection to modular forms. The interested 
reader can find such topics treated in other articles 
of this encyclopedia and in the references mentioned 
above (see Quantum Ergodicity and Mixing of 
Eigenfunctions; Random Matrix Theory in Physics; 
Mirror Symmetry: a Geometric Survey). 


Dilogarithm, Multiple Polylogarithms, 
Multiple Zeta Values 


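The dilogarithm is defined as


    \mathrm{Li}_2(z) = \sum_{n=1}^{\infty} \frac{z^n}{n^2} = -\int_0^z \frac{\log(1 - t)}{t}\, dt


It satisfies the functional equation $\mathrm{Li}_2(z) + \mathrm{Li}_2(1 - z) =
\mathrm{Li}_2(1) - \log(z)\log(1 - z)$, where $\mathrm{Li}_2(1) = \zeta(2)$, for $\zeta(s)$
the Riemann zeta function. A variant is given by
the Rogers dilogarithm $L(x) = \mathrm{Li}_2(x) + (1/2)\log(x)
\log(1 - x)$. For more details, see Zagier's paper
(Julia et al. 2005, vol. II).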
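
The functional equation is easy to test numerically from the power series. The following is a minimal sketch valid for real arguments strictly between 0 and 1; the function names and the number of series terms are illustrative choices.

```python
import math


def li2(z, terms=400):
    """Dilogarithm from its power series; adequate for 0 <= z < 1 away from 1."""
    return sum(z ** n / n ** 2 for n in range(1, terms + 1))


def reflection_gap(z):
    """Li2(z) + Li2(1 - z) - [zeta(2) - log(z) log(1 - z)], which should vanish."""
    zeta2 = math.pi ** 2 / 6
    return li2(z) + li2(1 - z) - (zeta2 - math.log(z) * math.log(1 - z))


def rogers_l(x):
    """Rogers dilogarithm L(x) = Li2(x) + (1/2) log(x) log(1 - x)."""
    return li2(x) + 0.5 * math.log(x) * math.log(1 - x)
```

In terms of the Rogers dilogarithm the functional equation takes the simpler form $L(x) + L(1 - x) = \zeta(2)$, which the same code confirms.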

The polylogarithms are similarly defined by the
series $\mathrm{Li}_k(z) = \sum_{n=1}^{\infty} z^n / n^k$. In quantum
electrodynamics, there are corrections to the value of the
gyromagnetic ratio, in powers of the fine structure
constant. The correction terms that are known
exactly involve special values of the zeta function
such as $\zeta(3)$, $\zeta(5)$ and values of polylogarithms such
as $\mathrm{Li}_4(1/2)$. The series defining the polylogarithm
function $\mathrm{Li}_s(z) = \sum_{n=1}^{\infty} z^n / n^s$ converges absolutely
for all $s \in \mathbb{C}$ and $|z| < 1$ and has analytic
continuation to $z \in \mathbb{C} \setminus [1, \infty)$. The Fermi-Dirac and
Bose-Einstein distributions are expressed in terms
of the polylogarithm function as


    \int_0^{\infty} \frac{x^s}{e^{x - \mu} \pm 1}\, dx = \mp\Gamma(s + 1)\, \mathrm{Li}_{s+1}(\mp e^{\mu})


The multiple polylogarithms are functions defined
by the expressions


    \mathrm{Li}_{s_1,\ldots,s_r}(z_1,\ldots,z_r) = \sum_{n_1 > n_2 > \cdots > n_r > 0} \frac{z_1^{n_1} \cdots z_r^{n_r}}{n_1^{s_1} \cdots n_r^{s_r}}    [1]


By analytic continuation, the functions
$\mathrm{Li}_{s_1,\ldots,s_r}(z_1, z_2, \ldots, z_r)$ are defined for all complex $s_i$
and for $z_i$ in the complement of the cut $[1, \infty)$ in the
complex plane. Multiple zeta values of weight $k$ and
depth $r$ are given by the expressions


    \zeta(k_1,\ldots,k_r) = \sum_{n_1 > n_2 > \cdots > n_r > 0} \frac{1}{n_1^{k_1} \cdots n_r^{k_r}}    [2]


with $k_i \in \mathbb{N}$, $k_1 \ge 2$ and $k = k_1 + \cdots + k_r$. These satisfy many
combinatorial identities and nontrivial relations over $\mathbb{Q}$.
For an informative overview on the subject, see
Cartier (2002). Notice that, for the sums in [1] and
[2], a different summation convention can also be
found in the literature.
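
A classical example of such a relation over $\mathbb{Q}$ is Euler's identity $\zeta(2,1) = \zeta(3)$, which can be checked by direct summation. The sketch below assumes the ordering $n_1 > n_2 > 0$ for the summation indices, so that the first argument is attached to the larger index; the truncation point used in the usage note is an arbitrary choice, and convergence is slow.

```python
def zeta_2_1(n_max):
    """Partial sum of zeta(2,1) = sum over n1 > n2 > 0 of 1/(n1^2 * n2)."""
    total, harmonic = 0.0, 0.0
    for n1 in range(2, n_max + 1):
        harmonic += 1.0 / (n1 - 1)   # running sum of 1/n2 over n2 < n1
        total += harmonic / n1 ** 2
    return total


def zeta_3(n_max):
    """Partial sum of zeta(3) = sum over n >= 1 of 1/n^3."""
    return sum(1.0 / n ** 3 for n in range(1, n_max + 1))
```

With `n_max = 1_000_000` the two partial sums agree to about four decimal places, the remaining difference being the slowly decaying tail of the double sum.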


Conformal Field Theories and the Dilogarithm 


There is a relation between the torsion elements in 
the algebraic K-theory group K3(C) and rational 
conformally invariant quantum field theories in two 
dimensions (see Nahm (2005)). There is, in fact, a 
map, given by the dilogarithm, from torsion 
elements in the Bloch group (closely related to the 
algebraic K-theory) to the central charges and 
scaling dimensions of the conformal field theories. 

This correspondence arises by considering sums of
the form


    \sum_{m \in \mathbb{N}^r} \frac{q^{Q(m)}}{(q)_m}    [3]


where $(q)_m = (q)_{m_1} \cdots (q)_{m_r}$, $(q)_{m_i} = (1 - q)(1 - q^2) \cdots
(1 - q^{m_i})$ and $Q(m) = m^t A m/2 + bm + h$ has rational
coefficients. Such sums are naturally obtained from
considerations involving the partition function of a
bosonic rational conformal field theory (CFT). In
particular, [3] can define a modular function only if all
the solutions of the equation


    \sum_j A_{ij} \log(x_j) = \log(1 - x_i)    [4]


determine elements of finite order in an extension
$\hat{B}(\mathbb{C})$ of the Bloch group, which accounts for the fact
that the logarithm is multivalued. The Rogers
dilogarithm gives a natural group homomorphism
$(2\pi i)^{-2} L : \hat{B}(\mathbb{C}) \to \mathbb{C}/\mathbb{Z}$, which takes values in $\mathbb{Q}/\mathbb{Z}$
on the torsion elements. These values give the
conformal dimensions of the fields in the theory.
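
The rationality of the dilogarithm values can be seen in the simplest case. As an assumed example, not taken from the text, take $r = 1$ and $A = (2)$, so that eqn [4] reads $2\log x = \log(1 - x)$, that is, $x^2 + x - 1 = 0$. The sketch below evaluates the Rogers dilogarithm at the root in $(0, 1)$ and at its complement, normalized by $L(1) = \pi^2/6$; the normalization is chosen for convenience and is not the map to $\mathbb{C}/\mathbb{Z}$ itself.

```python
import math


def li2(x, terms=400):
    """Dilogarithm from its power series; adequate for 0 <= x < 1 away from 1."""
    return sum(x ** n / n ** 2 for n in range(1, terms + 1))


def rogers_l(x):
    """Rogers dilogarithm L(x) = Li2(x) + (1/2) log(x) log(1 - x)."""
    return li2(x) + 0.5 * math.log(x) * math.log(1 - x)


# Assumed example: r = 1, A = (2); the solution of eqn [4] in (0, 1).
X = (math.sqrt(5) - 1) / 2

L_ONE = math.pi ** 2 / 6                  # L(1) = zeta(2)
RATIO = rogers_l(X) / L_ONE               # expected to be rational
RATIO_COMPLEMENT = rogers_l(1 - X) / L_ONE
```

Numerically `RATIO` equals 3/5 and `RATIO_COMPLEMENT` equals 2/5 to rounding error, illustrating how solutions of [4] of this special kind lead to rational numbers.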


Feynman Graphs 


Multiple zeta values appear in perturbative quantum 
field theory. D Kreimer (2000) developed a connec- 
tion between knot theory and a class of transcen- 
dental numbers, such as multiple zeta values, 
obtained by quantum field-theoretic calculations as 
counterterms generated by corresponding Feynman 
graphs. Broadhurst and Kreimer (1997) identified 
Feynman diagrams with up to nine loops whose 
corresponding counterterms give multiple zeta 
values up to weight 15. Recently, Kreimer showed 
some deep analogies between residues of quantum 
fields and variations of mixed Hodge-Tate struc- 
tures associated to polylogarithms. 

Testing predictions about the standard model of 
elementary particles, in the hope of detecting new 
physics, requires developing effective computational 
methods handling the huge number of terms involved 
in any such calculation, that is, efficient algorithms for 
the expansion of higher transcendental functions to a 
very high order. The interesting fact is that abstract 
number-theoretic objects, such as multiple zeta values 
and multiple polylogarithms, appear naturally in this 
context (cf., e.g., Moch et al. (2002)). The explicit 
recursive algorithms are based on Hopf algebras and 
produce expansions of nested finite or infinite sums 
involving ratios of gamma functions and Z-sums
(Euler-Zagier sums), which naturally generalize multiple
polylogarithms and multiple zeta values. Such
sums typically arise in the calculation of multiscale 
multiloop integrals. The algorithms are designed to 
recursively reduce the Z-sums involved to simpler ones 
with lower weight or depth. 
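
The recursive structure of such sums can be sketched numerically. The code below assumes the definition $Z(n; m_1, \ldots, m_k; x_1, \ldots, x_k) = \sum_{n \geq i_1 > \cdots > i_k > 0} \prod_j x_j^{i_j}/i_j^{m_j}$, which is not spelled out in the text, and evaluates a truncated Z-sum by peeling off one index at a time, so that each pass reduces the depth by one. It is a floating-point illustration only, not the symbolic Hopf-algebra algorithm referred to above; the name `z_sum` is hypothetical.

```python
def z_sum(n, ms, xs):
    """Truncated Z-sum over n >= i_1 > i_2 > ... > i_k > 0 (assumed definition).

    inner[i] holds the Z-sum of the remaining inner indices with upper limit i,
    so each pass of the loop adds one more (outer) summation index.
    """
    inner = [1.0] * (n + 1)                  # depth 0: the empty sum equals 1
    for m, x in zip(reversed(ms), reversed(xs)):
        outer = [0.0] * (n + 1)
        for i in range(1, n + 1):
            outer[i] = outer[i - 1] + x**i / i**m * inner[i - 1]
        inner = outer
    return inner[n]

N = 200000
zeta_2_1 = z_sum(N, (2, 1), (1.0, 1.0))      # truncated multiple zeta value zeta(2,1)
zeta_3 = z_sum(N, (3,), (1.0,))              # truncated zeta(3)
print(zeta_2_1, zeta_3)
```

With all $x_j = 1$ and $n \to \infty$ the Z-sums become multiple zeta values; the two printed numbers agree to about four decimal places, reflecting Euler's identity $\zeta(2,1) = \zeta(3)$ (the depth-two sum converges slowly).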


Galois Theory 


Given a number field $K$, which is an algebraic
extension of $\mathbb{Q}$ of some degree $[K : \mathbb{Q}] = n$, there is
an associated fundamental symmetry group, given
by the absolute Galois group $\mathrm{Gal}(\bar{K}/K)$, where $\bar{K}$ is
an algebraic closure of $K$. Even in the case of $\mathbb{Q}$, the
absolute Galois group $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ is a very complicated
object, far from being fully understood.

One can consider an easier symmetry group,
which is the abelianization of the absolute Galois
group. This corresponds to considering the field $K^{ab}$,
the “maximal abelian extension” of $K$, which has
the property that

$$\mathrm{Gal}(K^{ab}/K) = \mathrm{Gal}(\bar{K}/K)^{ab}$$

The Kronecker-Weber theorem shows that for
$K = \mathbb{Q}$ the maximal abelian extension can be
identified with the cyclotomic field (generated by
all roots of unity), $\mathbb{Q}^{ab} = \mathbb{Q}^{cycl}$, and the Galois
group is identified with $\mathrm{Gal}(\mathbb{Q}^{ab}/\mathbb{Q}) \simeq \hat{\mathbb{Z}}^*$, where
$\hat{\mathbb{Z}}^* = \mathbb{A}_f^*/\mathbb{Q}_+^*$. In general, for other number fields,
one has the “class field theory isomorphism”


$$\theta : \mathrm{Gal}(K^{ab}/K) \stackrel{\simeq}{\to} C_K/D_K$$


where $C_K = \mathbb{A}_K^*/K^*$ is the group of idele classes and
$D_K$ the connected component of the identity in $C_K$. In
general, however, one does not have an explicit
description of the generators of the maximal abelian
extension $K^{ab}$ and the action of the Galois group. This
is the content of the explicit class field theory problem,
Hilbert's 12th problem. In addition to the Kronecker-Weber
case, a complete answer is known in the case of
imaginary quadratic fields $K = \mathbb{Q}(\sqrt{-d})$, with $d > 1$ a
positive integer. In this case generators are obtained by
evaluating modular functions at a point $\tau$ in the
upper-half plane such that $K = \mathbb{Q}(\tau)$ and the Galois
action is described explicitly through the group of
automorphisms of the modular field, through Shimura
reciprocity. For a survey of the explicit class field
theory problem and the case of imaginary quadratic
fields, see Stevenhagen (2001).
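
The simplest instances of the Kronecker-Weber theorem can be checked by hand: every quadratic field sits inside a cyclotomic field, because $\pm p$ has a square root given by a quadratic Gauss sum in the $p$-th roots of unity. The sketch below verifies this numerically for $p = 5$ and $p = 7$; the helper names are hypothetical and the check is floating-point only.

```python
import cmath
import math

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Quadratic Gauss sum: sum over a of (a/p) * zeta^a, with zeta = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * math.pi / p)
    return sum(legendre(a, p) * zeta**a for a in range(1, p))

g5 = gauss_sum(5)   # a combination of 5th roots of unity equal to sqrt(5)
g7 = gauss_sum(7)   # a combination of 7th roots of unity equal to i*sqrt(7)
print(g5, g7)
```

So $\mathbb{Q}(\sqrt{5})$ lies in the field generated by the fifth roots of unity and $\mathbb{Q}(\sqrt{-7})$ in the one generated by the seventh roots of unity.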

As we mentioned above, understanding the
structure of the absolute Galois group $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ is
a fundamental question in number theory. Grothendieck
described, in his famous proposal “Esquisse
d'un programme,” how to obtain an action of
$\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ on an essentially combinatorial object,
the set of “dessins d'enfants.” These are connected
graphs (on a surface) such that the complement of
the graph is a union of open cells and the vertices
have two different markings, with the property
that adjacent vertices have opposite markings. Such
objects arise by considering the projective line $\mathbb{P}^1$
minus three points. Any finite cover of $\mathbb{P}^1$ branched
only over $\{0, 1, \infty\}$ gives an algebraic curve defined
over $\bar{\mathbb{Q}}$. The dessin is the inverse image under the
covering map of the segment $[0,1]$ in $\mathbb{P}^1$. The
absolute Galois group $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ acts on the data of
the curve and the covering map, hence on the set of
dessins. A theorem of Belyi shows that, in fact, all
algebraic curves defined over $\bar{\mathbb{Q}}$ are obtained as
coverings of the projective line ramified only over
the points $\{0, 1, \infty\}$. This has the effect of realizing
the absolute Galois group as a subgroup of outer
automorphisms of the profinite fundamental group
of the projective line minus three points. For a
general reference on the subject, see Schneps (1994).

A different type of Galois symmetry of great 
arithmetic significance is “motivic” Galois theory. 
This will be discussed later in the section dedicated 
to motives, where we discuss a surprising occurrence 
in the context of perturbative quantum field theory 
and renormalization. 


Quantum Statistical Mechanics and Class 
Field Theory 


In quantum statistical mechanics, one considers an
algebra of observables, which is a unital C*-algebra
$\mathcal{A}$ with a time evolution $\sigma_t$. States are given by linear
functionals $\varphi: \mathcal{A} \to \mathbb{C}$ satisfying $\varphi(1) = 1$ and
positivity $\varphi(x^* x) \geq 0$. Equilibrium states $\varphi$ at inverse
temperature $\beta$ satisfy the Kubo-Martin-Schwinger
(KMS) condition, namely, for all $x, y \in \mathcal{A}$ there
exists a bounded holomorphic function $F_{x,y}(z)$ on
the strip $0 < \Im(z) < \beta$, which extends continuously
to the boundary, such that for all $t \in \mathbb{R}$


$$F_{x,y}(t) = \varphi(x \sigma_t(y)) \quad \text{and} \quad F_{x,y}(t + i\beta) = \varphi(\sigma_t(y) x)$$ [5]
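
In finite dimensions the KMS condition [5] is satisfied by the Gibbs state, which gives a quick sanity check. The sketch below assumes a hypothetical diagonal Hamiltonian $H$ on $\mathbb{C}^3$, the time evolution $\sigma_t(y) = e^{itH} y e^{-itH}$ and the state $\varphi(x) = \mathrm{Tr}(e^{-\beta H} x)/\mathrm{Tr}(e^{-\beta H})$; all numerical values are arbitrary illustrative choices.

```python
import cmath

energies = [0.0, 1.0, 2.5]   # eigenvalues of a hypothetical diagonal Hamiltonian H
beta = 0.7                   # inverse temperature (arbitrary choice)

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sigma(z, y):
    """sigma_z(y) = e^{izH} y e^{-izH} for diagonal H; z may be complex."""
    n = len(y)
    return [[cmath.exp(1j * z * (energies[j] - energies[k])) * y[j][k]
             for k in range(n)] for j in range(n)]

def gibbs(x):
    """Gibbs state phi(x) = Tr(e^{-beta H} x) / Tr(e^{-beta H})."""
    weights = [cmath.exp(-beta * e) for e in energies]
    return sum(w * x[i][i] for i, w in enumerate(weights)) / sum(weights)

x = [[1, 2j, 0], [0.5, -1, 3], [1j, 1, 2]]
y = [[0, 1, 1j], [2, 1, 0], [-1, 0.5, 1]]
t = 0.3

def F(z):
    """F_{x,y}(z) = phi(x sigma_z(y)), entire in z in this finite-dimensional case."""
    return gibbs(matmul(x, sigma(z, y)))

lhs = F(t + 1j * beta)                   # F_{x,y}(t + i beta)
rhs = gibbs(matmul(sigma(t, y), x))      # phi(sigma_t(y) x)
print(abs(lhs - rhs))
```

The two sides agree up to rounding, which is the second relation in [5]; the first holds by the definition of `F`.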


Cases of number-theoretic interest arise when one
considers the noncommutative space of commensurability
classes of $\mathbb{Q}$-lattices up to scaling as algebra of
observables, with a natural time evolution determined
by the covolume, as shown in the paper Quantum
Statistical Mechanics of Q-Lattices of Connes-Marcolli
(Julia et al. 2005, vol. I). A $\mathbb{Q}$-lattice in $\mathbb{R}^n$ consists of a
pair $(\Lambda, \phi)$ of a lattice $\Lambda \subset \mathbb{R}^n$ together with a
homomorphism of abelian groups $\phi: \mathbb{Q}^n/\mathbb{Z}^n \to
\mathbb{Q}\Lambda/\Lambda$. Two $\mathbb{Q}$-lattices are commensurable, $(\Lambda_1, \phi_1) \sim
(\Lambda_2, \phi_2)$, iff $\mathbb{Q}\Lambda_1 = \mathbb{Q}\Lambda_2$ and $\phi_1 = \phi_2 \bmod \Lambda_1 + \Lambda_2$.


The Bost-Connes system The quantum statistical
mechanical system considered by Bost and Connes
(1995) corresponds to the case of one-dimensional
$\mathbb{Q}$-lattices. The partition function of the system is
the Riemann zeta function $\zeta(\beta)$. The system has
spontaneous symmetry breaking at $\beta = 1$, with a
single KMS state for all $0 < \beta \leq 1$. For $\beta > 1$, the
extremal equilibrium states are parametrized by the
embeddings of $\mathbb{Q}^{cycl}$ in $\mathbb{C}$ with a free transitive
action of the idele class group $C_\mathbb{Q}/D_\mathbb{Q} = \hat{\mathbb{Z}}^*$. At zero
temperature, the evaluation of KMS$_\infty$ states on
elements of a rational subalgebra intertwines the
action of $\hat{\mathbb{Z}}^*$ by automorphisms of $(\mathcal{A}, \sigma_t)$ with the
action of $\mathrm{Gal}(\mathbb{Q}^{cycl}/\mathbb{Q})$ on the values of the states.
This recovers the explicit class field theory of $\mathbb{Q}$
from a physical perspective.
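
The appearance of the zeta function as a partition function can be made concrete. The sketch below assumes, as is standard for the Bost-Connes system but not stated in the text, that the Hamiltonian has eigenvalues $\log n$ for $n = 1, 2, 3, \ldots$, so that $\mathrm{Tr}(e^{-\beta H}) = \sum_n n^{-\beta}$. The cutoff is an arbitrary truncation and the function name is hypothetical.

```python
import math

def partition_function(beta, cutoff):
    """Truncated Tr(e^{-beta H}) for a Hamiltonian with spectrum {log n : n >= 1}
    (an assumption of this sketch)."""
    return sum(math.exp(-beta * math.log(n)) for n in range(1, cutoff + 1))

z_low_temperature = partition_function(2.0, 100000)    # approaches zeta(2) = pi^2/6
z_critical = partition_function(1.0, 100000)           # harmonic sum, grows like log(cutoff)
print(z_low_temperature, math.pi**2 / 6, z_critical)
```

For $\beta > 1$ the truncated sums converge to $\zeta(\beta)$, while at $\beta = 1$ they grow without bound as the cutoff increases, matching the location of the phase transition.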


Noncommutative space of adele classes The algebra
$\mathcal{A}$ of the Bost-Connes system is the noncommutative
algebra of functions $f(r, \rho)$, for $\rho \in \hat{\mathbb{Z}}$ and $r \in \mathbb{Q}^*$
such that $r\rho \in \hat{\mathbb{Z}}$, with the convolution product


$$f_1 * f_2(r, \rho) = \sum_{s \in \mathbb{Q}^* :\, s\rho \in \hat{\mathbb{Z}}} f_1(r s^{-1}, s\rho)\, f_2(s, \rho)$$ [6]


and the adjoint $f^*(r, \rho) = \overline{f(r^{-1}, r\rho)}$. According to the
general philosophy of Connes style noncommutative
geometry, it is the algebra of coordinates of the
noncommutative space defined by the “bad quotient”
$\mathrm{GL}_1(\mathbb{Q}) \backslash (\mathbb{A}_f \times \{\pm 1\})$, a noncommutative
version of the zero-dimensional Shimura variety
$\mathrm{Sh}(\mathrm{GL}_1, \{\pm 1\}) = \mathrm{GL}_1(\mathbb{Q}) \backslash (\mathrm{GL}_1(\mathbb{A}_f) \times \{\pm 1\})$. Its “dual
system” (in the sense of Connes's duality of type III
and type II factors) is obtained by taking the crossed
product by the time evolution. It gives the algebra of
coordinates of the noncommutative space defined by
the quotient $\mathbb{A}/\mathbb{Q}^*$. This is the noncommutative
space of “adele classes” used by Connes in his
spectral realization of the zeros of the Riemann zeta
function.


The $\mathrm{GL}_2$-system A generalization of the Bost-Connes
system was introduced by Connes and
Marcolli in the paper Quantum Statistical
Mechanics of Q-Lattices (Julia et al. 2005). This
corresponds to the case of two-dimensional
$\mathbb{Q}$-lattices. The partition function is the product
$\zeta(\beta)\zeta(\beta - 1)$. The system in this case has two phase
transitions, with no KMS states for $\beta < 1$. For $\beta > 2$,
the extremal KMS states are parametrized by the
invertible $\mathbb{Q}$-lattices, namely, those for which $\phi$ is an
isomorphism. The algebra $\mathcal{A}$ has an arithmetic
structure given by a rational algebra of unbounded
multipliers. This rational algebra contains modular
functions and Hecke operators. At zero temperature,
extremal KMS states can be evaluated on these
multipliers. Symmetries of $(\mathcal{A}, \sigma_t)$ are realized in part
by endomorphisms (as in the theory of superselection
sectors) and the symmetry group acting on
low-temperature KMS states is the group of
automorphisms of the modular field $\mathrm{GL}_2(\mathbb{A}_f)/\mathbb{Q}^*$. For a
generic set of extremal KMS$_\infty$ states, evaluation at
the rational algebra intertwines this action with the
action on the values of an embedding of the modular
field as a subfield of $\mathbb{C}$.

The complex multiplication system In the case of an
imaginary quadratic field $K = \mathbb{Q}(\tau)$, an analogous
construction is possible. A one-dimensional $K$-lattice is
a pair $(\Lambda, \phi)$ of a finitely generated $\mathcal{O}$-submodule $\Lambda$ of
$\mathbb{C}$, with $\Lambda K = K$, and a homomorphism of $\mathcal{O}$-modules
$\phi: K/\mathcal{O} \to K\Lambda/\Lambda$. Two $K$-lattices are commensurable
iff $K\Lambda_1 = K\Lambda_2$ and $\phi_1 = \phi_2 \bmod \Lambda_1 + \Lambda_2$. Connes et al.
(Preprint 2005) constructed a quantum statistical
mechanical system describing the noncommutative
space of commensurability classes of one-dimensional
$K$-lattices up to scale. The partition function is the
Dedekind zeta function $\zeta_K(\beta)$. The system has a phase
transition at $\beta = 1$ with a unique KMS state for higher
temperatures and extremal KMS states parametrized by
the invertible $K$-lattices at lower temperatures. There is
a rational subalgebra induced by the rational structure
of the $\mathrm{GL}_2$-system (one-dimensional $K$-lattices are also
two-dimensional $\mathbb{Q}$-lattices with compatible notions of
commensurability). The symmetries of the system are
given by the idele class group $\mathbb{A}_{K,f}^*/K^*$. The action is
partly realized by endomorphisms corresponding to the
possible presence of a nontrivial class group (for class
number $> 1$). The values of extremal KMS$_\infty$ states on
the rational subalgebra intertwine the action of the idele
class group with the Galois action on the values. This
fully recovers the explicit class field theory for
imaginary quadratic fields.


Conformal Field Theory and the Absolute 
Galois Group 


Moore and Seiberg considered data associated to any 
rational conformal field theory, consisting of matrices, 
obtained as monodromies of some holomorphic multi- 
valued functions on the relevant moduli spaces, 
satisfying polynomial equations. Under reasonable 
hypotheses, the coefficients of the Moore-Seiberg 
matrices are algebraic numbers. This allows for the 
presence of interesting arithmetic phenomena. Through 
the Chern-Simons/Wess-Zumino-Witten correspon- 
dence, it is possible to construct three-dimensional 
topological field theories from solutions to the Moore- 
Seiberg equations. 

On the arithmetic side, Grothendieck proposed in 
his “Esquisse d’un programme” the existence of a 
Teichmüller tower given by the moduli spaces $M_{g,n}$
of Riemann surfaces of arbitrary genus g and number 
of marked points n, with maps defined by operations 
such as cutting and pasting of surfaces and forgetting 
marked points, all encoded in a family of funda- 
mental groupoids. He conjectured that the whole 
tower can be reconstructed from the first two levels, 
providing, respectively, generators and relations. He 
called this a “game of Lego-Teichmüller.” He also
conjectured that the absolute Galois group acts by 


outer automorphisms on the profinite completion of 
the tower. The basic building blocks of the tower are 
provided by “pairs of pants,” that is, by projective
lines minus three points. 

This leads to a conjectural relation between the 
Moore-Seiberg equations and this Grothendieck- 
Teichmüller setting (cf. Degiovanni 1994) according 
to which solutions of the Moore-Seiberg equations 
provide projective representations of the Teichmüller 
tower, and the action of the absolute Galois group 
$\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ corresponds to the action on the
coefficients of the Moore-Seiberg matrices.

Rational conformal field theories are, in general, 
one of the most promising sources of interactions 
between number theory and physics, involving 
interesting Galois actions, modular forms, Brauer 
groups, and complex multiplication. Some funda- 
mental work in this direction was done by, for 
example, Borcherds and Gannon. 


Arithmetic Algebraic Geometry 


In this section we describe occurrences in physics of 
various aspects of the arithmetic geometry of 
algebraic varieties. 


Arithmetic Calabi-Yau 


In the context of type II string theory, compactified 
on Calabi-Yau 3-folds, Greg Moore considered 
certain black hole solutions and a resulting dynami- 
cal system given by a differential equation in the 
corresponding moduli. The fixed points of these 
equations determine certain “black hole attractor 
varieties.” In the case of varieties obtained from a 
product of elliptic curves or of a K3 surface and an 
elliptic curve, the attractor equation singles out 
an arithmetic property: the elliptic curves have 
complex multiplication. The class number of the 
corresponding imaginary quadratic field counts 
U-duality classes of black holes with the same area. 
Other results point to a relation between the 
arithmetic properties of Calabi-Yau 3-folds and 
conformal field theory. For instance, it was shown 
by Schimmrigk that, in certain cases, the algebraic 
number field defined via the fusion rules of a 
conformal field theory as the field defined by the 
eigenvalues of the integer-valued fusion matrices 


$$\phi_i * \phi_j = (N_i)_j^k\, \phi_k$$


can be recovered from the Hasse—Weil L-function of 
the Calabi-Yau. An interesting case is provided by 
the Gepner model associated with the Fermat 
quintic Calabi-Yau 3-fold. 
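
To see how fusion matrices single out a number field, consider a toy fusion rule with two fields $1$ and $\tau$ satisfying $\tau * \tau = 1 + \tau$. This example is chosen here for illustration and is not the Gepner model mentioned above; the matrix names are hypothetical. The sketch checks that the integer matrices represent the fusion algebra and reads off the field generated by the eigenvalues.

```python
import math

N_one = [[1, 0], [0, 1]]   # fusion matrix of the identity field
N_tau = [[0, 1], [1, 1]]   # fusion matrix of tau: tau*1 = tau, tau*tau = 1 + tau

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eigen_2x2(m):
    """Discriminant and eigenvalues of a 2x2 matrix with real eigenvalues."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4 * det
    root = math.sqrt(disc)
    return disc, ((tr + root) / 2, (tr - root) / 2)

# The matrices satisfy the same relation as the fields: N_tau^2 = N_one + N_tau.
lhs = matmul(N_tau, N_tau)
rhs = [[N_one[i][j] + N_tau[i][j] for j in range(2)] for i in range(2)]

disc, eigenvalues = eigen_2x2(N_tau)
print(lhs == rhs, disc, eigenvalues)
```

The discriminant is 5, so the eigenvalues $(1 \pm \sqrt{5})/2$ generate the real quadratic field $\mathbb{Q}(\sqrt{5})$.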


Arakelov Geometry 


For $K$ a number field and $\mathcal{O}_K$ its ring of integers, a
smooth proper algebraic curve $X$ over $K$ determines
a smooth minimal model $X_{\mathcal{O}_K}$, which defines an
arithmetic surface $X_{\mathcal{O}_K}$ over $\mathrm{Spec}(\mathcal{O}_K)$. The closed
fiber $X_\wp$ of $X_{\mathcal{O}_K}$ over a prime $\wp \in \mathcal{O}_K$ is given by the
reduction mod $\wp$.

When Spec(Ox) is “compactified” by adding the 
Archimedean primes, one can correspondingly 
enlarge the group of divisors on the arithmetic 
surface by adding formal real linear combinations of 
irreducible “closed vertical fibers at infinity.” Such 
fibers are only treated as formal objects. The main 
idea of Arakelov geometry is that it is sufficient to
work with the “infinitesimal neighborhood” $X_\alpha(\mathbb{C})$ of
these fibers, given by the Riemann surfaces obtained
from the equation defining $X$ over $K$ under the
embeddings $\alpha: K \to \mathbb{C}$ that constitute the Archimedean
primes. Arakelov developed a consistent intersection
theory on arithmetic surfaces, by computing
the contribution of the Archimedean primes to the 
intersection indices using Hermitian metrics on these 
Riemann surfaces and the Green function of the 
Laplacian. 

A general introduction to the subject of Arakelov 
geometry can be found in Lang (1988). Manin 
(1991) showed that these Green functions can be 
computed in terms of geodesics in a hyperbolic 
3-manifold that has the Riemann surface $X_\alpha(\mathbb{C})$ as
its conformal boundary at infinity. 


The Polyakov measure A first application to 
physics of methods of Arakelov geometry was an 
explicit formula obtained by Beilinson and Manin 
(1986) for the Polyakov bosonic string measure in 
terms of Faltings’s height function at algebraic 
points of the moduli space of curves. 

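The partition function for the closed bosonic
string has a perturbative expansion $Z = \sum_{g \geq 0} Z_g$
with


$$Z_g = e^{\beta \chi(X)} \int e^{-S(x, \gamma)}\, Dx\, D\gamma$$ [7]


written in terms of a compact Riemann surface $X$ of
genus $g$, maps $x: X \to \mathbb{R}^d$, and metrics $\gamma$ on $X$. The
classical action is of the form


$$S(x, \gamma) = \int_X d^2 z\, \sqrt{|\gamma|}\, \gamma^{ab}\, \partial_a x^\mu\, \partial_b x^\mu$$ [8]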


Using the invariance of the classical action with
respect to the semidirect product of diffeomorphisms
of $X$ and the conformal group, the integral is reduced
(in the critical dimension $d = 26$ where the conformal
anomaly cancels) to a zeta regularized
determinant of the Laplacian for the metric on $X$
and an integration over the moduli space $M_g$ of genus
$g$ algebraic curves. Beilinson and Manin gave an
explicit formula for the resulting Polyakov measure
on $M_g$ using results of Faltings on Arakelov geometry
of arithmetic surfaces. In particular, their argument
uses essentially the properties of the Faltings metrics
on the invertible sheaves $d(L)$ given by the “multiplicative
Euler characteristics” of sheaves $L$ of
relative 1-forms. For a suitable choice of bases $\{\phi_j\}$
and $\{w_i\}$ of differentials and quadratic differentials,
the formula for the Polyakov measure is then of the
form (up to a multiplicative constant)


$$d\pi_g = |\det B|^{-18}\, (\det \Im \tau)^{-13}\, W_1 \wedge \bar{W}_1 \wedge \cdots \wedge
W_{3g-3} \wedge \bar{W}_{3g-3}$$ [9]


with $\tau$ in the Siegel upper-half space, $B_{ij}$ the periods
of the differentials $\phi_j$,
and the $W_i$ given by the images of the basis $w_i$ under
the Kodaira-Spencer isomorphism.


Holography In the case of the elliptic curve
$X_q(\mathbb{C}) = \mathbb{C}^*/q^{\mathbb{Z}}$, a formula of Alvarez-Gaume,
Moore, and Vafa gives the operator product expansion
of the path integral for bosonic field theory as


$$g(z, 1) = \log \left( |q|^{B_2(\log|z|/\log|q|)/2}\, |1 - z|
\prod_{n=1}^{\infty} |1 - q^n z|\, |1 - q^n z^{-1}| \right)$$ [10]


where $B_2$ is the second Bernoulli polynomial.
Expression [10] is in fact the Arakelov Green function
on $X_q(\mathbb{C})$ (cf. Lang (1988)).

Using this and analogous results for higher genus 
Riemann surfaces, Manin and Marcolli (2001) 
showed that the result of Manin (1991) on Arakelov 
and hyperbolic geometry can be rephrased in terms 
of the AdS/CFT correspondence, or holography 
principle. Expression [10] can then be written as a
combination of terms involving geodesic lengths in 
the Euclidean BTZ black hole. 

In the case of higher genus curves, the Arakelov 
Green function on a compact Riemann surface, 
which is related to the two-point correlation func- 
tion for bosonic field theory, can be expressed in 
terms of the semiclassical limit of gravity (the 
geodesic propagator) on the bulk space of Euclidean 
versions of asymptotically AdS$_{2+1}$ black holes
introduced by K Krasnov. 


Motives 


There are several cohomology theories for algebraic
varieties: de Rham, Betti, étale cohomology. de Rham
and Betti are related by the period isomorphism, and
comparison isomorphisms relate étale and Betti
cohomology. In the smooth projective case, they
have the expected properties of Poincaré duality,
Künneth isomorphisms, etc. Moreover, étale cohomology
provides interesting $\ell$-adic representations of
$\mathrm{Gal}(\bar{k}/k)$. In order to understand what type of
information, such as maps or operations, can be
transferred from one cohomology to another,
Grothendieck introduced the idea of the existence of
a “universal cohomology theory” with realization
functors to all the known cohomology theories for
algebraic varieties. He called this the theory of
“motives.” Properties that can be transferred between
different cohomology theories are those that exist at
the motivic level. A short introduction to motives can
be found in Serre (1992).

The first construction of a category of motives
proposed by Grothendieck covers the case of smooth
projective varieties. The corresponding motives form
a $\mathbb{Q}$-linear abelian category of “pure motives.”
Roughly, objects are varieties and morphisms are
“correspondences” given by algebraic cycles in the
product, modulo a suitable equivalence relation. The
category also contains Tate objects generated by
$\mathbb{Q}(1)$, which is the inverse of the pure motive
$H^2(\mathbb{P}^1)$. Grothendieck's standard conjectures imply
that the category of pure motives is equivalent to the
category of representations $\mathrm{Rep}_G$ of a “motivic
Galois group,” which in the case of pure motives is
proreductive. The subcategory of pure Tate motives
has as motivic Galois group the multiplicative group
$\mathbb{G}_m$. The situation is more complicated for “mixed
motives,” for which constructions were only very
recently proposed (e.g., in the work of Voevodsky).
These provide a universal cohomology theory for
more general classes of algebraic varieties. Mixed
Tate motives are the subcategory generated by the
Tate objects. There is again a motivic Galois group.
For mixed motives it is an extension of a proreductive
group by a prounipotent group, with the
proreductive part coming from pure motives and
the prounipotent part from the presence of a weight
filtration on mixed motives. The multiple zeta values
appear as periods of mixed Tate motives.


Renormalization and Motivic Galois Theory 


A manifestation of motivic Galois groups in physics 
arises in the context of the Connes-Kreimer theory of 
perturbative renormalization (for an introduction to 
this topic, see Hopf Algebra Structure of Renormaliz- 
able Quantum Field Theory). In fact, according to the 
Connes-Kreimer theory, the Bogoliubov-Parasiuk-Hepp-Zimmermann
(BPHZ) renormalization scheme


with dimensional regularization and minimal subtrac- 
tion can be formulated mathematically in terms of the 
Birkhoff factorization 


$$\gamma(z) = \gamma_-(z)^{-1}\, \gamma_+(z)$$ [11]


of loops in a prounipotent Lie group $G$, which is the
group of characters of the Hopf algebra of Feynman
graphs. Here, the loop $\gamma$ is defined on a small
punctured disk around the critical dimension $D$, $\gamma_+$
is holomorphic in a neighborhood of $D$, and $\gamma_-$ is
holomorphic in the complement of $D$ in $\mathbb{P}^1(\mathbb{C})$. The
renormalized value is given by $\gamma_+(D)$ and the
counterterms by $\gamma_-(z)$.

The paper of Connes and Marcolli Renormaliza- 
tion, the Riemann-Hilbert Correspondence, and 
Motivic Galois Theory in volume II of Julia et al. 
(2005) shows that the data of the Birkhoff factor- 
ization are equivalently described in terms of 
solutions to a certain class of differential systems 
with irregular singularities. This is obtained by 
writing the terms in the Birkhoff factorization as 
time-ordered exponentials, and then using the fact 
that 


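$$T e^{\int_a^b \alpha(t)\, dt} := 1 + \sum_{n \geq 1} \int_{a \leq s_1 \leq \cdots \leq s_n \leq b}
\alpha(s_1) \cdots \alpha(s_n)\, ds_1 \cdots ds_n$$


is the value $g(b)$ at $b$ of the unique solution $g(t) \in G$
with value $g(a) = 1$ of the differential equation
$dg(t) = g(t)\alpha(t)\, dt$.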
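
The relation between the time-ordered exponential and the differential equation can be checked numerically for matrices. The sketch below uses a hypothetical noncommuting $2 \times 2$ example $\alpha(t)$ on $[0, 1]$ in place of the Lie algebra of $G$. It compares a Runge-Kutta solution of $g'(t) = g(t)\alpha(t)$ with the truncated series of iterated integrals, computed through the recursion $T_n(t) = \int_a^t T_{n-1}(s)\alpha(s)\,ds$ (the latest time stands rightmost, as in the ordering $s_1 \leq \cdots \leq s_n$).

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def alpha(t):
    """Hypothetical example: alpha(s) and alpha(t) do not commute for s != t."""
    return [[0.0, 1.0], [t, 0.0]]

def solve_ode(a, b, steps=2000):
    """Classical RK4 for g'(t) = g(t) alpha(t) with g(a) = 1."""
    h, g = (b - a) / steps, IDENTITY
    for k in range(steps):
        t = a + k * h
        k1 = matmul(g, alpha(t))
        k2 = matmul(add(g, scale(h / 2, k1)), alpha(t + h / 2))
        k3 = matmul(add(g, scale(h / 2, k2)), alpha(t + h / 2))
        k4 = matmul(add(g, scale(h, k3)), alpha(t + h))
        g = add(g, scale(h / 6, add(add(k1, scale(2, k2)), add(scale(2, k3), k4))))
    return g

def ordered_series(a, b, order=12, steps=2000):
    """1 + sum of iterated integrals, via T_n(t) = int_a^t T_{n-1}(s) alpha(s) ds."""
    h = (b - a) / steps
    grid = [a + k * h for k in range(steps + 1)]
    term, total = [IDENTITY for _ in grid], IDENTITY
    for _ in range(order):
        integrand = [matmul(T, alpha(s)) for T, s in zip(term, grid)]
        nxt = [ZERO]
        for k in range(1, steps + 1):   # cumulative trapezoid rule
            nxt.append(add(nxt[-1], scale(h / 2, add(integrand[k - 1], integrand[k]))))
        term = nxt
        total = add(total, term[-1])
    return total

g_ode = solve_ode(0.0, 1.0)
g_series = ordered_series(0.0, 1.0)
gap = max(abs(g_ode[i][j] - g_series[i][j]) for i in range(2) for j in range(2))
print(gap)
```

The two results agree up to discretization error; reversing the order of the factors in the recursion would instead solve $g'(t) = \alpha(t) g(t)$.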

The singularity types are specified by physical
conditions, such as the independence of the counterterms
on the mass scale. These conditions are
expressed geometrically through the notion of
$G$-valued “equisingular connections” on a principal
$\mathbb{C}^*$-bundle $B$ over a disk $\Delta$, where $G$ is the
prounipotent Lie group of characters of the
Connes-Kreimer Hopf algebra of Feynman graphs.
The “equisingularity” condition is the property that
such a connection $\omega$ is $\mathbb{C}^*$-invariant and that its
restrictions to sections of the principal bundle that
agree at $0 \in \Delta$ are mutually equivalent, in the sense
that they are related by a gauge transformation by a
$G$-valued $\mathbb{C}^*$-invariant map regular in $B$; hence, they
have the same type of (irregular) singularity at the
origin.

The classification of equivalence classes of these
differential systems via the Riemann-Hilbert correspondence
and differential Galois theory yields a
Galois group $U^* = U \rtimes \mathbb{G}_m$, where $U$ is prounipotent,
with Lie algebra the free graded Lie algebra with
one generator $e_{-n}$ in each degree $n \in \mathbb{N}$. The group
$U^*$ is identified with the motivic Galois group of
mixed Tate motives over the cyclotomic ring
$\mathbb{Z}[e^{2\pi i/N}]$, for $N = 3$ or $N = 4$, localized at $N$.


Speculations on Arithmetical Physics 


In a lecture written for the 25th Arbeitstagung in 
Bonn, Y Manin presented intriguing connections 
between arithmetic geometry (especially Arakelov 
geometry) and physics. The theme is also discussed 
in Manin (1989). These considerations are based on a 
philosophical viewpoint according to which funda- 
mental physics might, like adeles, have Archimedean 
(real or complex) as well as non-Archimedean 
(p-adic) manifestations. Since adelic objects are 
more fundamental and often simpler than their 
Archimedean components, one can hope to use this 
point of view in order to carry over some computa- 
tion of physical relevance to the non-Archimedean 
side where one can employ number-theoretic methods. 


Adelic physics? Some of the results mentioned in 
the previous sections seem to lend themselves well to 
this adelic interpretation. The quantum statistical 
mechanics of Q-lattices relies fundamentally on 
adeles and it admits generalizations to systems 
associated to other algebraic varieties (Shimura 
varieties) that have an adelic description and adelic 
groups of symmetries. The result on the Polyakov 
measure also has an adelic flavor, as it uses 
essentially the Archimedean component of the 
Faltings height function. The latter is in fact a 
product of contributions from all the Archimedean 
and non-Archimedean places of the field of defini- 
tion of algebraic points in the moduli space, so that 
one can expect that there would be an adelic 
Polyakov measure, of which one normally sees the 
Archimedean side only. The Freund-Witten adelic 
product formula for the Veneziano string amplitude 
fits in the same context, with p-adic amplitudes 


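$$B_p(\alpha, \beta) = \int_{\mathbb{Q}_p} |x|_p^{\alpha - 1}\, |1 - x|_p^{\beta - 1}\, dx$$


and $B_\infty(\alpha, \beta)^{-1} = \prod_p B_p(\alpha, \beta)$ (cf. Varadarajan
(2004)).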


Adelic physics and motives A similar adelic philo- 
sophy was taken up by other authors, who proposed 
ways of introducing non-Archimedean and adelic 
geometries in quantum physics. A recent survey is 
given in Varadarajan (2004). For instance, Volovich 
(1995) proposed spacetime models based on 
cohomological realizations of motives, with étale 
topology “interpolating” between a proposed non- 
Archimedean geometry at the Planck scale and 
Euclidean geometry at the macroscopic scale. In 
this viewpoint, motivic L-functions appear as parti- 
tion functions and actions of motivic Galois groups 
govern the dynamics. 

See also: Hopf Algebra Structure of Renormalizable
Quantum Field Theory; Mirror Symmetry: A Geometric 
Survey; Quantum Ergodicity and Mixing of 
Eigenfunctions; Random Matrix Theory in Physics; 
Regularization for Dynamical Zeta Functions. 


Introduction


An operad is an abstraction of a family of composable 
functions of n variables for various n, useful for the 
"bookkeeping" and applications of such families. 
Operads are particularly important and useful in 
categories with a good notion of “homotopy,” where
they play a key role in organizing hierarchies of higher 
homotopies, reflecting their original use as a tool in 
homotopy theory, especially for studying (iterated) 
loop spaces. For several years now, operads have 
become increasingly important in mathematical 
physics, especially in string field theory, where they 
organize the terms of higher order in perturbed 
actions, and in deformation quantization. 

The major focus of this article will be on operads as 
they are relevant to mathematical physics, but will also 
include some background material from homotopy 
theory, where they originated. A borderland where 
homotopy theory and cohomological physics overlap is 
the world of differential graded vector spaces, including 
those of differential forms, ghosts, anti-ghosts, etc., 
sometimes lumped together as BRST theory. Here, as
elsewhere in contemporary mathematical physics, the
flow has been in both directions: sometimes physicists
have discovered or reinvented known mathematics while
finding new applications; at other times physics has
suggested new concepts for mathematicians to develop
further. In the case of operads, they have provided a
general structure for varieties of algebras, some of
which are novel types contributed by physicists.

For a reasonably up-to-date introduction and 
survey, consider Markl et al. (2002), although there 
have been many developments since then. Two 
particularly important original works are Boardman 
and Vogt (1973) and May (1972). 


Definitions and Examples 


The term “operad” is due to May, building on work
of Stasheff and of Boardman-Vogt. The most
fundamental example of an operad is the endomorphism
operad $\mathcal{E}nd_X := \{\mathrm{Map}(X^n, X)\}_{n \geq 1}$, where,
for a set or topological space $X$, $\mathrm{Map}(X^n, X)$
means the set or space of functions or continuous
functions from the $n$-fold product of $X$ with itself to
$X$, together with the operations


$$\circ_i : \mathrm{Map}(X^n, X) \times \mathrm{Map}(X^m, X) \to \mathrm{Map}(X^{n+m-1}, X)$$


given, for $1 \leq i \leq n$, by


$$(f \circ_i g)(x_1, \ldots, x_{n+m-1}) = f(x_1, \ldots, x_{i-1}, g(x_i, \ldots, x_{i+m-1}), x_{i+m}, \ldots, x_{n+m-1})$$


In the endomorphism operad $\mathcal{E}nd_X$, there are
easily discovered relations involving iterated $\circ_i$-operations
and the symmetric group $\Sigma_n$ actions on
the $X^n$'s. For example,


$$(f \circ_i g) \circ_j h = f \circ_i (g \circ_{j-i+1} h)
\quad \text{for } i \leq j \leq i + m - 1$$


if $g$ is a function of $m$ variables, since only the name
of the position for the insertion is changed.
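
The $\circ_i$ operations and the relation above are easy to experiment with. The sketch below models elements of $\mathrm{Map}(X^n, X)$ as Python callables tagged with their arity; the class `Op` and the function `insert` are hypothetical names introduced for this illustration. String-valued operations are used so that the nesting of the compositions is visible in the output.

```python
class Op:
    """A function of `arity` variables, standing in for an element of Map(X^n, X)."""

    def __init__(self, arity, fn):
        self.arity, self.fn = arity, fn

    def __call__(self, *args):
        assert len(args) == self.arity
        return self.fn(*args)

def insert(f, i, g):
    """f o_i g: feed the output of g into the i-th input of f (1-indexed)."""
    assert 1 <= i <= f.arity
    def composed(*xs):
        inner = g(*xs[i - 1 : i - 1 + g.arity])
        return f(*xs[: i - 1], inner, *xs[i - 1 + g.arity :])
    return Op(f.arity + g.arity - 1, composed)

f = Op(3, lambda a, b, c: f"f({a},{b},{c})")
g = Op(2, lambda a, b: f"g({a},{b})")      # m = 2 variables
h = Op(2, lambda a, b: f"h({a},{b})")

xs = ("x1", "x2", "x3", "x4", "x5")
i = 2
checks = []
for j in range(i, i + g.arity):            # i <= j <= i + m - 1
    left = insert(insert(f, i, g), j, h)
    right = insert(f, i, insert(g, j - i + 1, h))
    checks.append(left(*xs) == right(*xs))
print(checks, insert(insert(f, 2, g), 3, h)(*xs))
```

For $i = 2$, $j = 3$ both sides evaluate to `f(x1,g(x2,h(x3,x4)),x5)`: inserting into $g$ after $g$ has been inserted into $f$ only shifts the label of the insertion slot.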

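An operad $(\mathcal{O}, \circ_i)$ consists of a collection
$\{\mathcal{O}(n)\}_{n \geq 1}$ of objects and maps $\circ_i : \mathcal{O}(n) \times \mathcal{O}(m) \to
\mathcal{O}(n + m - 1)$ for $m, n \geq 1$ and $i \leq n$ satisfying the
relations manifest in the example $\mathcal{E}nd_X$.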

May’s original definition corresponds to simulta- 
neous insertions into all possible positions of inputs 
into f € Map(X", X). In most examples, the struc- 
tures are *manifest" without appeal to the technical 
definitions. 

It helps to see graphic examples of operads, 
particularly ones relevant for physics. Two kinds 
that are particularly important are the tree operads 
and the little disks (or cubes) operads. 

Let $\mathcal{T}(n)$ be the set of planar trees with one root
and $n$ leaves labeled (arbitrarily) 1 through $n$. The
collection $\mathcal{T} = \{\mathcal{T}(n)\}_{n \geq 1}$ of sets of trees forms an
operad by grafting the root of $g$ to the leaf of $f$
labeled $i$, as in Figure 1, where the leaves are
assumed labeled in order from left to right. Figure 1
can be interpreted as portraying the $\circ_4$ result of
inserting a 3-linear operation into a 5-linear one.

Figure 1 Grafting with the leaves numbered from left to right.

The little $n$-disks operad $\mathcal{D}_n = \{\mathcal{D}_n(j)\}_{j \geq 1}$, where
$\mathcal{D}_n(j)$ consists of an ordered collection of $j$ $n$-disks
embedded in the standard $n$-dimensional unit disk
$D^n$ with disjoint interiors, the embedding being of
the form $z \mapsto az + b$ with $0 < a \in \mathbb{R}$. The operations are
given as indicated in Figure 2.

Figure 2 The little 2-disks operad.
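
For the little 2-disks operad the $\circ_i$ operation can be written down directly. The sketch below assumes that a configuration is stored as a list of pairs $(a, b)$, with $a > 0$ the radius and $b$ a complex center, so that each little disk is the image of the unit disk under $z \mapsto az + b$. The composition $f \circ_i g$ replaces the $i$-th disk of $f$ by a rescaled copy of the configuration $g$. The function names and the sample configurations are hypothetical.

```python
def is_valid(config, eps=1e-12):
    """Disks lie in the unit disk and have pairwise disjoint interiors."""
    inside = all(abs(b) + a <= 1 + eps for a, b in config)
    disjoint = all(abs(b1 - b2) >= a1 + a2 - eps
                   for k, (a1, b1) in enumerate(config)
                   for (a2, b2) in config[k + 1:])
    return inside and disjoint

def compose(f, i, g):
    """f o_i g: push the disks of g through the i-th embedding z -> a*z + b of f."""
    a_i, b_i = f[i - 1]
    rescaled = [(a_i * a, a_i * b + b_i) for a, b in g]
    return f[: i - 1] + rescaled + f[i:]

f = [(0.4, -0.5 + 0j), (0.4, 0.5 + 0j)]      # two disks side by side
g = [(0.45, -0.5j), (0.45, 0.5j)]            # two disks stacked vertically
fg = compose(f, 1, g)                        # three disks: arity 2 + 2 - 1
print(len(fg), is_valid(f), is_valid(g), is_valid(fg))
```

Because the embeddings are affine with positive real scale factor, a composite of valid configurations is again valid, and the number of little disks follows the rule $n + m - 1$.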

Just as group theory without representations is 
rather sterile, so are operads best appreciated by 
their representations known as (varieties of) alge- 
bras, especially algebras with higher homotopies. 

An algebra $A$ over an operad $\mathcal{P}$ “is” a map of
operads $\mathcal{P} \to \mathcal{E}nd_A$. This is just a compact way
of saying that an algebra $A$ has a coherent system
of maps $\mathcal{P}(n) \times A^n \to A$. Much of this article will
speak in terms of such algebras with the corresponding
operad being understood.


Operads in Homotopy Theory 


A major motivation for the development of operads 
was the desire to have a homotopy-invariant char- 
acterization of based loop spaces and iterated loop 
spaces. Precisely such coherent systems of higher 
homotopies provided the answers. For based loop
spaces, the operad in question $\mathcal{K} = \{K_n\}_{n \geq 1}$ consists of
the polytopes known as “associahedra.” The usual
product of based loops is only homotopy associative.

If we fix a specific associating homotopy and 
consider the five ways of parenthesizing the product 
of four loops, there results a pentagon whose edges 
correspond to a path of loops (Figure 3). 

From the leftmost vertex to the rightmost, consider 
the two paths of loops across the top or around the 
bottom. By further adjustment of parameters, the 
pentagon can be filled in by a family of such paths. 

Figure 3 The associahedron $K_4$, a pentagon with vertices
(ab)(cd), a(b(cd)), ((ab)c)d, a((bc)d), and (a(bc))d.

The associahedron $K_n$ can be described as a
convex polytope with one vertex for each way of
associating $n$ ordered variables, that is, ways of
inserting parentheses in a meaningful way in a word
of $n$ letters. The edges correspond to a single
application of an associating homotopy. More
generally, the cellular structure of the associahedra
is well described by planar rooted trees, the vertices
corresponding to binary trees and so forth (see
Figure 4).
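
The vertex count of $K_n$ can be reproduced by enumerating the complete parenthesizations of a word of $n$ letters, which are counted by the Catalan numbers. The sketch below does this by brute force; the function names are hypothetical, and each bracketing carries an outermost pair of parentheses that the figure omits.

```python
import math

def bracketings(letters):
    """All ways of fully parenthesizing a word, keeping the letters in order."""
    if len(letters) == 1:
        return [letters]
    out = []
    for k in range(1, len(letters)):            # position of the outermost split
        for left in bracketings(letters[:k]):
            for right in bracketings(letters[k:]):
                out.append(f"({left}{right})")
    return out

def catalan(k):
    return math.comb(2 * k, k) // (k + 1)

vertices_K4 = bracketings("abcd")
vertices_K5 = bracketings("abcde")
print(vertices_K4, len(vertices_K5))
```

For four letters this lists the five vertices of the pentagon $K_4$, and for five letters it gives the 14 vertices of $K_5$, matching the Catalan numbers $C_3 = 5$ and $C_4 = 14$.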

For $K_5$, see Figure 5 or a rotatable image available at
http://igd.univ-lyon1.fr/~chapoton/stasheff.html. The
facets are all products of two associahedra of lower
dimension and specific imbeddings can be given to play
the role of the $\circ_i$ operations as in an operad.

An $A_\infty$-space is a space $Y$ which admits a coherent
family of maps

$$m_n : K_n \times Y^n \to Y$$

so that they make $Y$ an algebra over the operad
(without $\Sigma_n$-actions) $\mathcal{K} = \{K_n\}_{n \geq 1}$.

The main result by Stasheff is: A connected space
$Y$ (of the homotopy type of a CW-complex) has the
homotopy type of a based loop space $\Omega X$ for some
$X$ if and only if $Y$ is an $A_\infty$-space.

Homotopy characterization of iterated loop
spaces $\Omega^n X$, for some space $X$, required the full
power of the theory of operads with the symmetries.


Figure 4 $K_4$ with vertices labeled by trees.

Figure 5 The associahedron $K_5$.


An early motivation for the invention of a theory
of operads was the consideration of infinite loop
spaces, that is, a sequence of spaces $X_n$ such that
each $X_n$ is homotopy equivalent to $\Omega X_{n+1}$.

Although introduced originally in the category of 
topological spaces, operads were available almost 
immediately for differential graded (dg) vector 
spaces, also known as chain complexes. In physics, 
the differential is often called a BRST operator, a 
term that should be reserved for a special kind of dg 
algebra, see below. 


Operads in Algebra 


The $\circ_i$ notation first appeared in Gerstenhaber's study
of the algebraic structure of the Hochschild cohomology
of an associative algebra, about the same time as
the construction of the associahedra, where the operations
were given in a less convenient notation. Recall that
the Hochschild cohomology of an associative algebra
$A$ is the homology of the complex $\mathrm{Hom}(A^{\otimes n}, A)$ with
the coboundary given as follows (all signs below are
indicated as $\pm$; any of the standard references will
specify conventions and signs): for $f \in \mathrm{Hom}(A^{\otimes p}, A)$
and $g \in \mathrm{Hom}(A^{\otimes q}, A)$ let

$$f \circ g = \sum_i \pm\, f \circ_i g \qquad [1]$$

Gerstenhaber then defines his bracket as $[f, g] = f \circ g \pm g \circ f$.
With hindsight, he realized that the
Hochschild coboundary can be written as

$$\delta h = [m, h] \qquad [2]$$

where $m : A \otimes A \to A$ is the multiplication. Moreover,
the associativity of $m$ is equivalent to

$$[m, m] = 0 \qquad [3]$$
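As an informal illustration of the $\circ_i$ operations, the following Python sketch treats operations as plain functions of several arguments. It ignores linearity, gradings and signs entirely, and the names `circ_i` and `associator` are hypothetical helpers introduced here. In the ungraded case, and with one common choice of signs, $m \circ m$ is the difference $m \circ_1 m - m \circ_2 m$, so its vanishing is exactly associativity.

```python
def circ_i(f, p, g, q, i):
    """Return f o_i g: plug g (q inputs) into slot i (1-based) of f (p inputs)."""
    if not 1 <= i <= p:
        raise ValueError("slot index i must satisfy 1 <= i <= p")

    def composed(*args):
        if len(args) != p + q - 1:
            raise ValueError("expected p + q - 1 arguments")
        inner = g(*args[i - 1:i - 1 + q])          # evaluate g on its block
        return f(*args[:i - 1], inner, *args[i - 1 + q:])

    return composed


def associator(m):
    """Return a function giving both sides of associativity for a binary m."""
    left = circ_i(m, 2, m, 2, 1)    # (a, b, c) -> m(m(a, b), c)
    right = circ_i(m, 2, m, 2, 2)   # (a, b, c) -> m(a, m(b, c))
    return lambda a, b, c: (left(a, b, c), right(a, b, c))


if __name__ == "__main__":
    print(associator(lambda a, b: a + b)("x", "y", "z"))  # concatenation
    print(associator(lambda a, b: a - b)(5, 3, 1))        # subtraction
```

String concatenation gives two equal results, while subtraction does not, reflecting that only the former is associative.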


$A_\infty$-Algebras


In the setting of graded vector spaces $V = \bigoplus_{n \in \mathbb{Z}} V^n$,
there are two conventions for defining $A_\infty$-algebras,
which differ by a shift in grading. We adopt the
physics convention so that $A$ here is the suspension
of that considered in the original papers. The
cellular chains of the associahedra form the $A_\infty$-operad,
providing the following definition.


Definition 1 $A_\infty$-algebra (Strong homotopy associative
algebra). Let $A$ be a $\mathbb{Z}$-graded vector space
$A = \bigoplus_{r \in \mathbb{Z}} A^r$ and suppose that there exists a collection
of degree 1 multilinear maps

$$\mathfrak{m} := \{m_p : A^{\otimes p} \to A\}_{p \geq 1}$$

$(A, \mathfrak{m})$ is called an $A_\infty$-algebra when the multilinear
maps $m_p$ satisfy the following relations:

$$\sum_{p+q=n+1} \sum_{i=1}^{p} \pm\, m_p \circ_i m_q = 0 \qquad [4]$$

with an appropriate set of signs, for $n \geq 1$.
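To make the defining relations concrete, it may help to write out the first few cases, with all signs left as $\pm$ in the same spirit as the rest of this article. This is only an unpacking of the sum over $p + q = n + 1$ and $1 \leq i \leq p$, writing $m_p \circ_i m_q$ as $m_p$ with $m_q$ inserted in the $i$th slot.

```latex
\begin{align*}
n=1:\quad & m_1 m_1 = 0, \\
n=2:\quad & m_1 m_2 \pm m_2(m_1 \otimes 1) \pm m_2(1 \otimes m_1) = 0, \\
n=3:\quad & m_2(m_2 \otimes 1) \pm m_2(1 \otimes m_2) \pm m_1 m_3 \\
          & \pm m_3(m_1 \otimes 1 \otimes 1) \pm m_3(1 \otimes m_1 \otimes 1)
            \pm m_3(1 \otimes 1 \otimes m_1) = 0.
\end{align*}
```

Thus $m_1$ is a differential, $m_1$ is a derivation of $m_2$, and $m_2$ is associative up to the homotopy $m_3$.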


A weak $A_\infty$-algebra consists of a collection of
degree 1 multilinear maps

$$\mathfrak{m} := \{m_p : A^{\otimes p} \to A\}_{p \geq 0}$$

satisfying the above relations, but for $n \geq 0$ and in
particular with $p, q \geq 0$.


Remark 1 The “weak” version is fairly new,
inspired by physics, where $m_0 : \mathbb{C} \to A$, regarded as
an element $m_0(1) \in A$, is related to what physicists
refer to as a “background.” The augmented relation
then implies that $m_0(1)$ is a cycle, but $m_1 m_1$ need no
longer be 0, rather

$$m_1 m_1 = \pm\, m_2(m_0 \otimes 1) \pm m_2(1 \otimes m_0) \qquad [5]$$

Just as associativity was captured by the equation
$[m, m] = 0$, so the defining relations of the definition
of an $A_\infty$-algebra are captured, for the whole collection
$\mathfrak{m}$ of maps $m_p$, by

$$[\mathfrak{m}, \mathfrak{m}] = 0 \qquad [6]$$


Decades later it was realized that, considering
$T^c A = \bigoplus_n A^{\otimes n}$ as a coalgebra with

$$\Delta(a_1 \otimes \cdots \otimes a_n) = \sum_{p+q=n} (a_1 \otimes \cdots \otimes a_p) \otimes (a_{p+1} \otimes \cdots \otimes a_n)$$

we have an isomorphism

$$\bigoplus_n \mathrm{Hom}(A^{\otimes n}, A) \cong \mathrm{Coder}(T^c A)$$

Here Coder is the space of all coderivations of $T^c A$.
The Gerstenhaber bracket is indeed the “intrinsic”
commutator bracket of coderivations via the above
isomorphism. As such, it satisfies a graded version of
the Jacobi identity; after a shift in grading from the
original one of Hochschild, the Hochschild cochain
complex forms a dg Lie algebra.


$L_\infty$-Algebras


Since an ordinary Lie algebra $\mathfrak{g}$ is regarded as
ungraded, the defining bracket is regarded as skew-symmetric.
For dg Lie algebras and $L_\infty$-algebras, we
need graded symmetry, which refers to symmetry with
signs determined by the grading. The basic operation is

$$\tau : x \otimes y \mapsto (-1)^{|x||y|}\, y \otimes x \qquad [7]$$

Also we adopt the convention that tensor products
of graded functions or operators have the signs built
in; for example, $(f \otimes g)(x \otimes y) = (-1)^{|g||x|} f(x) \otimes g(y)$.
By decomposing each permutation as a product of
transpositions, there is then defined the sign of a
permutation of $n$ graded elements, for example, for
any $c_i \in V$, $1 \leq i \leq n$, and any $\sigma \in \mathfrak{S}_n$, the permutation
of $n$ graded elements is defined by

$$\sigma(c_1, \ldots, c_n) = (-1)^{\epsilon(\sigma)} (c_{\sigma(1)}, \ldots, c_{\sigma(n)}) \qquad [8]$$

The sign $(-1)^{\epsilon(\sigma)}$ is often referred to as the Koszul
sign of the permutation.


Definition 2 (Graded symmetry). A graded symmetric
multilinear map of a graded vector space $V$ to
itself is a linear map $f : V^{\otimes n} \to V$ such that for any
$c_i \in V$, $1 \leq i \leq n$, and any $\sigma \in \mathfrak{S}_n$ (the permutation
group of $n$ elements), the relation

$$f(c_1, \ldots, c_n) = (-1)^{\epsilon(\sigma)} f(c_{\sigma(1)}, \ldots, c_{\sigma(n)}) \qquad [9]$$

holds.


Definition 3 By a $(k, l)$-unshuffle of $c_1, \ldots, c_n$ with
$n = k + l$ is meant a permutation $\sigma$ such that for $i < j \leq k$,
we have $\sigma(i) < \sigma(j)$ and similarly for $k < i < j \leq k + l$.
We denote the subset of $(k, l)$-unshuffles in
$\mathfrak{S}_{k+l}$ by $\mathfrak{S}_{k,l}$ and by $\mathfrak{S}_{k+l=n}$ the union of the subsets
$\mathfrak{S}_{k,l}$ with $k + l = n$. Similarly, a $(k_1, \ldots, k_i)$-unshuffle
means a permutation $\sigma \in \mathfrak{S}_n$ with $n = k_1 + \cdots + k_i$
such that the order is preserved within each block of
length $k_1, \ldots, k_i$. The subset of $\mathfrak{S}_n$ consisting of all
such unshuffles we denote by $\mathfrak{S}_{k_1, \ldots, k_i}$.
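The Koszul sign and the unshuffles can be made concrete with a short Python sketch. The conventions below are assumptions made for illustration: permutations are 0-based tuples `sigma`, the permuted tuple is `(c[sigma[0]], ..., c[sigma[n-1]])`, and `degrees` lists the integer degrees $|c_i|$. The function names are hypothetical.

```python
from itertools import combinations


def koszul_sign(sigma, degrees):
    """Koszul sign of reordering graded elements into the order given by sigma.

    Every pair of elements whose relative order is reversed contributes
    the product of their degrees to the exponent of -1.
    """
    exponent = 0
    for p in range(len(sigma)):
        for q in range(p + 1, len(sigma)):
            if sigma[p] > sigma[q]:
                exponent += degrees[sigma[p]] * degrees[sigma[q]]
    return -1 if exponent % 2 else 1


def unshuffles(k, l):
    """All (k, l)-unshuffles of n = k + l items, as 0-based tuples.

    The first k entries are increasing and so are the last l entries.
    """
    n = k + l
    result = []
    for first in combinations(range(n), k):
        rest = tuple(i for i in range(n) if i not in first)
        result.append(first + rest)
    return result


if __name__ == "__main__":
    print(len(unshuffles(2, 2)))
    print(koszul_sign((1, 0), (1, 1)))
```

Swapping two odd elements gives sign $-1$, while a swap involving an even element gives $+1$; the number of $(k, l)$-unshuffles is the binomial coefficient $\binom{k+l}{k}$.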


Definition 4 $L_\infty$-algebra (Strong homotopy Lie
algebra). Let $L$ be a graded vector space and suppose
that a collection of degree 1 graded symmetric linear
maps $\mathfrak{l} := \{l_k : L^{\otimes k} \to L\}_{k \geq 1}$ is given. $(L, \mathfrak{l})$ is called an
$L_\infty$-algebra iff the maps satisfy the following relations:

$$\sum_{\sigma \in \mathfrak{S}_{k+l=n}} (-1)^{\epsilon(\sigma)}\, l_{1+l}\big(l_k(c_{\sigma(1)}, \ldots, c_{\sigma(k)}), c_{\sigma(k+1)}, \ldots, c_{\sigma(n)}\big) = 0 \qquad [10]$$

for $n \geq 1$.


A weak $L_\infty$-algebra consists of a collection of
degree 1 graded symmetric linear maps
$\mathfrak{l} := \{l_k : L^{\otimes k} \to L\}_{k \geq 0}$ satisfying the above relations,
but for $n \geq 0$ and with $k, l \geq 0$.


Remark 2 The alternate definition in which the 
summation is over all permutations, rather than just 
unshuffles, requires the inclusion of appropriate 
coefficients involving factorials. 


Just as an $A_\infty$-algebra can be described as a
coderivation of $T^c A$, similarly an $L_\infty$-algebra $L$ can
be described as a coderivation on $S^c L$, the symmetric
subcoalgebra of $T^c L$.


The operad of Lie algebras was defined rather
late, although it was earlier implicit in the work of
Fred Cohen. It is defined as the homology
$H_{n-1}(\mathrm{Config}(\mathbb{R}^2, n))$ for $n \geq 1$, where $\mathrm{Config}(\mathbb{R}^2, n)$
denotes the configuration space of ordered $n$-tuples
of distinct points in $\mathbb{R}^2$. Equivalently, the configurations
can be thought of as the centers of the little
2-disks. The open disks being contractible to their
centers, this is a suboperad of the full homology
$H_\bullet(D_2)$.

Just as a Lie algebra is obtained from an
associative algebra using the commutator as bracket
and, inversely, a Lie algebra gives rise to its
universal enveloping associative algebra, an
$L_\infty$-algebra can be obtained from an $A_\infty$-algebra
by $n$-variable analogs of commutators and there
is a universal enveloping $A_\infty$-algebra of a given
$L_\infty$-algebra.


Open-Closed Homotopy Algebras 


Open-closed string field theory suggests interaction
between an $L_\infty$-algebra $\mathcal{H}_c$ and an $A_\infty$-algebra $\mathcal{H}_o$
including a strong homotopy representation of $\mathcal{H}_c$
on $\mathcal{H}_o$ by strong homotopy derivations. Here is the
formal definition:


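Definition 5 Let $\mathcal{H} = \mathcal{H}_o \oplus \mathcal{H}_c$ be a graded vector
space and $(\mathcal{H}_c, \mathfrak{l})$ be a weak $L_\infty$-algebra. Consider a
collection of multilinear maps

$$\mathfrak{n} := \{n_{k,l} : (\mathcal{H}_o)^{\otimes k} \otimes (\mathcal{H}_c)^{\otimes l} \to \mathcal{H}_o\}_{k, l \geq 0}$$

each of which is graded symmetric on $(\mathcal{H}_c)^{\otimes l}$. We
denote the collection also by $\mathfrak{n}$. We call $(\mathcal{H}, \mathfrak{n}, \mathfrak{l})$ a
(partial) open-closed homotopy algebra (OCHA)
when $\mathfrak{n}$ satisfies the following relations (up to some
factorial coefficients), for $o_1, \ldots, o_m \in \mathcal{H}_o$ and $c_1, \ldots, c_n \in \mathcal{H}_c$:

$$0 = \sum_{\sigma \in \mathfrak{S}_n} \sum_{k, l \geq 0} \sum_{p=0}^{m-k} \pm\, n_{m-k+1,\, n-l}\big(o_1, \ldots, o_p, n_{k,l}(o_{p+1}, \ldots, o_{p+k}; c_{\sigma(1)}, \ldots, c_{\sigma(l)}), o_{p+k+1}, \ldots, o_m; c_{\sigma(l+1)}, \ldots, c_{\sigma(n)}\big)$$

$$\qquad + \sum_{\sigma \in \mathfrak{S}_n} \sum_{l=1}^{n} \pm\, n_{m,\, n-l+1}\big(o_1, \ldots, o_m; l_l(c_{\sigma(1)}, \ldots, c_{\sigma(l)}), c_{\sigma(l+1)}, \ldots, c_{\sigma(n)}\big) \qquad [11]$$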


Other Algebras of Interest 


The Hochschild complex also has a graded product 
(without invoking the shift) known as the cup 
product. Except for the signs and the grading, the 
bracket and the product satisfy the Leibniz rule of a 
Poisson algebra on the cohomology; the result is
axiomatized as a “Gerstenhaber algebra.” However,
on the cochain complex, the Lie bracket and the 
associative product are compatible only up to 
homotopy. 

This naturally raises the issue of an operad for
strong homotopy Gerstenhaber algebras. The operad
$\mathcal{G}$ for Gerstenhaber algebras is the homology of the
little disks operad, $H_\bullet(D_2)$. But now we have
choices: in addition to relaxing the Leibniz rule up
to homotopy, the bracket could be relaxed to be
part of an $L_\infty$-algebra and/or the product could be
relaxed to be part of an $A_\infty$-algebra. The choice
which is now known as the $G_\infty$-operad is defined in
terms of a procedure which works for what are
known as quadratic operads, indicating they have
generators in $\mathcal{O}(2)$ and relations in $\mathcal{O}(3)$: the
corresponding $\mathcal{O}_\infty$ has “dual” relations. For example,
this gives the classical Koszul duality between
Lie and commutative associative algebras. The $G_\infty$-operad
can also be described as the “minimal
model” of $\mathcal{G}$ in the sense of Markl.

Another alternative is to consider just the “brace”
operations, originally introduced by Kadeishvili and
later independently by Getzler, but described in the
Hochschild complex setting by Gerstenhaber-Voronov.
Together with the cup product, these
determine an operad denoted $HG$ which acts on the
Hochschild complex; there is an operad map from
$G_\infty$ to $HG$, hence $G_\infty$ also acts on the Hochschild
complex. Finally, Tamarkin showed that $G_\infty$ is
quasi-isomorphic to the dg operad of singular chains
on the little disks operad, thus providing one of
several proofs of what had been a conjecture by
Deligne.

Algebras with invariant inner products $\langle -, - \rangle$
are of considerable importance in mathematics and
especially in mathematical physics; invariance means
$\langle a, bc \rangle = \langle ab, c \rangle$ or $\langle a, [b, c] \rangle = \langle [a, b], c \rangle$ in,
respectively, the associative or the Lie case (with
appropriate signs in the graded case). Using the
inner product, $n$-ary operations $A^{\otimes n} \to A$ can be
converted to operations $A^{\otimes n+1} \to \mathbb{C}$ of which we can
require cyclic symmetry. To handle such algebras,
there is a notion of “cyclic operad.” In terms of
trees, the transition is to take a rooted tree and then
regard the root edge as just another leaf. This point
of view corresponds to an essential symmetry for
particle interactions.


Operads in Mathematical Physics 


One reason for the explosive development of operad
theory in the 1990s was the introduction of operadic
structures in field theories, for example, conformal
field theories (CFTs) and string field theories (SFTs).
These operadic structures were directly related to
the moduli spaces of Riemann surfaces with punctures
or boundaries (or other decorations) in these
physical theories.

Two special “higher-homotopy algebras” have
been emphasized because they are particularly
important in mathematical physics: $A_\infty$ for open-string
field theory and $L_\infty$ for closed-string field
theory and for deformation quantization. Open-closed
string field theory combines an $A_\infty$-algebra and
an $L_\infty$-algebra in a particular way known as an OCHA.

The operad for $L_\infty$-algebras is given a very nice
and physically relevant geometric interpretation in 
terms of a real compactification of the moduli space 
of Riemann spheres with punctures, while for 
OCHAs, there is a real compactification of the 
moduli space of Riemann disks with punctures on 
the boundary or in the interior (bulk). Thus, this 
operad can be regarded as obtained from a moduli 
space of configurations of points (punctures) in the 
disk by compactifying the moduli spaces by adding 
boundary strata where two (or more) points 
(punctures) collide. Points on the boundary strata 
can be visualized as “bubble trees” of disks and/or 
spheres, see Figure 6. Alternatively, the little disks 
operad can be regarded as being obtained by 
“decorating” the points with little disks, while for 
OCHAs there is also a basic half-disk decorated 
with little disks in the bulk and little half-disks for 
the boundary points. The corresponding colored
operad is Voronov's “Swiss-cheese operad.”
“Colored” refers to the fact that disks can be
inserted into half-disks but not vice versa. Compare 
trees with two “colors” of edges with grafts 
restricted to ones which match colors. 


On-Shell versus Off-Shell 


In cohomological physics, the “on-shell” states or
observables are usually given by the cohomology
with respect to an internal differential, which in
physics is called the BRST differential or BRST operator,
though originally this meant the Chevalley-Eilenberg
differential associated to the action of the Lie algebra of
gauge symmetries of a physical theory. The generators
of the Chevalley-Eilenberg cochain complex are known
as “ghosts”. On-shell subspaces of algebras which are
not closed under the product of the larger “off-shell”
algebra are called “open” algebras by physicists. Quite
generally, this situation gives rise to an algebra over an
appropriate operad. A special case involves a differential
graded algebra $A$ and a linear imbedding $H(A) \subset A$.
The (co)homology is in turn a graded algebra (with 0 as
differential), but inherits a higher-homotopy structure
so that cohomology and original algebra are equivalent.

Figure 6 Bubble tree for circle configurations.

In the associative case, the inheritance is a result 
of Kadeishvili: 


Let $(A, d)$ be a differential graded associative or
$A_\infty$-algebra, then the homology $H(A)$ inherits the
structure of an $A_\infty$-algebra.


Even if the original algebra $A$ is strictly associative,
the inherited $A_\infty$-structure generally has nontrivial
operations $m_i$.

Analogous results hold for $L_\infty$-algebras and
others. It is the $L_\infty$-version that is relevant for
closed-string field theory (CSFT). Zwiebach showed
the quantum theory of covariant closed strings has
an action defined in terms of an infinite chain of
string field products. The genus-0 (tree level) string
field algebra is an $L_\infty$-algebra inherited from the
off-shell state space modeled by the Batalin-Vilkovisky
(BV) construction. The higher-order brackets provide
higher-order correlation or $n$-point functions
which play a crucial role in the extended Lagrangian
of the theory.


Batalin-Fradkin-Vilkovisky and Batalin-Vilkovisky 
Constructions 


The constructions of Batalin-Fradkin-Vilkovisky
(BFV) for constrained Hamiltonian systems and of
Batalin-Vilkovisky (BV) for Lagrangians with symmetries
are important examples of $L_\infty$-structures
derived from “open” algebra settings, though the
$L_\infty$-structures were recognized quite a while after
the constructions.

The BFV setting is that of a symplectic manifold
$W$ with a family of constraints, that is, a family of
functions $\phi^\alpha \in C^\infty(W)$. The constraints are called
“first class” if the ideal they generate is closed under
the Poisson bracket. The vector space spanned by
the constraints will in general be an open algebra;
the structure of the bracket is given by structure
functions, rather than structure constants. The zero
locus of all the constraints forms the constraint
surface $V$. In the first-class case, the constraints are
in involution and determine a foliation $\mathcal{F}$ of $V$. If the
space of leaves $V/\mathcal{F}$ is a manifold, it would be
considered the true physical space and the physical
observables would be functions in $C^\infty(V/\mathcal{F})$. BFV
construct a differential graded Poisson algebra such
that the cohomology in degree 0 agrees with
$C^\infty(V/\mathcal{F})$ when that makes sense and, in the regular
case, the rest of the cohomology is that of the
differential forms along the leaves of the foliation.
The BFV differential is a deformation of the
Chevalley-Eilenberg/BRST differential and can be
constructed most efficiently by the same techniques
used in proving Kadeishvili's inheritance theorem.
Crucially, it is an inner derivation with respect to
the Poisson bracket. After the fact, an $L_\infty$-structure
can be observed in the extended algebra.

For a Lagrangian with symmetries, BV develop a 
similar construction, the main difference being that 
there is no Poisson bracket initially, but one is 
constructed by adjoining “anti-fields” as conjugate to
the fields but of ghost degree $-1$ and the differential of
an anti-field being the Euler-Lagrange expression for
the corresponding field. Then, as in the Hamiltonian 
case, ghosts and anti-ghosts, etc. are adjoined and the 
construction proceeds in a parallel fashion. 


Deformation Quantization 


Once algebras over an operad $\mathcal{P}$ are considered, it is
natural to consider also morphisms of such algebras
over a fixed $\mathcal{P}$.

From a homotopy point of view, the appropriate
maps need not respect the operad structure strictly
but only up to higher homotopy; indeed, there is a
related operad to define such maps. For $L_\infty$-algebras,
such $L_\infty$-maps play a key role in deformation
quantization. That refers to deformation of the
commutative multiplication of a Poisson algebra in
the direction of the Poisson bracket; that is, to first
order, the deformation is given by the bracket.

More generally, for any associative algebra $A$ with
multiplication $m$, one considers formal deformations

$$a \star b = m(a, b) + t\, m_1(a, b) + t^2 m_2(a, b) + \cdots \qquad [12]$$

where each $m_i \in \mathrm{Hom}(A \otimes A, A)$. The associativity
of $\star$ provides a sequence of constraints on the $m_i$. In
particular, $m_1$ must be a Hochschild cocycle and the
obstruction to the existence of $m_2$ is a class in the
Hochschild cohomology of degree 3. In fact, the
primary obstruction is represented by $[m_1, m_1]$. If it
is cohomologous to zero, that fact identifies candidates
for $m_2$, that is,

$$[m_1, m_1] = \pm 2\, [m, m_2] \qquad [13]$$

or, using the notation $d = [m, \ ]$,

$$d m_2 \pm \tfrac{1}{2} [m_1, m_1] = 0 \qquad [14]$$

once known as the integrability equation but now,
more frequently, as a Maurer-Cartan equation. For
a Poisson algebra, the Poisson bracket is a Hochschild
cocycle but in general a full deformation need
not exist. However, for the algebra $A$ of smooth
functions on a Poisson (e.g., symplectic) manifold
$M$, Kontsevich showed that such a full formal
deformation does exist.
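The first two constraints can be written out explicitly. As a sketch in the ungraded case, writing $m(a, b) = ab$ and collecting powers of $t$ in $(a \star b) \star c - a \star (b \star c) = 0$, one finds the following conditions; the overall signs depend on the conventions chosen for the Hochschild coboundary and the Gerstenhaber bracket.

```latex
\begin{align*}
O(t):\quad
  & a\,m_1(b,c) - m_1(ab,c) + m_1(a,bc) - m_1(a,b)\,c = 0, \\
O(t^2):\quad
  & a\,m_2(b,c) - m_2(ab,c) + m_2(a,bc) - m_2(a,b)\,c
    = m_1(m_1(a,b),c) - m_1(a,m_1(b,c)).
\end{align*}
```

The first line says that $m_1$ is a Hochschild 2-cocycle. The second says that the Hochschild coboundary of $m_2$ must cancel the associator of $m_1$, which is the content of the Maurer-Cartan equation at this order.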

The guiding philosophy is that deformations are
controlled by a dg Lie or $L_\infty$-algebra $L$, unique up to
$L_\infty$-homotopy equivalence. Therefore, the obstructions
can be computed in any of the equivalent dg Lie
algebras. Moreover, the structure of the obstructions
is known sufficiently so that if there is an equivalent
dg Lie algebra with $d$ in fact zero, then all the
obstructions to deformation quantization vanish. The
key to Kontsevich's proof was the construction of an
$L_\infty$-map, inducing an isomorphism in cohomology,
from the Lie algebra of polyvector fields on $\mathbb{R}^n$ with
the Schouten bracket and $d = 0$ to the Lie algebra of
multidifferential operators on $A = C^\infty(\mathbb{R}^n)$ regarded as
a subalgebra of the Hochschild cochain complex for $A$
with the Gerstenhaber bracket.


BV Algebras 


In addition to their construction of a differential
graded Gerstenhaber algebra (a differential graded
commutative algebra with a compatible Poisson
bracket of degree 1), BV introduced a new mathematical
structure, adding a second-order differential
operator $\Delta$ relating the commutative product and
the bracket. The operator $\Delta$ is a derivation of the
bracket and of square zero. Moreover,

$$[a, b] = \Delta(ab) - \Delta(a)\, b \pm a\, \Delta(b) \qquad [15]$$

so that the failure of $\Delta$ to be a derivation of the
product is given by the bracket.

The definition of a BV algebra is then a Gerstenhaber
algebra with such an operator, though alternative
definitions exist in which $\Delta$ and the product are
primary and the bracket is defined by the above
equation. From the operadic/higher-homotopy point
of view, one can then go on to consider $BV_\infty$ algebras.

Recall that $A_\infty$-algebras and $L_\infty$-algebras (among
others) can be characterized by an “inner” coderivation
$d = [m, \ ]$ of square zero on an appropriate
“standard” construction. In the context of BV
algebras, where the bracket is more commonly
written as $\{\ ,\ \}$, the classical action is an element $S_0$
such that $\{S_0, S_0\} = 0$ or, equivalently, $d = \{S_0, \ \}$ is of
square zero. The quantum analog $S$ is a perturbation
of $S_0$ and satisfies instead

$$\{S, S\} = \Delta S \qquad [16]$$

This was originally called the “master equation,”
but now is increasingly referred to as a “Maurer-Cartan”
equation.


Insertion Operads 


There is another class of operads illustrated by trees
(and more generally graphs) with a very different
sort of “composition,” namely insertion of one
graph into another. The most directly relevant to
physics is the kind of insertion used by Connes and
Kreimer in their Hopf algebra constructed for
renormalization of Feynman diagrams. For example,
consider all finite graphs with exactly two external
edges and internal numbered edges. Given two
graphs $\Gamma_1, \Gamma_2$, define $\Gamma_1 \circ_i \Gamma_2$ by cutting edge $i$ of
$\Gamma_1$ and identifying the dangling edges with the two
external edges of $\Gamma_2$.

For planar trees, yet another insertion operad is 
obtained by Chapoton, isolating a part of a structure 
due to Kontsevich, in which a small neighborhood 
of a vertex of the second planar tree is removed and 
the dangling edges are attached to a vertex of the 
first tree by entering through the angles between the 
edges at that vertex (Figure 7). 

Inside the HG-operad is the operad Brace for an 
abstract brace algebra (forgetting the cup product), 
first described as such by Chapoton using the 
insertion operations of Kontsevich and Soibelman. 


$A_\infty$-Categories


Also of importance for applications to mathematical
physics is the notion of an $A_\infty$-category, first made
explicit by Fukaya and now playing a major role in
string D-brane theory and homological mirror
symmetry. The D-branes are the objects of the $A_\infty$-category
and the open strings with boundaries on
two (possibly equal) D-branes $B_1, B_2$ are the
morphisms from $B_1$ to $B_2$. The operations $m_i$ are
defined only on tuples $(a_1, \ldots, a_i)$ of “composable”
morphisms (e.g., strings).


PROPs 


While an operad is an abstraction of a family of
composable functions of $n$ variables for various $n$, a
PROP is an abstraction of a family of functions in
$\mathrm{Hom}(A^{\otimes p}, A^{\otimes q})$ for all $p$ and $q$. Now the relevant
images are graphs with $p$ input legs and $q$ output
legs with composition being defined by grafting
output legs of one graph to inputs of another.
Feynman diagrams are the obvious example in
physics or, in conformal field theory, tubular
neighborhoods of such graphs, which is to say,
Riemann surfaces with boundary circles: $p$ as
inputs and $q$ as outputs.

Figure 7 Angles determined by edges with leaves extended to
the semicircle.


See also: Algebraic Approach to Quantum Field Theory; 
Batalin-Vilkovisky Quantization; Constrained Systems; 
Deformations of the Poisson Bracket on a Symplectic 
Manifold; Deformation Quantization; Deformation Theory; 
Hopf Algebra Structure of Renormalizable Quantum Field 
Theory; String Field Theory. 
Introduction


The operator product expansion (OPE) provides an
algebraic structure in quantum field theory. In a
sense it supersedes, or rather transcends, the equal-time
commutation relations, which provide the
traditional starting point for the canonical quantization
of any quantum field theory. The essential idea
is that for any two local operator quantum fields at
spacetime points $x_1, x_2$ their product may be
expressed in terms of a series of other local quantum
fields at a point $x$, which may be identified with $x_1$
or $x_2$, times c-number coefficient functions which
depend on $x_1 - x_2$. The set of operators which may
appear depends on the particular quantum field
theory and must of course be in accord with any
requirements of conserved quantum numbers. The
coefficient functions depend on $x_1 - x_2$ in a fashion
which depends on the dimensions of the various
operators involved, at least up to renormalization
group corrections. The most singular contributions
are those for the operators appearing in the OPE
with lowest scale dimension. From a phenomenological
point of view, only the first few terms in the
OPE are of relevance. However, theoretically,
especially for conformal field theories, it is desirable
to know the full expansion to all orders in powers of
$x_1 - x_2$ in such a way that the operator product may
be replaced by the full expansion in appropriate
correlation functions. We first discuss the OPE for
free theories and then the interacting case.


Free Field Theory 


The OPE is most straightforward in free field theory,
where it almost reduces to a Taylor series expansion.
For a simple free massless scalar field $\phi(x)$ in
four dimensions we may write

$$\phi(x)\phi(0) = \frac{C}{x^2}\, 1 + {:}\phi(x)\phi(0){:} \qquad [1]$$

where $:\ :$ denotes normal ordering (moving all
annihilation operators to the right of creation
operators) and $C$ is just a numerical normalization
constant (for canonical normalization $C = 1/4\pi^2$).
The $1/x^2$ term proportional to the identity operator
reflects the leading singular behavior at short
distances of $\phi(x)\phi(0)$, the power being determined
by $\phi$ having dimension 1. For the normal-ordered
term we may expand in terms of an infinite set of
local operators by using the Taylor expansion

$${:}\phi(x)\phi(0){:} = \sum_{n=0}^{\infty} \frac{1}{n!}\, x^{\mu_1} \cdots x^{\mu_n}\, {:}\partial_{\mu_1} \cdots \partial_{\mu_n}\phi(0)\, \phi(0){:} \qquad [2]$$

where the operator appearing in the $n$th term has
dimension $n + 2$. Manifestly at short distances only the
leading terms are relevant. Equation [1] also provides a
point-splitting definition of the local composite
operator ${:}\phi^2(0){:}$ as the limit of $\phi(x)\phi(0)$ as $x \to 0$
after subtraction of the singular $C/x^2$ term.

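The OPE can be easily generalized to composite
operators defined by normal ordering. For ${:}\phi^2{:}$ we
have, by applying Wick's theorem,

$${:}\phi^2(x){:}\, {:}\phi^2(0){:} = 2\, \frac{C^2}{x^4}\, 1 + 4\, \frac{C}{x^2}\, {:}\phi(x)\phi(0){:} + {:}\phi^2(x)\phi^2(0){:} \qquad [3]$$

where Taylor series expansion may be applied to both
${:}\phi(x)\phi(0){:}$ and also ${:}\phi^2(x)\phi^2(0){:}$ to give an infinite
sequence of local operators of increasing dimensions.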
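The numerical coefficients produced by Wick's theorem are pure combinatorics, and a short Python sketch can reproduce them. The function names are hypothetical; the code only counts the ways of pairing $k$ fields from ${:}\phi^p(x){:}$ with $k$ fields from ${:}\phi^q(0){:}$, each pairing contributing one factor of $C/x^2$.

```python
from math import comb, factorial


def cross_contraction_count(p, q, k):
    """Ways to contract k fields of :phi^p(x): with k fields of :phi^q(0):.

    Choose k fields on each side, then match them in k! ways.
    """
    return comb(p, k) * comb(q, k) * factorial(k)


def wick_coefficients(p, q):
    """Map k (number of factors C/x^2) to its combinatorial coefficient."""
    return {k: cross_contraction_count(p, q, k) for k in range(min(p, q) + 1)}


if __name__ == "__main__":
    print(wick_coefficients(1, 1))
    print(wick_coefficients(2, 2))
```

For $p = q = 2$ the counts are 1, 4 and 2 for zero, one and two contractions, which are the coefficients of the fully normal-ordered term, the $C/x^2$ term and the $C^2/x^4$ term respectively.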

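The expansion in terms of local operators may be
reordered. For instance, from [1] we may write,
using $\partial^2\phi = 0$,

$$\phi(x)\phi(0) = \frac{C}{x^2}\, 1 + \big(1 + \tfrac{1}{2} x^\mu \partial_\mu + \tfrac{1}{4} x^\mu x^\nu \partial_\mu \partial_\nu - \tfrac{1}{16} x^2 \partial^2\big)\, {:}\phi^2(0){:} - \tfrac{1}{2} x^\mu x^\nu\, T_{\mu\nu} + O(x^3) \qquad [4]$$

where

$$T_{\mu\nu} = {:}\partial_\mu\phi\, \partial_\nu\phi{:} - \tfrac{1}{4}\, \eta_{\mu\nu}\, {:}\partial\phi \cdot \partial\phi{:} \qquad [5]$$

is the energy-momentum tensor. In [4], and also in a
similar context subsequently, we define
$\partial_\mu {:}\phi^2(0){:} = \partial_\mu {:}\phi^2(y){:}|_{y=0}$. The expansion [4] provides a
point-splitting definition of $T_{\mu\nu}$ and also demonstrates that
many operators appearing in the OPE are expressible
in terms of overall derivatives of lower-dimension
operators. We may also note that without
further input there is an ambiguity in the definition
of $T_{\mu\nu}$ of the form

$$T_{\mu\nu} \sim T_{\mu\nu} + a\, \big(\partial_\mu \partial_\nu - \tfrac{1}{4}\, \eta_{\mu\nu}\, \partial^2\big)\, {:}\phi^2{:} \qquad [6]$$

In a conformal theory, however, we require $a = -1/6$.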

Interacting Theories 


The OPE becomes an essential tool in the context 
of interacting quantum field theories. For renorma- 
lizable quantum field theories various results can be 
proved to all orders in the standard perturbative 
expansion and are naturally assumed to be proper- 
ties of the complete theory. In interacting theories 
we may no longer use normal ordering to define 
composite operators which, in general, have anom- 
alous dimensions. The coefficient functions appear- 
ing in the OPE also gain perturbative corrections but 
these are constrained by renormalization group 
(RG) Callan-Symanzik equations. 

Again if we consider the simplest case of a massless
scalar theory as above, but now with a renormalized
coupling constant $g$, the leading terms in the expansion
of $\phi(x)\phi(0)$ are of the form (here we assume a $\mathbb{Z}_2$
symmetry under $\phi \to -\phi$, otherwise the operator $\phi$
would be expected to appear in the OPE)

$$\phi(x)\phi(0) = \frac{C(g, \mu^2 x^2)}{x^2}\, 1 + D(g, \mu^2 x^2)\, \phi^2(0) + \cdots \qquad [7]$$

where $\mu$ is an arbitrary renormalization scale. This
arbitrariness is reflected in the RG equation

$$\Big(\mu \frac{\partial}{\partial\mu} + \beta(g) \frac{\partial}{\partial g} + 2\gamma_\phi(g)\Big)\, C(g, \mu^2 x^2) = 0 \qquad [8]$$

At a fixed point $\beta(g_*) = 0$ this equation may be
solved with an arbitrary choice of normalization to
give $C(g_*, \mu^2 x^2) = (\mu^2 x^2)^{-\gamma_\phi(g_*)}$, which corresponds
to the fields $\phi$ having a modified scale dimension
$1 + \gamma_\phi(g_*)$. In a similar fashion the coefficient
$D(g, \mu^2 x^2)$ in [7] satisfies

$$\Big(\mu \frac{\partial}{\partial\mu} + \beta(g) \frac{\partial}{\partial g} + 2\gamma_\phi(g) - \gamma_{\phi^2}(g)\Big)\, D(g, \mu^2 x^2) = 0 \qquad [9]$$

where it is necessary to introduce a new anomalous
dimension function $\gamma_{\phi^2}(g)$ related to the composite
operator $\phi^2$. Although it is natural to label the
operator as $\phi^2$, its definition in terms of the
elementary field $\phi$ is essentially only as given in
terms of the OPE [7]. At a fixed point again
$D(g_*, \mu^2 x^2) = k\, (\mu^2 x^2)^{-\gamma_\phi(g_*) + \frac{1}{2}\gamma_{\phi^2}(g_*)}$, where the
coefficient $k$ is determined by the scale of the
three-point function $\langle \phi(x)\phi(y)\phi^2(0) \rangle$. In asymptotically
free theories the RG equations show that at
short distances the coefficient functions tend to
those of free field theory but with calculable
logarithmic corrections. More generally, for a set
of operators $\{O_i\}$ the OPE has the form

$$O_i(x)\, O_j(0) \sim \sum_k \frac{1}{(x^2)^{p}}\, C_{ijk}(g, \mu^2 x^2)\, O_k(0) \qquad [10]$$

where $p$ is determined by the free scale dimensions
of the $O_i$ and

$$\Big(\mu \frac{\partial}{\partial\mu} + \beta(g) \frac{\partial}{\partial g}\Big)\, C_{ijk}(g, \mu^2 x^2) = \sum_n \Big(\gamma_{kn}(g)\, C_{ijn}(g, \mu^2 x^2) - \gamma_{in}(g)\, C_{njk}(g, \mu^2 x^2) - \gamma_{jn}(g)\, C_{ink}(g, \mu^2 x^2)\Big) \qquad [11]$$

with $\gamma_{ij}(g)$ the anomalous dimension matrix arising
from the mixing of composite operators.

An important aspect of the OPE is that the 
coefficient functions may be calculated perturbatively, 
essentially by applying the OPE in some suitable 
correlation function. Essentially the OPE provides a 
factorization between short-distance UV singularities 
and nonperturbative effects. In a Feynman graph the 
short distances in an operator product correspond to 
the large-momentum behavior and power-counting 
theorems allow a factorization up to calculable 
logarithmic corrections. A detailed analysis depends 
on the detailed technicalities of the proofs of renorma- 
lization to all orders of perturbation theory. 

The coefficient functions in the OPE should be 
independent of any infrared or nonperturbative long- 
distance effects (such as confinement in QCD). 
However, the operators which appear in the OPE,
such as $\phi^2$ above, may have nonzero expectation
values which are absent to all orders in perturbation
theory.


Perturbative Example 


The general considerations can be illustrated by 
considering a scalar field theory to lowest order in a 
perturbative expansion. We consider a four dimen- 
sional theory with a single scalar field and a 
potential V(¢) = 5m*¢* -- 3; gó^. Using dimensional 
regularization m*, as well as g, is treated as a 


coupling with an associated (-function ,2(g)m?. 


With a mass term the operator ¢* mixes with the 
identity operator so that 


(D 4-4 (2))(0^(0)) = —yYy21(g)mn? 


ð 7 o0 uzy 
D- Ha, + Als g) gg ^ "eom ‘a 


where 7,2; reflects the mixing. At one loop order we 
have 


3g 8 

1672’ 1672’ 
and we may also set $\gamma_\phi(g) = 0$. In this case in the 
operator product expansion [7] the coefficient C 
also depends on $m^2$ and the RG equations [8] and 
[9] are now modified to include the effects of mixing 


B(g) = 


1 
Yo (8) = Yo (Z) m [13] 


DC(g, m^x^, u^ x^) = m?x*y42;(g)D(g, iA x?) 
(D — 3 (8)) Dlg, i x^) = 0 


From lowest order perturbation theory with [13], 
and using [14] to include all orders in $g \ln \mu^2 x^2$, we 
have in this approximation 


t 2 "un 


«(Gs site)" -1) [15 


de 1/3 
D(g,u^x^) = ur In u^x i 


[14] 


C(g, n?x^, pa’) 
1 2m? x? 


The operator product expansion then reproduces the 
small x behavior of the two point function $\langle\phi(x)\phi(0)\rangle$ at 
one loop, expanding C, D to first order in g, if we take 


2 
(6(0)--25n ^ -O(g [I6 


which is in accord with [12]. If $m^2 < 0$ the symmetry 
$\phi \leftrightarrow -\phi$ is broken and it is necessary to shift the field 
$\phi = v + f$, with $v^2 = -6m^2/g$, and the field $f$ has a 
mass $m_f$ with $m_f^2 = -2m^2$. The operator product 
expansion [7] with the same coefficient functions as in 
[15] remains valid. The two point function $\langle\phi(x)\phi(0)\rangle$, 
which includes a nonperturbative term $v^2$, is again 
reproduced for small x at one loop now if 
6m^ m^, yu 
2 
= ——— —-———ln— 17 


but in this case it is necessary to expand $D(g, \mu^2 x^2)$ 
to $O(g^2)$ as a consequence of the leading $1/g$ term in 
[17]. Note that both [16] and [17] contain the 
nonperturbative dependence on $\ln m^2$ and $\ln m_f^2$ 
which is present in the two point function. 
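
The shift of the field in the broken phase can be checked numerically. The following is a minimal sketch, assuming the potential $V(\phi) = \tfrac{1}{2}m^2\phi^2 + \tfrac{1}{24}g\phi^4$ quoted above; the function names and the sample values of $m^2$ and $g$ are illustrative assumptions, not part of the original discussion. It verifies that $v^2 = -6m^2/g$ is a stationary point of $V$ and that the curvature there equals $-2m^2$.

```python
def potential(phi, m2, g):
    """V(phi) = m2 * phi^2 / 2 + g * phi^4 / 24, where m2 stands for m^2."""
    return 0.5 * m2 * phi**2 + g * phi**4 / 24.0


def first_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def broken_phase(m2, g):
    """Return (v, V'(v), V''(v)) for m2 < 0, with v^2 = -6 m2 / g."""
    if m2 >= 0 or g <= 0:
        raise ValueError("the broken phase requires m2 < 0 and g > 0")
    v = (-6.0 * m2 / g) ** 0.5

    def v_of_phi(phi):
        return potential(phi, m2, g)

    return v, first_derivative(v_of_phi, v), second_derivative(v_of_phi, v)


# Hypothetical sample values: m^2 = -1, g = 0.6.
v, slope, mass_f_squared = broken_phase(-1.0, 0.6)
print(v, slope, mass_f_squared)
```

With these sample values the slope at $v$ vanishes to numerical accuracy and the curvature is close to $2 = -2m^2$, in line with $m_f^2 = -2m^2$.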
Conformal Field Theories 


When the $\beta$-function vanishes and a quantum field 
theory enjoys conformal invariance the operator 
product expansion is a potentially convergent 
expansion. It is natural to restrict to conformal 
quasiprimary operators which do not mix with 
operators of lower scale dimension under conformal 
transformations. If we consider, for instance, two scalar 
operators $\phi$ with scale dimension $\Delta_\phi$, then the OPE 
has the generic form 

$$\phi(x)\phi(0) = \frac{1}{(x^2)^{\Delta_\phi}} + \sum_I C_{\phi\phi O^I}\,\frac{1}{(x^2)^{\frac{1}{2}(2\Delta_\phi - \Delta_I + \ell)}}\,C^I(x,\partial)_{\mu_1\ldots\mu_\ell}\,O^I_{\mu_1\ldots\mu_\ell}(0) \qquad [18]$$


where there is a sum over quasiprimary operators 
$O^I_{\mu_1\ldots\mu_\ell}$ with scale dimension $\Delta_I$ and spin $\ell$, so they 
are symmetric traceless tensors of rank $\ell$. In the first 
term in [18] the coefficient is chosen to be 1 by a 
choice of normalization. The coefficients $C_{\phi\phi O^I}$, 
with a standard normalization for $O^I$, are then 
determined by the coefficients of the corresponding 
three-point functions involving $\phi\phi$ and $O^I$. In [18] 
$C^I(x,\partial)$ are differential operators which sum up the 
contributions of all derivatives or descendants of the 
quasiprimary operator $O^I$. They can be explicitly 
given in terms of an integral representation, for any 
spacetime dimension, where the scale is fixed by 
requiring for the leading term $C^I(x,0)_{\mu_1\ldots\mu_\ell} = 
x_{\mu_1}\cdots x_{\mu_\ell} - \text{traces}$. The spectrum of operators 
which appear is obviously a property of the 
particular conformal field theory. 


Ward Identities 


If the theory has a symmetry with corresponding 
conserved currents then there are Ward identities 
which constrain the OPE of fields with the conserved 
current. For a current $J_{\mu a}$, the singular 
contribution in the OPE in $d$ dimensions 
is given by 

$$J_{\mu a}(x)\,O(0) \sim \frac{1}{S_d}\,\frac{x_\mu}{(x^2)^{\frac{1}{2}d}}\,t_a\,O(0) \qquad [19]$$


where $t_a$ are a set of matrix generators corresponding 
to the symmetry acting on the fields $O$ and $S_d$ is 
the volume of the unit $(d-1)$-dimensional sphere, 
$S_d = 2\pi^{d/2}/\Gamma(d/2)$. For a conserved current there are no 
anomalous dimensions and the coefficient in [19], 
which depends on the normalization for the current 
$J_{\mu a}$, is chosen so that $[Q_a, O(0)] = -t_a O(0)$ with $Q_a$ 
the charge formed from $J_{\mu a}$. For the energy-momentum 
tensor there is an analogous result. We consider 
the simpler case of a conformal theory when the 
energy-momentum tensor is both conserved and traceless and 

$$T_{\mu\nu}(x)\,O(0) \sim A_{\mu\nu}(x)\,O(0) + B_{\mu\nu\lambda}(x)\,\partial_\lambda O(0) + \cdots \qquad [20]$$

where $A_{\mu\nu}(x) = O(x^{-d})$ and $B_{\mu\nu\lambda}(x) = O(x^{-d+1})$. As a 
distribution $A_{\mu\nu}(x)$ is ambiguous up to terms 
proportional to $\delta^d(x)$. If $\Delta$ is the scale dimension 
of $O$ and $s_{\mu\nu}$ are the Lorentz spin generators acting 
on $O$ the Ward identities then give 


A 
O Aut) = (4 wrx + nis + tsn) 54 (x) 


A pu (x) = Cout (x) 
OV B vA (x) m muro" (x) 


where $C_{\mu\nu}$ is a constant tensor reflecting the 
arbitrariness in $A_{\mu\nu}$; it is immaterial as far as Ward 
identities are concerned. We may choose 


i21] 


A 
PAL T Ca = 0 [22] 


(If desired, we might also take A’, (x)= 
(1/2) Sub! (x) in which case 9 uA Ax x)=, Am) = 
(1/2)s,64 (x) but such an antisymmetric piece seems 
unnatural). In general there is no unique form for 
$A_{\mu\nu}(x)$, as a consequence of the freedom of choice 
for $C_{\mu\nu}$ in [21]. However, for a scalar field $O$ we 
must have, for $x \neq 0$, 


= vc (w — i= = s 
d —158, - (n x2 ) cama 
eR l; | 
(d — 1)(d — 2)S4 " " (y2y1/24-1 


Ajw(x) + 


A, uv Ut) 
[23] 


with the overall scale determined by [21]. 

For the operator product of the current $J_{\mu a}$ with 
itself there is an additional term proportional to the 
identity operator of the form 

$$J_{\mu a}(x)\,J_{\nu b}(0) \sim C_J\,\delta_{ab}\left(\delta_{\mu\nu} - 2\,\frac{x_\mu x_\nu}{x^2}\right)\frac{1}{(x^2)^{d-1}} \qquad [24]$$

where the coefficient $C_J$, which determines the scale 
of the two-point function for $J_{\mu a}$, is well defined 
since the normalization of the current is determined 
through the Ward identity. A similar result also 
holds for the operator product of the energy-momentum 
tensor with itself, with an overall 
coefficient $C_T$. In general, we may also write for 
the operator product of two scalar fields $O$: 

$$O(x)\,O(0) \sim C_O\,\frac{1}{(x^2)^{\Delta}} - \frac{C_O}{C_T S_d}\,\frac{d\Delta}{d-1}\,\frac{1}{(x^2)^{\Delta - \frac{1}{2}d + 1}}\,x_\mu x_\nu\,T_{\mu\nu}(0) \qquad [25]$$

neglecting other contributions. The contribution of 
the energy-momentum tensor does not therefore 
introduce any new coefficient. 


Two Dimensions 


In two dimensions the OPE plays an essential role in 
the discussion of conformal field theories. For a 
Euclidean metric it is natural to use complex 
variables $z$ and $\bar z$. The energy-momentum tensor in 
this case reduces to a chiral field $T(z)$ and its 
conjugate $\bar T(\bar z)$. For the operator product with a 
chiral field $\phi(z)$ with scale dimension $\Delta$, 

$$T(z)\,\phi(0) \sim \frac{\Delta}{z^2}\,\phi(0) + \frac{1}{z}\,\phi'(0) \qquad [26]$$

and, for the operator product of $T$ with itself, 

$$T(z)\,T(0) \sim \frac{c}{2}\,\frac{1}{z^4} + \frac{2}{z^2}\,T(0) + \frac{1}{z}\,T'(0) \qquad [27]$$

Here $c$ is the Virasoro central charge, which plays a 
critical role in the discussion of two-dimensional 
conformal field theories; it is given by the two-point 
function which follows from [27], $\langle T(z)T(0)\rangle = 
\tfrac{1}{2}\,c\,z^{-4}$. 

In simple rational conformal field theories the 
operators are organized into conformal blocks by 
the infinite-dimensional extended conformal symmetry 
in two dimensions. This allows the full 
spectrum of operators and their dimensions to be 
determined and in consequence complete results for 
the OPE to be found in many cases. 


Further Remarks 


The OPE reflects the locality properties of quantum 
field theories and can be extended without difficulty 
to curved space backgrounds. For a product 
$\phi(x)\phi(0)$, the separation $x^2$ may be replaced by a 
biscalar at $x$ and 0 but it is necessary to include in 
the OPE contributions involving the background 
Riemann tensor as well as the operator fields present 
in flat space. There is also a generalization of the 
OPE for superfields on superspace. 

At a fundamental level, although the OPE can be 
derived to all orders in perturbation theory, the 
contribution of nonperturbative effects such as 
instantons to the coefficients is not entirely clear. 
Issues of associativity have yet to be fully analyzed. 

There are also important applications to the 
phenomenological analysis of QCD when assumptions 
about the OPE and saturation of sum rules can 
lead to results for the vacuum expectation value of 
gauge-invariant operators such as $F^{\mu\nu}F_{\mu\nu}$. 


See also: Boundary Conformal Field Theory; Effective 
Field Theories; Quantum Chromodynamics; 
Renormalization: General Theory; Renormalization: 
Statistical Mechanics and Condensed Matter; 
Two-Dimensional Models. 


Introduction 


Optical caustics are the bright forms created by the 
focalization, natural or artificial, of light (Figure 1). 
Special caustic points, called focuses, are produced 
by stigmatic optical systems in order to visualize 
objects. However, there are no special conditions for 
producing usual caustics. Every congruence of rays 
always generates a caustic, more or less intricate. 
Caustics have been observed and described since 
antiquity. The name itself 
was coined after the Greek root "kausticos", meaning 
burning and expressing that a high energy 
density is produced by ray focalization at a caustic 
point. Conceptually, they appeared in the literature 
as "evolutes," "envelopes," "centers of curvature," 
"focals," etc. However, these different approaches, 
often too restricted, were unable to clarify the 
general properties of caustics, for instance, their 
classification in generic types. This difficult question 
was solved only recently in the framework of 
singularity theory, which appeared in the second half 
of the twentieth century (Whitney 1955, Thom 
1956). Caustics are now understood as physical 
realizations of Lagrangian singularities, and they 
are often called optical singularities or optical 
catastrophes. 

Figure 1 Optical caustics may be produced by reflection (on window glasses) or by refraction (through the wavy surface of a 
swimming pool). Here the light source, the Sun, has some angular extension and the caustic appears somewhat blurred. 

The aim of this introductory article is to show in 
which sense caustics can be understood as singula- 
rities, and to present their main properties. 


The Physical Phenomenon 


Caustics are usually observed by interposing a screen 
on the ray trajectories; their trace on the screen 
forms a set of bright curves called "folds" ($A_2$). 
Across a fold, the number of rays passing through 
a given point jumps by $\pm 2$. Two fold curves may 
join at some point, forming there a tip called a cusp 
($A_3$). A simple example is provided by the nephroid 
that one sees in a cup of coffee when the light is 
reflected off the cylindrical sides. In the 
three-dimensional (3D) space, the folds form surfaces 
and the cusps form curves (Figure 2). For particular 
positions of the screen, three other types of caustics 
may be observed: the swallowtail ($A_4$), the meeting 
point of two cusp lines; the elliptic umbilic ($D_4^-$), the 
meeting point of three cusp lines; and the hyperbolic 
umbilic ($D_4^+$), where a cusp line tangentially meets a 
fold surface (Figure 2). These five caustic types are 
generic in the sense that any other type of caustic 
point is unstable and decomposes into these generic 
caustic points under small perturbations. The perfect 
focus is an example of a nongeneric caustic point, 
obtained by imposing a special symmetry. The 
natural focusing of light, as in gravitational optics, 
produces only generic caustics. A caustic point is 
then a generalized focus. The caustic surface is a 
complex surface in the 3D physical space, generally 
self-intersecting and possessing singular lines $A_3$ 
ending at singular points $A_4$, $D_4^-$, or $D_4^+$. 

Figure 2 The five generic types of caustics of the 3D space (surviving panel labels: cusp $A_3$, swallowtail $A_4$, elliptic umbilic $D_4^-$). 

At the scale of the wavelength of the light, the 
caustics have a more complex structure. Instead of 
well-defined surfaces, lines and points, one observes 
a system of interference fringes concentrated in 
the vicinity of the geometrical caustic. Each type of 
caustic point has its own diffraction pattern (also 
called diffraction catastrophe) (Figure 3). These 
interference systems are easily produced, for 
instance, by focusing a coherent laser beam by a 
corrugated glass or by a water droplet. An important 
feature is revealed by Gouy's experiment, in 
which bright and dark fringes are inverted when the 
rays are forced to pass through a focus (Guillemin 
and Sternberg 1977). The experiment shows that the 
wave undergoes a phase shift of $\pi/2$ when the 
associated ray passes through a caustic point. 

Figure 3 Interference fringes produced by the five generic 
caustics of the 3D space (numerical simulation). 

Caustics are thus fundamental objects of both 
geometrical optics and wave optics. 


Modeling Caustics 


Because of the presence of a caustic, a congruence of 
rays generally presents intersecting rays. At the 
points of intersection, the coordinates $q_1, q_2, q_3$ of 
the physical space $\mathbb{R}^3$ are unable to distinguish the 
various intersecting rays and they do not constitute a 
convenient system of coordinates. It is then useful 
to construct an abstract space in which the rays 
are represented by nonintersecting curves. The initial 
congruence is recovered by projecting the abstract 
space into the physical one. All the models use this 
type of construction, in which the properties of the 
caustics are deduced from those of the projection. 


Caustics as Envelopes of Rays 


In this geometrical modeling, each ray is labeled by 
two parameters $r_1, r_2$, for instance, the coordinates on 
the initial wave front $W$. A third coordinate $r_3$ 
specifies the points along the ray, for instance, 
by assigning their distance to $W$. Taken together, 
these three coordinates represent the congruence of 
rays, and define a 3D space, the source space 
$M = \{r_1, r_2, r_3\}$. By construction, the rays in $M$ do 
not intersect. The coordinates $(q_1, q_2, q_3)$ of the 
current point $P \in \mathbb{R}^3$ along each ray depend 
differentiably on the coordinates $(r_1, r_2, r_3)$ and define 
a "projection" $f : (r_1, r_2, r_3) \mapsto (q_1, q_2, q_3)$ from the 
source space $M$ into the physical space $\mathbb{R}^3$. 

The caustic points correspond to the envelope of 
the rays. At a caustic point $P$, the energy density 
flowing along the rays becomes infinite, since the 
small volume delimited by neighboring rays is 
shrunk into a small surface at $P$. This behavior 
may be simply expressed with the help of the 
projection $f$: the rank rk of the derivative $Df$ is 
equal to 2 at the point representing $P$ in $M$. This 
motivates the following definition. Given a map 
$f : M \to N$, a point $x \in M$ is said to be critical (or 
singular) if the rank of the derivative $Df$ is less than 
the maximal possible value $\min(\dim M, \dim N)$. 
Here, $\dim M = \dim N = 3$, and a critical point is a 
point where $\mathrm{rk} < 3$. The set $\Sigma \subset M$ of the critical 
points is called the singular set. The caustic $C$ is the 
image of the singular set: $C = f(\Sigma)$. One also says 
that the caustic points are the critical values of $f$. 

In practice, the derivative $Df$ is expressed by the 
Jacobian matrix $J = \partial(q_1, q_2, q_3)/\partial(r_1, r_2, r_3)$ and the 
singular set $\Sigma$ is defined by solving the equation 

$$\det(J) = 0 \qquad [1]$$

If this equation permits one to express explicitly one 
coordinate, say $r_3$, as a function of the other two, 
the caustic surface $C$ is found in parametric form: 
$q_1 = q_1(r_1, r_2, r_3(r_1, r_2))$, etc. For a homogeneous 
medium, equation [1] is of second degree in $r_3$ and 
the caustic is composed of two sheets which meet at 
the umbilic points $D_4$. 
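
The two-sheet structure can be illustrated numerically. The sketch below assumes a hypothetical quadratic wave front $q_3 = \tfrac{1}{2}(a r_1^2 + b r_2^2)$ in a homogeneous medium, with straight rays along the unit normal and $r_3$ the distance along the ray; the curvatures `a`, `b` and all function names are illustrative assumptions. On the central ray the determinant reduces to $(1 - a r_3)(1 - b r_3)$, a second-degree polynomial in $r_3$ whose two roots give the two caustic sheets.

```python
import math


def wavefront(r1, r2, a, b):
    """Point of the assumed wave front W: q3 = (a*r1^2 + b*r2^2) / 2."""
    return (r1, r2, 0.5 * (a * r1**2 + b * r2**2))


def unit_normal(r1, r2, a, b):
    """Unit normal of W pointing towards +q3 (the ray direction)."""
    nx, ny, nz = -a * r1, -b * r2, 1.0
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / norm, ny / norm, nz / norm)


def projection(r, a, b):
    """The map f: (r1, r2, r3) -> (q1, q2, q3) for straight rays."""
    r1, r2, r3 = r
    w = wavefront(r1, r2, a, b)
    n = unit_normal(r1, r2, a, b)
    return tuple(w[i] + r3 * n[i] for i in range(3))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11 * (a22 * a33 - a23 * a32)
            - a12 * (a21 * a33 - a23 * a31)
            + a13 * (a21 * a32 - a22 * a31))


def jacobian_det(r, a, b, h=1e-5):
    """det J estimated with central finite differences."""
    rows = []
    for k in range(3):
        plus, minus = list(r), list(r)
        plus[k] += h
        minus[k] -= h
        fp, fm = projection(plus, a, b), projection(minus, a, b)
        # rows[k][i] = dq_i / dr_k; transposing does not change the determinant.
        rows.append([(fp[i] - fm[i]) / (2.0 * h) for i in range(3)])
    return det3(rows)


# Hypothetical curvatures: caustic points on the central ray at r3 = 1/a and 1/b.
a, b = 2.0, 0.5
for r3 in (0.0, 1.0 / a, 1.0, 1.0 / b):
    print(r3, jacobian_det((0.0, 0.0, r3), a, b))
```

When the two curvatures coincide, the two roots merge, which is the situation described above where the two sheets meet.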

Equation [1] gives all caustic points independently 
of their nature, that is, it does not distinguish 
between $A_2$, $A_3$, $A_4$, $D_4^-$, and $D_4^+$. A refinement 
allows one to recognize different types of caustic 
points. One defines the Thom-Boardman class $\Sigma^i$ as 
the points in $M$ where $Df$ has a kernel of dimension 
$i$. Then one defines inductively the class $\Sigma^{i,j}$ as 
the class $\Sigma^j$ of the restriction of $f$ to $\Sigma^i$. Thus, $\Sigma^0$ 
represents the regular points (noncaustic points), 
$\Sigma^{1,0}$ the fold points $A_2$, $\Sigma^{1,1,0}$ the cusp points 
$A_3$, $\Sigma^{1,1,1,0}$ the swallow-tail points $A_4$, and $\Sigma^{2,0}$ the 
umbilics $D_4$ (hyperbolic or elliptic). Altogether, the 
classes $\Sigma^I$, $I \neq 0$, form the singular set $\Sigma$. 
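
As a hedged illustration of this inductive construction, the sketch below uses the planar Whitney map $(x, y) \mapsto (x^3 + xy,\ y)$, a standard two-dimensional model that is not taken from the text above. Its singular set $\Sigma^1$ is the parabola $y = -3x^2$; restricting the map to that curve and differentiating again isolates the cusp at the origin. Function names and class labels are illustrative assumptions.

```python
def whitney_map(x, y):
    """Planar model map (x, y) -> (q1, q2) = (x^3 + x*y, y)."""
    return (x**3 + x * y, y)


def jacobian_det(x, y):
    """det of [[3x^2 + y, x], [0, 1]], the Jacobian of whitney_map."""
    return 3 * x**2 + y


def fold_curve(x):
    """Parametrization of the singular set Sigma^1: y = -3 x^2."""
    return (x, -3 * x**2)


def restricted_map(x):
    """The map restricted to Sigma^1, equal to (-2 x^3, -3 x^2)."""
    return whitney_map(*fold_curve(x))


def restricted_derivative(x):
    """Derivative of restricted_map with respect to x."""
    return (-6 * x**2, -6 * x)


def classify(x, y, tol=1e-12):
    """Assign a source point to a Thom-Boardman class of the model map."""
    if abs(jacobian_det(x, y)) > tol:
        return "regular (Sigma^0)"
    dq = restricted_derivative(x)
    if max(abs(dq[0]), abs(dq[1])) > tol:
        return "fold (Sigma^{1,0})"
    return "cusp (Sigma^{1,1,0})"


print(classify(1.0, 2.0))
print(classify(1.0, -3.0))
print(classify(0.0, 0.0))
```

The image of the fold curve satisfies $27 q_1^2 + 4 q_2^3 = 0$, the familiar semicubical parabola with its tip at the image of the cusp point.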

The Thom-Boardman classes constitute a simple 
and powerful tool for computing the structure of a 
caustic. Each class is obtained by canceling some 
functional determinants associated with the map f or 
with its restriction to some class. However, the 
method presents the weakness of ignoring the 
special nature of a set of rays: its Lagrangian 
character. As a consequence, it is unable, for 
instance, to distinguish between $D_4^-$ and $D_4^+$. 


Caustics as Lagrangian Singularities 


As for mechanics, the natural framework for geometrical 
optics is a phase space: the cotangent space 
$T^*\mathbb{R}^3 = \{p_i, q_i\}$ of the configuration space $\mathbb{R}^3 = \{q_i\}$. 
The phase space is characterized by its symplectic 
structure, that is, the differential 2-form $\omega = \sum_i dp_i \wedge 
dq_i$, which is nondegenerate and closed ($d\omega = 0$). 

A set of rays in the phase space is defined by 
specifying the wave vector (or momentum) $p$ at 
each point $q$ of the congruence. In the simple case 
where only one ray passes through each point, one 
has $p = \nabla S$, where $S$ is the optical length $\int n\,ds$ and 
$n$ the refractive index. In other words, $p$ is the 
differential of the optical length. The wave vector 
$p$ is tangent to the ray and orthogonal to the 
(geometrical) wave front $S = \mathrm{const}$. The eikonal 
equation shows that its modulus is $n$. As a direct 
consequence of the relation $p = \nabla S$, the symplectic 
form vanishes identically for these $p$. However, 
in general, because of the presence of the caustics, 
one must not expect to have $p = \nabla S$ for some 
function $S$. Nevertheless, it is possible to keep 
the more general property that $\omega$ vanishes. This 
motivates the definition of a Lagrangian submanifold: 
a submanifold $L \subset T^*\mathbb{R}^3$ of dimension 3 
(that is, half of the dimension of the phase space) 
on which the symplectic form vanishes: $\omega|_L = 0$. 
Every congruence of rays is described by a 
Lagrangian submanifold. The Lagrangian submanifold 
plays the same role as the source space in 
the preceding section. The role of the projection $f$ 
is played by the natural projection $\pi$ from the 
phase space into the configuration space, 
$\pi : (p, q) \mapsto q$, or more precisely by its restriction to 
$L$: $f = \pi|_L$. It is called a Lagrangian map (or 
Lagrangian projection) and it is again a map 
between two spaces of the same dimension (here 
3). When $L$ is given by an embedding $\iota : L \to T^*\mathbb{R}^3$, 
one has $f = \pi \circ \iota$. A caustic is then defined as the 
set of critical values of a Lagrangian map. 

There exist two remarkable results showing that a 
Lagrangian submanifold may be described in terms 
of functions or of families of functions. As a 
consequence, caustics are not directly related to the 
singularities of maps but, more particularly, to the 
singularities of functions. 


Generating function of a Lagrangian submanifold 
The 3D Lagrangian submanifold $L \subset \{p_i, q_i\}$ is 
locally defined by three coordinates $p_\alpha$ ($\alpha \in A$) and 
$q_\beta$ ($\beta \in B$) depending on the three other ones $p_\beta$ and 
$q_\alpha$: $p_\alpha = p_\alpha(q_\alpha, p_\beta)$, $q_\beta = q_\beta(q_\alpha, p_\beta)$. One can show 
that this may be done in such a way that each 
conjugate pair $(q_i, p_i)$ gives exactly one independent 
variable and one dependent variable. Formally: 
$A \cup B = \{1, 2, 3\}$, $A \cap B = \emptyset$. 

In fact, introducing the function $S(q_\alpha, p_\beta) = 
\int \langle p, dq\rangle - \langle q_\beta, p_\beta\rangle$ ($\langle\,,\rangle$ denotes the scalar product), 
the local equation for $L$ takes a simpler form: 

$$q_\beta = -\frac{\partial S}{\partial p_\beta}, \qquad p_\alpha = \frac{\partial S}{\partial q_\alpha} \qquad [2]$$

The function $S$ is well defined, since, by the 
definition of a Lagrangian submanifold, $\int \langle p, dq\rangle$ is 
locally path independent: it depends only on its end 
points. $S$ is called a (local) generating function. 
Formula [2] generalizes $p = \nabla S$, to which it 
reduces when $B = \emptyset$, that is, for nonintersecting rays. 


Generating family and optical catastrophes 
Formula [2] may be rewritten in an interesting 
way. Taking the $|B|$ variables $p_\beta$ as internal 
parameters $x$ and $q = (q_\alpha, q_\beta)$ as external parameters, 
we construct a function $F$ of $x$ parametrized 
by $q$: $F(x, q) = S(q_\alpha, x) + \langle q_\beta, x\rangle$. Now the Lagrangian 
submanifold $L$ is defined by 

$$L = \left\{(p, q) : \exists\, x : \frac{\partial F}{\partial x} = 0,\ p = \frac{\partial F}{\partial q}\right\}$$

$F$ is called the generating family. The first equation 
$\partial F/\partial x = 0$ determines the rays passing through the 
fixed external parameter $q \in \mathbb{R}^3$. The second one 
distinguishes these rays according to their wave 
vector $p$. Each ray corresponds to a critical point 
(i.e., a stationary point) of $F$ considered as a function of 
$x$. At a caustic point, two infinitely close rays are 
converging and $F$ then presents a degenerate critical 
point. So the generating-family technique links the 
caustics to the theory of singularities of functions 
depending on some parameters, that is, to 
catastrophe theory (Thom 1969). Caustics are also 
called optical catastrophes. 

The generating families are not uniquely defined, 
even locally. In optics, one may always take for $F$ 
the equivalent family "optical length" $d$, considered 
as a function defined on the initial wave front $W$ 
(this is discussed in the following). 


Caustics as the Locus of Wave Front Singularities 


There exists a remarkable duality linking rays and 
wave fronts. As a consequence, the caustic points 
(i.e., Lagrangian singularities) are related to singula- 
rities of wave fronts (i.e., Legendrian singularities). A 
typical wave front W may possess only two types of 
singularities: cuspidal curves and swallow-tail points. 
During the motion of W, governed by the eikonal 
equation, the cuspidal curves generate surfaces, and 
swallow tails generate curves. These surfaces are 
exactly the fold surfaces of the caustic C and the 
curves are the cusp lines of C. The point singularities 
of the caustic, that is, the swallow tails and the 
umbilics, correspond to bifurcations of the instanta- 
neous wave front, at certain moments of its motion. 


Caustics as Short Wave Asymptotic 


Close observation of optical caustics shows 
that they never appear as the well-defined surfaces 
given by geometrical optics, but rather as 
diffraction patterns concentrated around these surfaces. 
So wave optics is the natural framework 
to account for this fundamental feature. One 
exploits the fact that the wave number $k = 2\pi/\lambda$ 
($\lambda$: wavelength of the light) is a large parameter. 
This short-wave approximation permits the use of 
powerful expansion techniques and clarifies the 
relation with the geometrical optics viewpoint, 
formally obtained for $k$ tending to infinity. 


The stationary phase 
In the simplest model, 
the Huygens-Fresnel principle, the amplitude $U(P)$ 
of the optical field may be evaluated by adding the 
secondary disturbances emitted from the points $O$ of 
some initial wave front $W$: 

$$U(P) = c \iint_W \frac{e^{ikd}}{d}\,G\,dS \qquad [3]$$

where $d$ is the distance $OP$, $G$ is the inclination factor, 
a smooth function defined on $W$, and $c$ some 
prefactor. For simplicity, $G$ and $n$ (the refractive 
index) are assumed to be constant. Defining 
$a = cG/d$, formula [3] appears as an integral of the 
form $\int a(y)\,e^{ik\phi(y)}\,dy$. This type of integral may be 
evaluated for large $k$ by the method of stationary 
phase. The principal contributions are due to points 
where the phase $\phi$ is stationary: $\nabla\phi = 0$. For wave 
optics, $\phi$ is the length $PO$, considered as a function of 
$O$ and parametrized by $P$. The stationary condition 
means that $PO$ is normal to $W$, that is, it represents a 
ray of geometrical optics. The function $PO$ is a 
generating family in the sense of the discussion earlier. 
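
The accuracy of the stationary-phase method can be examined on a one-dimensional toy integral for which a closed form is available. The sketch below is an illustration under stated assumptions, not the optical integral [3] itself: it takes the amplitude $a(y) = e^{-y^2}$, the phase $\phi(y) = y^2/2$, and the sign convention $e^{+ik\phi}$, all chosen here for convenience. The only stationary point is $y = 0$, and the leading term $a(0)\sqrt{2\pi/k}\,e^{i\pi/4}$ is compared with the exact value $\sqrt{\pi/(1 - ik/2)}$.

```python
import cmath
import math


def exact_integral(k):
    """Closed form of the integral of exp(-y^2) * exp(i*k*y^2/2) over the real line."""
    return cmath.sqrt(math.pi / (1.0 - 0.5j * k))


def stationary_phase_estimate(k):
    """Leading term: a(0) * sqrt(2*pi / (k * phi''(0))) * exp(i*pi/4), with phi''(0) = 1."""
    return math.sqrt(2.0 * math.pi / k) * cmath.exp(0.25j * math.pi)


def relative_error(k):
    """Relative distance between the estimate and the exact value."""
    exact = exact_integral(k)
    return abs(stationary_phase_estimate(k) - exact) / abs(exact)


for k in (20.0, 200.0, 2000.0):
    print(k, relative_error(k))
```

In this toy case the relative error decreases roughly like $1/k$, which is the behavior expected of the leading term of a short-wave expansion.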

If no stationary points exist, that is, if $P$ is in the 
shadow, the integral is $O(k^{-N})$ for any $N$. Otherwise, 
and if the critical points are not degenerate, 
the stationary-phase method gives (Guillemin and 
Sternberg 1977): 


U(P) == 


el 1—{)m/2 
rays PO 
a(O)e'*4 


eet 007 B 
I(1 — uid)(1 — uad)| "^ (k^) [4] 


where $\mu_1^{-1}$ and $\mu_2^{-1}$ are the two principal radii of 
curvature at $O \in W$, and $\iota$ the number of caustic 
points (also called focal points) along the ray $PO$. 

In the stationary-phase approach, the caustic $C$, 
locus of centers of curvature of $W$, appears as an 
obstacle in constructing asymptotics, since formula [4] 
diverges when $d\mu_i \to 1$, that is, when $P$ tends to $C$. 
It is, nevertheless, remarkable that $C$ also appears 
explicitly when [4] is valid, via the $\mu_i$ and $\iota$. In 
particular, the term $e^{-i\iota\pi/2}$, applied in the case of a 
focus ($\iota = 2$), accounts for the phase shift of $\pi$ observed 
in Gouy's experiment. 


Asymptotics on caustics 
Uniform asymptotic formulas, 
valid also on the caustic, need a more complex 
theoretical framework, for instance, Maslov's theory, 
presented here in a necessarily simplified version (see 
Maslov and Fedoriuk (1981) for more detail). 

The starting point is the equation of wave optics, 
that is, the Helmholtz equation 

$$(\Delta + k^2 n^2)\,U = 0 \qquad [5]$$

where the refractive index $n$ generally varies from 
point to point. For $k \to \infty$, one looks for a 
(tentative) asymptotic solution of the form: 

$$U(P) = e^{ikS(q_1, q_2, q_3)} \sum_{j \ge 0} (ik)^{-j}\,\varphi_j(q_1, q_2, q_3) \qquad [6]$$

Inserting this form in eqn [5] one obtains the eikonal 
equation (or characteristic equation) for the phase $S$: 

$$(\nabla S)^2 = n^2$$

and an infinite series of equations for the amplitudes 
$\varphi_j$, called the transport equations. One knows that 
the Cauchy problem for the eikonal equation may be 
reduced to the integration of the corresponding 
Cauchy problem for the Hamilton system (or 
bicharacteristic system): 

$$\frac{dq}{dt} = \frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}$$

where $H = \langle p, p\rangle - n^2$. Its solutions, the 
bicharacteristics $q(t, \xi)$, $p(t, \xi)$, are parametrized by the 
"time" $t$ and the 2D parameter $\xi$ parametrizing the 
points on the initial wave front $W$. The bicharacteristics 
form a 3D Lagrangian submanifold $L$ in the 
phase space $\{p_i, q_i\}$ and one recovers the preceding 
situation. Assuming $L$ to be simply connected, one 
defines a global phase function $S$ on $L$ by the formula 
$S(t, \xi) = \int \langle p, dq\rangle$. 
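
The Hamilton system can be integrated numerically to trace a single bicharacteristic. The following sketch assumes the Hamiltonian $H = \langle p, p\rangle - n^2(q)$ and a hypothetical parabolic index profile $n^2 = n_0^2 - \alpha^2 q_2^2$; the profile, the constants `N0` and `ALPHA`, the Runge-Kutta integrator and all function names are illustrative choices rather than anything prescribed by the text. The initial momentum is chosen with modulus $n$, so that $H = 0$ at the start and should stay close to zero along the ray.

```python
import math

N0 = 1.5      # hypothetical on-axis refractive index
ALPHA = 0.3   # hypothetical strength of the index gradient


def n_squared(q):
    """Assumed index profile n(q)^2 = N0^2 - (ALPHA * q2)^2."""
    return N0**2 - (ALPHA * q[1])**2


def grad_n_squared(q):
    """Gradient of n(q)^2 with respect to q."""
    return (0.0, -2.0 * ALPHA**2 * q[1], 0.0)


def hamiltonian(q, p):
    """H = <p, p> - n(q)^2, which vanishes on solutions of the eikonal equation."""
    return sum(c * c for c in p) - n_squared(q)


def rhs(state):
    """Right-hand side of the Hamilton system for state = (q1, q2, q3, p1, p2, p3)."""
    q, p = state[:3], state[3:]
    dq = tuple(2.0 * c for c in p)   # dq/dt = dH/dp
    dp = grad_n_squared(q)           # dp/dt = -dH/dq = grad(n^2)
    return dq + dp


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * dt))
    k3 = rhs(shifted(state, k2, 0.5 * dt))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def trace_ray(q0, direction, dt=0.01, steps=500):
    """Integrate one bicharacteristic starting at q0 along the given direction."""
    length = math.sqrt(sum(c * c for c in direction))
    n0 = math.sqrt(n_squared(q0))
    p0 = tuple(n0 * c / length for c in direction)   # |p| = n, so H = 0
    state = tuple(q0) + p0
    path = [state]
    for _ in range(steps):
        state = rk4_step(state, dt)
        path.append(state)
    return path


path = trace_ray((0.0, 0.5, 0.0), (1.0, 0.0, 0.0))
end = path[-1]
print(end[:3], hamiltonian(end[:3], end[3:]))
```

For this profile the transverse coordinate obeys $\ddot q_2 = -4\alpha^2 q_2$, so the ray oscillates about the axis; repeating the integration for a family of starting points on $W$ gives a discrete picture of the Lagrangian submanifold $L$.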

In a domain $\Omega_j \subset L$ not containing the singular set 
and in which the coordinates $t, \xi$ are in a one-to-one 
correspondence with the physical coordinates, $S$ 
becomes a function of $q_i$. Using the transport equation, 
one finds the leading term of the asymptotic solution 
(with accuracy to $k^{-1}$) in the following form: 

$$U(P) = \sqrt{\left|\frac{d\sigma}{dq}\right|}\; e^{ikS(q_1, q_2, q_3)}\,\varphi(q_1, q_2, q_3) \qquad [7]$$

where $d\sigma$ and $dq$, respectively, represent the 
measures on the Lagrangian submanifold and on 
the physical space. The amplitude $\varphi$ depends on the 
initial conditions. Formula [7] defines a precanonical 
operator $K(\Omega_j)$. It has the same form as [4], with 
the same drawback of diverging near the singular set 
$\Sigma$, where $dq = 0$. 

In a domain $\Omega_j$ containing the singular set, $L$ is 
locally parametrized by mixed coordinates $q_\alpha, p_\beta$. 
The basic idea is then, roughly speaking, to carry 
out a Fourier transform $F_k$ with respect to these $p_\beta$ 
(in fact, a variant of the usual Fourier transform, in 
which the parameter $k$ appears in the prefactor and 
in the phase term). This leads one to consider, 
instead of $L = \Delta + k^2 n^2$, the operator $\tilde L = F_k L F_k^{-1}$, 
and instead of $U$, the unknown function $V = F_k U$. In 
this Fourier space, $V$ may be found in the same way 
as $U$ was found in the preceding case, with $S$ 
replaced here by the local generating function 
$S_j(q_\alpha, p_\beta) = S - \langle q_\beta, p_\beta\rangle$. Coming back to the real 
space by $F_k^{-1}$, one obtains (with the same accuracy): 

$$U(P) = (K(\Omega_j)\varphi)(q) = F_k^{-1}\left[\sqrt{\left|\frac{d\sigma}{dq_\alpha\,dp_\beta}\right|}\; e^{ikS_j(q_\alpha, p_\beta)}\,\varphi\right] \qquad [8]$$

There is no divergence in this local solution. So local 
short-wave asymptotics may be found everywhere, 
even on the caustic where they have a more complex 
form than the form [6] or [7]. 


Global asymptotics and Maslov's index 
The global 
asymptotic solution is obtained by formally gluing 
the local solutions by a partition of unity $\sum_j e_j = 1$ 
subordinate to a covering $\{\Omega_j\}$ of $L$. However, there 
is a difficulty. The representations of the same 
precanonical operator in different local coordinates 
$q_\alpha, p_\beta$, even not containing the singular set, agree 
only up to a constant multiplier $e^{im\pi/2}$, where the 
integer $m$ is the number of negative eigenvalues of 
some matrix. One is led to multiply every precanonical 
operator by a convenient phase factor $e^{-i\gamma\pi/2}$, 
where $\gamma \in \mathbb{Z}_4$ is called Maslov's index. The coherency 
of the phase factor in different domains is 
realized by using the important property of $\Sigma$ to be 
co-oriented. Thus, $\gamma$ counts the number of passages 
of an oriented path on $L$ from the negative side of $\Sigma$ 
to its positive side, minus the number of passages in 
the opposite sense. Maslov's index is locally constant 
and jumps by $\pm 1$ only across the singular set 
$\Sigma$. The global canonical operator is now formally 
defined as $K = \sum_j e^{-i\gamma_j\pi/2}\,K(\Omega_j)\,e_j$. 

Finally, the canonical operator $K$ is well defined 
only if it is independent of the $\{\Omega_j\}$ and $e_j$ used for its 
definition. This possibility is expressed (in the case of a 
simply connected $L$) by the following property, 
intrinsically attached to $L$: the Maslov index vanishes 
on every closed loop. So the only obstruction for global 
asymptotics is the nontriviality of the characteristic 
class defined by Maslov's index and not the caustic. 

The central object of the caustic modeling is then 
the projection of the submanifold representing the 
rays (M or L) into the physical space. The possibility 
to reduce this projection to some normal form is the 
key result for the local classification of caustics. 


Local Classification of Caustics 
Equivalence, Stability, and Genericity 


In order to distinguish different types of singularities, 
one has to define an equivalence relation. Two 
Lagrangian maps $f_i : T^*M_i \supset L_i \to M_i$ ($i = 1, 2$) are 
said to be Lagrange equivalent if there is a 
diffeomorphism $\Phi : T^*M_1 \to T^*M_2$ preserving both 
the symplectic and the fiber structures, and sending 
$L_1$ to $L_2$. In fact, only the local problem of 
classification makes sense, and one considers, 
instead of Lagrangian maps, germs of Lagrangian 
maps. A map germ is a map locally defined, that is, 
defined in an arbitrarily small neighborhood around a 
point (depending on the germ). The notion of 
Lagrange equivalence is extended to the germs. A 
Lagrangian singularity is then the Lagrange equivalence 
class of a germ at a critical point. Each 
equivalence class represents a type of Lagrangian 
singularity, that is, a type of caustic point. 

The example of the perfect focus point shows that 
there exist singularities which are totally unstable. In 
this sense, they correspond to idealized situations not 
physically realizable, and they have to be disregarded. 
Conversely, stable singularities persist under the action 
of small perturbations. They correspond to Lagrangian 
germs for which all neighboring germs are Lagrange 
equivalent (not necessarily at the same point, but near 
the point considered). 

Now the important question is: do the stable germs 
represent the generality? In the best case, stable germs 
form a dense open set. This means that every germ may 
be approximated by stable germs. In this case, one says 
that the stable germs are generic. 
Stability and genericity are distinct notions. It 
turns out that they coincide for low values of the 
dimension $n$ of the "physical space" ($n < 6$), but 
they may disagree at higher dimensions. 


Classification of Stable Caustics 


The fundamental result of the theory is the local 
classification of Lagrangian singularities (Arnol'd 
1972). With the help of the generating families, the 
study of Lagrangian singularities is reduced to the 
study of singularities of families of functions. More 
precisely, at a singular point, every stable Lagrangian 
map is equivalent to one of the following maps, 
given by their generating function $S$ and by their 
generating family $F$: 

$A_2$: $S = p_1^3$ 
$F = x^3 + q_1 x$ 
$A_3$: $S = \pm p_1^4 + q_2 p_1^2$ 
$F = \pm x^4 + q_1 x^2 + q_2 x$ 
$A_4$: $S = p_1^5 + q_2 p_1^3 + q_3 p_1^2$ 
$F = x^5 + q_1 x^3 + q_2 x^2 + q_3 x$ 
$D_4^\pm$: $S = p_1^3 \pm p_1 p_2^2 + q_3 p_1^2$ 
$F = x_1^2 x_2 \pm x_2^3 + q_1 x_2^2 + q_2 x_2 + q_3 x_1$ 

These polynomial functions are called normal forms. 
The stable singularities are generic. In other words, 
every other type of singularity is destroyed by 
infinitely small perturbations and gives a set of 
singularities belonging to the list. The five generic 
caustics have been observed and experimentally 
studied in detail (Berry and Upstill 1980, Nye 1999). 
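
As a small worked example, the rays through a point near a cusp can be counted from the generating family. The sketch below assumes the cusp family in the form $F = x^4 + q_1 x^2 + q_2 x$ (one sign choice of the $A_3$ normal form, with this labeling of the external parameters taken as an assumption). Rays correspond to real roots of $\partial F/\partial x = 4x^3 + 2q_1 x + q_2$, and the caustic is the set where this cubic has a multiple root, $8q_1^3 + 27q_2^2 = 0$. The function names and the grid-based root count are illustrative.

```python
def cusp_family(x, q1, q2):
    """Assumed cusp generating family F(x; q1, q2) = x^4 + q1*x^2 + q2*x."""
    return x**4 + q1 * x**2 + q2 * x


def caustic_function(q1, q2):
    """Vanishes exactly where dF/dx = 4x^3 + 2*q1*x + q2 has a multiple root."""
    return 8.0 * q1**3 + 27.0 * q2**2


def rays_from_caustic_function(q1, q2):
    """Number of rays through (q1, q2): 3 inside the cusp, 1 outside, None on the caustic."""
    value = caustic_function(q1, q2)
    if value < 0:
        return 3
    if value > 0:
        return 1
    return None


def count_critical_points(q1, q2, samples=20001):
    """Independent check: count sign changes of dF/dx on a grid covering all real roots."""
    bound = 1.0 + max(abs(2.0 * q1), abs(q2)) / 4.0   # Cauchy bound on the roots
    xs = [-bound + 2.0 * bound * i / (samples - 1) for i in range(samples)]
    values = [4.0 * x**3 + 2.0 * q1 * x + q2 for x in xs]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


for q in ((-3.0, 0.5), (1.0, 0.5)):
    print(q, rays_from_caustic_function(*q), count_critical_points(*q))
```

Crossing the curve $8q_1^3 + 27q_2^2 = 0$ changes the count by two, which is the jump in the number of rays across a fold described earlier; the tip of the curve at the origin is the cusp point itself.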

By inserting the normal forms S in a short-wave 
asymptotic, one obtains the diffraction patterns 
associated with the five caustic types (Figure 3). 
They generalize the Airy function which corresponds 
to the fold singularity. 

The normal forms describe at once the geometry 
of the caustics and the interference systems around 
them. 
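
As a small illustration of how a normal form encodes the geometry of a caustic, the following sketch counts the rays (real critical points of the generating family) on either side of a fold and a cusp. It assumes the fold and cusp families read $F = x^3 + qx$ and $F = x^4 + q_1 x^2 + q_2 x$ (the "+" sign choice); the function names are hypothetical and chosen only for this example.

```python
def rays_near_fold(q):
    """Number of distinct real critical points of F(x) = x**3 + q*x.

    dF/dx = 3*x**2 + q vanishes at two points for q < 0 (lit side),
    at one degenerate point for q = 0 (on the caustic), nowhere for q > 0.
    """
    if q < 0:
        return 2
    if q == 0:
        return 1
    return 0


def rays_near_cusp(q1, q2):
    """Number of distinct real critical points of F(x) = x**4 + q1*x**2 + q2*x.

    dF/dx = 4*x**3 + 2*q1*x + q2 is a cubic whose discriminant has the
    sign of -(8*q1**3 + 27*q2**2); the caustic is the zero set of that
    expression (a semicubical parabola in the (q1, q2) plane).
    """
    disc = -(8 * q1**3 + 27 * q2**2)
    if disc > 0:
        return 3          # inside the cusp: three rays
    if disc < 0:
        return 1          # outside the cusp: one ray
    # On the caustic: a triple root at the cusp point, else a double root.
    return 1 if (q1 == 0 and q2 == 0) else 2


print(rays_near_fold(-1.0), rays_near_fold(1.0))
print(rays_near_cusp(-3, 0), rays_near_cusp(1, 0), rays_near_cusp(-6, 8))
```

The jump in the number of rays across the zero set of the discriminant is what makes the caustic visible as a boundary between regions of different brightness.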


Codimension, Corank, Multiplicity, and Index 


Lagrangian singularities are also characterized by
some numbers. They have a codimension c equal
to the difference between the dimension of the
physical space and their dimension: $c(A_2)=1$,
$c(A_3)=2$, $c(A_4)=c(D_4^\pm)=3$. They have a corank ck,
equal to the difference between the dimension of the
space and the rank of the Lagrangian map:
$ck(A_2)=ck(A_3)=ck(A_4)=1$, $ck(D_4^\pm)=2$. The corank
is the number of internal parameters of the generating
family F. They also have a multiplicity $\mu$, which is the
number of nondegenerate critical points of F, that is,
the number of rays coinciding at the singularity. In
the 3D space, one has $\mu = c + 1$: $\mu(A_2)=2$, $\mu(A_3)=3$,
$\mu(A_4)=\mu(D_4^\pm)=4$.

Short-wave asymptotics near the caustic present
remarkable scaling properties (Berry and Upstill
1980). In particular, the amplitude |U(P)| increases
like $k^\beta$ as $k \to \infty$. The number $\beta$ depends only on the
type of the singularity and it is called the singularity
index. The more “degenerate” the singularities, the
larger the index, and then the brighter the caustic
point: $\beta(A_2)=1/6 < \beta(A_3)=1/4 < \beta(A_4)=3/10 <
\beta(D_4^\pm)=1/3$.


Global Organization of Caustics 


The global properties of caustics are less understood
than the local ones. There is, nevertheless,
an interesting result concerning specifically the
caustics in the 3D space (Chekanov 1986). Given
a Lagrangian map $f: L \to \mathbb{R}^3$, the Euler characteristic
$\chi(\Sigma)$ of the singular set $\Sigma \subset L$ and the number
$\# D_4(-1/2)$ of umbilics of index $-1/2$ are related
by the formula


$$\chi(\Sigma) + 2\,\# D_4(-1/2) = 0 \qquad [9]$$


At an umbilic point T, $\Sigma$ is locally a cone with
vertex at T. The index is defined according to the
relative positions of the following elements: the 2D
plane $\Pi = \ker f_*$ (the kernel of the differential of f
at T), the cusp lines $A_3 \subset \Sigma$ passing
through T, and the characteristic line $\ell$ which
represents the ray at T. If $\ell$ and $A_3$ are separated
by $\Pi$, the index is equal to $+1/2$, and to $-1/2$ in the
other case. The index of an elliptic umbilic is always
equal to $-1/2$.

The validity of Chekanov's formula [9] requires 
that L lies on a hypersurface E of the phase space, 
convex with respect to the wave vectors. The 
characteristics are the orthocomplements of E. In 
this framework, the singularities are called optical 
singularities, because such an E is always defined in 
geometrical optics by the eikonal equation. All 
Lagrangian singularities can be realized as optical 
singularities. Chekanov's formula has been experi- 
mentally checked (Joets and Ribotta 1996). 

The Chekanov relation has an important conse- 
quence on the caustic bifurcations (also called 
metamorphoses or perestroikas), that is, the generic 
transformations modifying the topology of a caustic 
depending on one parameter. Among the 11 possible 
caustic bifurcations, considered as bifurcations of 
general Lagrangian singularities, four of them cannot 
be realized as bifurcations of optical Lagrangian 
singularities. So Chekanov’s relation reduces the 
number of optical metamorphoses to seven. 


Extensions 
Caustics in Spaces of Higher Dimension 


The local classification of Lagrangian singularities
has been extended to spaces of higher dimension.
For n = 4, in addition to the preceding ones, two
new singularities appear: the butterfly $A_5$ and the
parabolic umbilic $D_5$. For n = 5, in addition to $A_6$
and $D_6^\pm$, one has a new type of umbilic: $E_6$.
However, in higher dimensions, the classification
becomes more complex. In addition to stable
singularities, like those of the series $A_k$, $D_k$, $E_k$, one
encounters unstable generic singularities which
depend on arbitrary parameters (moduli). Despite
this difficulty, there exists a classification of generic
Lagrangian singularities up to the dimension n = 10.

The Maslov index has been extended in spaces of 
higher dimension and has led to the discovery of 
invariants associated with particular types of singu- 
larities (Vassilyev 1988). These invariants control 
the number of some types of singularities. For
instance, in dimension n = 4, the number of $A_5$
(taking account of sign) is equal to zero.


Symmetrical Caustics 


Another extension consists in imposing some 
constraint, for instance, a symmetry (Janeczko and 
Roberts 1993). Symmetrical caustics are not merely 
the symmetrized usual caustics. Many of them result 
from the stabilization of unstable singularities of 
higher codimension by the symmetry. For example, 
in the 3D space, the butterfly $A_5$ is unstable, but the
symmetrical butterfly is a generic singularity in the 
class of Lagrangian singularities having the mirror 
symmetry. 


Nonoptical Caustics 


Caustics, as loci of focusing, are not restricted
to ordinary optics. They are also observed in
electron optics and in gravitational optics, and the
preceding results apply to these waves. They also
appear in nonelectromagnetic waves, for instance,
acoustic waves, seismic waves, etc. Propagation
always generates caustics.

Optical caustics are now understood as Lagrangian
singularities and, as singularities, their interest
is not restricted to optics. They have become
indispensable for understanding other domains of
mathematical physics, for instance, variational calculus,
classical mechanics, the Hamilton–Jacobi equations,
control theory, field theory, etc.


See also: Billiards in Bounded Convex Domains; Normal 
Forms and Semiclassical Approximation; Stationary 
Phase Approximation; Singularity and Bifurcation Theory.


Introduction


According to the well-known “no-cloning theorem”
(Wootters and Zurek 1982) perfect copying of 
quantum information is impossible, that is, there is 
no machine which takes a quantum system in an 
unknown state as input and produces two systems of 
the same kind, such that none of them is distinguish- 
able from the input by a statistical experiment. In 
this qualitative form, however, the theorem is not 
very useful, because in the presence of noise classical 
information cannot be copied perfectly as well. 
Therefore, the crucial point is that even under ideal 
conditions the errors produced in the clones cannot 
be made arbitrarily small. The best we can hope for 
is to find an optimal cloning device which makes 
these errors as small as possible. 

More generally, we can consider cloning devices, 
which take as input a certain number, N, of 
identically prepared systems, and produce a larger 
number, M, of systems as output. Again, the 
cloning task is to make the output state resemble 
as much as possible a state of M systems all 
prepared in the same state as the inputs. This 
variant of the problem is of interest as a “quantum 
amplifier.” It also has a better chance of reasonable
success than a cloning device operating on single- 
input systems: in the limit of many-input systems, 
the device can make a good statistical estimate of 
the input density matrix and hence produce 
arbitrarily good clones. 


Figures of Merit 


To get a precise mathematical description of the
problem, let us consider a one-particle Hilbert
space $\mathcal{H}$ (which is assumed to be finite dimensional,
$\mathcal{H} = \mathbb{C}^d$, if nothing else is explicitly stated)
and the algebras $\mathcal{B}(\mathcal{H}^{\otimes N})$, $\mathcal{B}(\mathcal{H}^{\otimes M})$ of (bounded)
operators on the N-fold, respectively M-fold,
tensor product of $\mathcal{H}$. A quantum operation which
takes N particles as input and produces M output
particles is then described, in the Heisenberg
picture, by a completely positive, unital map (a
completely positive, unital and normal map if $\mathcal{H}$ is
infinite dimensional):


$$T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N}) \qquad [1]$$


while the Schrödinger picture representation is given
in terms of the (pre-)dual of T, that is,


$$T_* : \mathcal{B}_*(\mathcal{H}^{\otimes N}) \to \mathcal{B}_*(\mathcal{H}^{\otimes M}) \qquad [2]$$


where $\mathcal{B}_*(\cdot)$ denotes the space of trace-class
operators. Hence, if T operates on input systems in
the (joint) state $\rho^{\otimes N}$, the output systems (i.e., the
“clones”) are in the state $T_*(\rho^{\otimes N})$. We will call each
such T a cloning map.

Now our aim is to find an operation T such that
the output state $T_*(\rho^{\otimes N})$ approximates the product
state $\rho^{\otimes M}$ as well as possible. The quality of the
approximation is measured by a distance function $\delta$
on the convex set $\mathcal{S}(\mathcal{H}^{\otimes M}) \subset \mathcal{B}_*(\mathcal{H}^{\otimes M})$ of density
operators on $\mathcal{H}^{\otimes M}$ and, since it is impossible to
minimize $\delta(T_*(\rho^{\otimes N}), \rho^{\otimes M})$ for all $\rho$ simultaneously,
we are looking only for the worst case. Hence, the
quality of a cloning map T is measured by a figure
of merit of the form


$$\Delta_{X,\delta}(T) = \sup_{\rho \in X} \delta\bigl(T_*(\rho^{\otimes N}), \rho^{\otimes M}\bigr) \qquad [3]$$


Here $X \subset \mathcal{S}(\mathcal{H})$ is a set of “preferred” density
operators whose role will be explained in the next
section. An optimal cloning device is described by a
cloning map $\hat{T}$ which minimizes $\Delta_{X,\delta}$, that is,


$$\Delta_{X,\delta}(\hat{T}) \le \Delta_{X,\delta}(T) \qquad [4]$$


should hold for each cloning map T.


The Preferred Set of States 


The set $X \subset \mathcal{S}(\mathcal{H})$ of density operators introduced in
eqn [3] describes a priori knowledge about
the one-particle input state $\rho$; for example, if we
want to clone only signal states $\rho_1, \ldots, \rho_n$ used to
transmit classical information through a quantum
channel, the choice for X is $\{\rho_1, \ldots, \rho_n\}$. Other
possibilities include: $X = \mathcal{S}(\mathcal{H})$ if nothing is known
about $\rho$, the set of pure states, the states in the
“equatorial plane” of the Bloch sphere, or Gaussian
states if $\mathcal{H}$ is infinite dimensional. Each different
choice for X leads to a different variant of the
cloning problem, and we will summarize the most
relevant cases treated in the literature in the section
“Examples.”

A different kind of a priori knowledge is given by a priori
measures, that is, instead of knowing that all
possible input states lie in a special set X, we know
for each measurable set $X \subset \mathcal{S}(\mathcal{H})$ the probability
$\mu(X)$ for $\rho \in X$. Such a situation typically arises
when we are trying to clone states of systems which
originate from a source with known characteristics.
In this case, we can use mean errors,


$$\Delta_{\mu,\delta}(T) = \int_{\mathcal{S}(\mathcal{H})} \delta\bigl(T_*(\rho^{\otimes N}), \rho^{\otimes M}\bigr)\, \mu(d\rho) \qquad [5]$$


as a figure of merit. Sometimes these are easier to
compute than maximal errors as in eqn [3]. Often,
however, $\Delta_{X,\delta}$ leads to stronger results than $\Delta_{\mu,\delta}$;
therefore, we will concentrate our discussion on
maximal rather than mean errors.


The Distance Measure 


The remaining freedom in eqn [3] is the distance
measure $\delta$ and there are mainly two physically
different choices: we can either check the quality of
each clone separately or we can test, in addition, the
correlations between output systems. The most
common choice for a figure of merit for the first
type is given by (where $\operatorname{tr}_j$ denotes partial trace over
all but the jth tensor factor)


$$\Delta_1(T) = \sup_{\rho \in X,\, j} \bigl|1 - F\bigl(\operatorname{tr}_j T_*(\rho^{\otimes N}), \rho\bigr)\bigr| \qquad [6]$$


Here $F(\rho, \sigma)$ denotes the (quadratic) fidelity of $\rho$ and
$\sigma$, that is,


$$F(\rho, \sigma) = \Bigl[\operatorname{tr}\Bigl(\bigl(\rho^{1/2} \sigma \rho^{1/2}\bigr)^{1/2}\Bigr)\Bigr]^2 \qquad [7]$$


and the supremum is taken over all $\rho \in X$ and
$j = 1, \ldots, M$. $\Delta_1$ measures the worst one-particle
error of the output state $T_*(\rho^{\otimes N})$, and we will refer
to it in the following as the local error. If we are
interested in correlations too, we have to choose


$$\Delta_{\mathrm{all}}(T) = \sup_{\rho \in X} \bigl|1 - F\bigl(T_*(\rho^{\otimes N}), \rho^{\otimes M}\bigr)\bigr| \qquad [8]$$


$\Delta_{\mathrm{all}}$ measures again a “worst-case” error, but now
of the full output with respect to M uncorrelated
copies of the input $\rho$. We will call it the global error.
Alternative figures of merit arise if we replace the
fidelity in eqns [6] and [8] by other distance
measures like the trace norm, the Hilbert-Schmidt
norm, or the relative entropy. If X consists only of
pure states, the operations T which minimize $\Delta_1$ or
$\Delta_{\mathrm{all}}$ are usually not altered by such different choices.
If X is a set of mixed states, however, the correct
choice is unclear and might depend on the precise
physical context (there is, in particular, no reason to
prefer fidelities).
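
For a single qubit the fidelity can be evaluated without matrix square roots, which makes the local error easy to experiment with. The sketch below assumes the standard closed form $F(\rho,\sigma) = \operatorname{tr}(\rho\sigma) + 2\sqrt{\det\rho\,\det\sigma}$ for 2×2 density matrices, rewritten in terms of Bloch vectors; the function name `qubit_fidelity` is hypothetical and not taken from the text.

```python
from math import sqrt


def qubit_fidelity(r, s):
    """Quadratic fidelity of two qubit states with Bloch vectors r and s.

    Uses tr(rho*sigma) = (1 + r.s)/2 and det(rho) = (1 - |r|^2)/4, so that
    F = (1 + r.s + sqrt((1 - |r|^2) * (1 - |s|^2))) / 2.
    """
    dot = sum(a * b for a, b in zip(r, s))
    r2 = sum(a * a for a in r)
    s2 = sum(b * b for b in s)
    # Clamp tiny negative values caused by rounding for pure states.
    mixed_term = sqrt(max(0.0, 1.0 - r2) * max(0.0, 1.0 - s2))
    return 0.5 * (1.0 + dot + mixed_term)


pure_up = (0.0, 0.0, 1.0)
pure_down = (0.0, 0.0, -1.0)
maximally_mixed = (0.0, 0.0, 0.0)

print(qubit_fidelity(pure_up, pure_up))          # identical states
print(qubit_fidelity(pure_up, pure_down))        # orthogonal states
print(qubit_fidelity(pure_up, maximally_mixed))  # pure vs. maximally mixed
```

A local error in the sense of eqn [6] is then one minus this value, maximized over the preferred states and over the clones.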


General Properties 


Before we consider more special examples in the
next section, let us discuss some general properties
of the figure of merit $\Delta_{X,\delta}$ from eqn [3] and the
corresponding optimization problem.


Existence of Solutions 


If the distance measure $\delta$ is continuous in the first
argument, the optimization problem [4] has a
solution, that is, optimal cloning machines exist:
the set $\mathcal{T}$ of cloning maps [1] is compact and the
quantity $\Delta_{X,\delta}$ is, as a supremum over continuous
functions, lower-semicontinuous. Hence, the
statement follows from the fact that a lower-semicontinuous
function on a compact set always
admits a minimizer.

This argument can be generalized to the infinite-dimensional
case, if we choose the set $\mathcal{T}$ of allowed
cloning maps more carefully (the restriction to
normal channels proposed above is most probably
not sufficient for this purpose) and if we equip it
with an appropriate topology. The latter should be
weak enough for $\mathcal{T}$ to be compact, and strong
enough for $\Delta_{X,\delta}$ to be lower-semicontinuous. A
typical choice is the weak*-topology arising from an
embedding of $\mathcal{T}$ into the dual of a Banach space
(such that we can apply the Banach-Alaoglu
Theorem). Detailed studies in this direction are,
however, not yet available.


Covariant Cloning Maps 


To solve the optimization problem [4] is a difficult
and, in many cases, impossible task. However, it can
be simplified significantly if X and $\delta$ admit a
nontrivial symmetry group. Hence, consider again
a distance $\delta$ which is continuous and convex in its
first argument and a closed subgroup G of the group
U(d) of unitary operators on $\mathcal{H} = \mathbb{C}^d$, such that


$$U X U^* \subset X, \qquad \delta\bigl(U^{\otimes M} \rho\, U^{\otimes M *}, U^{\otimes M} \sigma\, U^{\otimes M *}\bigr) = \delta(\rho, \sigma) \qquad [9]$$


hold for all $U \in G$ and $\rho, \sigma \in \mathcal{S}(\mathcal{H}^{\otimes M})$. Then $\Delta_{X,\delta}$ is
invariant under the induced G action on the set $\mathcal{T}$ of
cloning maps, that is,


$$\Delta_{X,\delta}(\tau_U T) = \Delta_{X,\delta}(T) \quad \text{with} \quad (\tau_U T)(A) = U^{\otimes N *}\, T\bigl(U^{\otimes M} A\, U^{\otimes M *}\bigr)\, U^{\otimes N} \qquad [10]$$


holds for all $U \in G$ and all $T \in \mathcal{T}$. Convexity of
$\Delta_{X,\delta}$ in T implies (with the Haar measure $\mu_H$ on G)


$$\Delta_{X,\delta}(\bar{T}) \le \Delta_{X,\delta}(T), \quad \text{with} \quad \bar{T} = \int_G \tau_U(T)\, \mu_H(dU) \qquad [11]$$


for all T. Hence, we can replace each cloning map
by its group average $\bar{T}$ without sacrificing the
quality of the clones. This implies that $\bar{T}$ is optimal
if T is, and, since $\bar{T}$ is G-covariant,


$$\tau_U(\bar{T}) = \bar{T} \quad \forall U \in G \qquad [12]$$


we can conclude, together with the arguments from the
last section, that the optimization problem [4] always
admits covariant solutions. Similarly, we can show that
permutation invariant (sometimes called “symmetric”)
solutions exist, that is, cloners which do not prefer a
particular clone or a particular input system.

This is a very useful result, because the set of 
covariant and permutation-invariant T is much 
smaller than the set of all cloning maps, and it can 
be parametrized in terms of irreducible representa- 
tions of G and the permutation group. In particular, 
the case G=U(d) (such a T is often called 
“universal” because it does not prefer any direction 
in the Hilbert space H) leads to quite general 
solutions. 


Relationships with Quantum State Estimation 


If a procedure to estimate the input state $\rho$ from a
measurement on the N-fold system in the joint state
$\rho^{\otimes N}$ is given, there is a simple way to produce a
cloning machine: we just have to take the estimate
$\hat{\rho}$ for the density matrix $\rho$ and prepare M > N
systems in the state $\hat{\rho}^{\otimes M}$. If X is finite and
estimation (which in this case is called hypothesis
testing) is done in terms of a positive operator
valued measure $(E_\sigma)_{\sigma \in X}$, $E_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes N})$, the probability
to get the estimate $\sigma \in X$ when the input is
in the state $\rho^{\otimes N}$ is given by $\operatorname{tr}(E_\sigma \rho^{\otimes N})$. Hence, the
cloning map derived from this estimation scheme is
given by


$$\tilde{E}_*(\rho^{\otimes N}) = \sum_{\sigma \in X} \operatorname{tr}\bigl(E_\sigma \rho^{\otimes N}\bigr)\, \sigma^{\otimes M} \qquad [13]$$


A generalization to arbitrary X is straightforward,
but requires the use of measure theory. It is easy to
see that the cloning map $\tilde{E}$ from eqn [13] is in
general not optimal, in particular if M is only
slightly bigger than N. However, $\tilde{E}$ has the interesting
feature that $\Delta_{X,\delta}(\tilde{E})$ depends only on the number
of input systems, N, but not on the number of
clones, M, we want to produce. This observation
leads immediately to the conjecture that $\tilde{E}$ becomes
optimal in the limit $M \to \infty$. A general proof is
currently not available. In those cases, however,
where the optimal cloner and estimator can be explicitly
calculated for all N and M (i.e., the cases treated in
the sections “Universal pure-state cloning” and
“Phase-covariant pure-state cloning”), the conjecture
is true. A more detailed discussion of this problem
together with information about its current status
can be found on the web at http://www.imaph.
tu-bs.de/qi/problems/problems-html.
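
The relation between cloning and estimation can be made concrete in the universal pure-state case. The sketch below is not taken from the text: it assumes the closed-form one-clone fidelity $N/M + (M-N)(N+1)/(M(N+d))$ for the optimal universal cloner and the optimal pure-state estimation fidelity $(N+1)/(N+d)$, both as commonly quoted in the literature; the function names are hypothetical.

```python
from fractions import Fraction


def clone_fidelity(n, m, d):
    """Assumed one-clone fidelity of the optimal universal N -> M cloner in dimension d."""
    if not (1 <= n <= m and d >= 2):
        raise ValueError("need 1 <= n <= m and d >= 2")
    return Fraction(n, m) + Fraction((m - n) * (n + 1), m * (n + d))


def estimation_fidelity(n, d):
    """Assumed optimal fidelity of estimating a pure state from n copies in dimension d."""
    if not (n >= 1 and d >= 2):
        raise ValueError("need n >= 1 and d >= 2")
    return Fraction(n + 1, n + d)


# The estimate-and-reprepare value does not depend on M, while the
# cloning fidelity decreases towards it as M grows.
for m in (2, 10, 1000):
    gap = clone_fidelity(1, m, 2) - estimation_fidelity(1, 2)
    print(m, clone_fidelity(1, m, 2), gap)
```

Under these assumed formulas the gap equals $N(d-1)/(M(N+d))$, which is positive for every finite M and vanishes as $M \to \infty$, in line with the conjecture stated above.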


Examples 


In this section, we will discuss concrete examples 
that arise from different choices of the distance 
measure $\delta$ and the set X of preferred states.


Universal Pure-State Cloning 


The most frequently discussed case arises if X is the
set of pure states, that is, the input states are pure,
but otherwise unknown. Under this condition, it is
sufficient to consider the symmetric parts $\mathcal{H}_+^{\otimes N}$, $\mathcal{H}_+^{\otimes M}$ of the
tensor products $\mathcal{H}^{\otimes N}$, $\mathcal{H}^{\otimes M}$, and only cloning maps
$T : \mathcal{B}(\mathcal{H}_+^{\otimes M}) \to \mathcal{B}(\mathcal{H}_+^{\otimes N})$, because only this part
affects the local or the global error. A complete
solution for arbitrary N, M and all finite-dimensional
Hilbert spaces is available for $\Delta_{\mathrm{all}}$ in
Werner (1998) and for $\Delta_1$ in Keyl and Werner
(1999). Both cases admit the same (surprisingly
simple) unique solution


$$\hat{T}_*(\rho) = \frac{d[N]}{d[M]}\, S_M \bigl(\rho \otimes \mathbb{1}^{\otimes (M-N)}\bigr) S_M \qquad [14]$$


where $S_M$ is the projection onto the symmetric
tensor product $\mathcal{H}_+^{\otimes M}$ and $d[M]$ denotes the dimension
of $\mathcal{H}_+^{\otimes M}$. To derive these results, the group-theoretic
methods sketched in the section “Covariant
cloning maps” are used. The fact that global
and local figures of merit are minimized by the same
cloning map is surprising and a special feature of
pure-state cloning. It implies that correlations and
entanglement between the clones do not matter
at all.
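
To see the symmetric-projection formula at work, the following sketch evaluates it for the smallest case, one qubit in and two clones out, using plain nested lists. It assumes the map reads $\hat T_*(\rho) = (d[1]/d[2])\, S_2 (\rho \otimes \mathbb{1}) S_2$ with $S_2 = (\mathbb{1} + \mathrm{SWAP})/2$ and $d[N] = \binom{N+d-1}{N}$; all helper names are hypothetical.

```python
from math import comb, cos, sin, isclose


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices (first factor = high index)."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def sym_dim(n, d):
    """Dimension d[n] of the symmetric subspace of n systems of dimension d."""
    return comb(n + d - 1, n)


IDENTITY = [[1, 0], [0, 1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
# Projection onto the symmetric subspace of two qubits.
S2 = [[((1 if i == j else 0) + SWAP[i][j]) / 2 for j in range(4)]
      for i in range(4)]


def clone_1_to_2(rho):
    """Two-qubit output state (d[1]/d[2]) * S2 (rho x 1) S2."""
    scale = sym_dim(1, 2) / sym_dim(2, 2)
    out = matmul(matmul(S2, kron(rho, IDENTITY)), S2)
    return [[scale * x for x in row] for row in out]


def first_clone(rho2):
    """Partial trace over the second qubit (basis index = 2*a + b)."""
    return [[rho2[2 * a][2 * c] + rho2[2 * a + 1][2 * c + 1] for c in range(2)]
            for a in range(2)]


def pure_state(theta, phi):
    """State vector and density matrix for Bloch angles (theta, phi)."""
    psi = [complex(cos(theta / 2)),
           complex(cos(phi), sin(phi)) * sin(theta / 2)]
    rho = [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]
    return psi, rho


def fidelity_with_pure(psi, rho):
    """<psi| rho |psi>, the fidelity between rho and the pure state psi."""
    return sum(psi[i].conjugate() * rho[i][j] * psi[j]
               for i in range(2) for j in range(2)).real


psi, rho = pure_state(1.1, 0.7)
print(fidelity_with_pure(psi, first_clone(clone_1_to_2(rho))))
```

The printed single-clone fidelity is the same for every choice of the angles, reflecting the universality (U(d)-covariance) of the map.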


Phase-Covariant Pure-State Cloning 


Consider a fixed basis $|j\rangle$, $j = 0, \ldots, d - 1$, in $\mathcal{H}$ and
let X be the set of states given by


$$|\psi\rangle = \frac{1}{\sqrt{d}} \Bigl(|0\rangle + \sum_{j=1}^{d-1} e^{i\phi_j} |j\rangle\Bigr) \qquad [15]$$


where the $\phi_j$ denote arbitrary phases. Obviously,
this set is invariant under the group of all unitaries
which are diagonal in the given basis (i.e., a
maximal torus in U(d)). Using the methods outlined
in the section “Covariant cloning maps,” the
corresponding cloning problem is (almost) completely
solved in Buscemi et al. (2005). For arbitrary
$d = \dim \mathcal{H}$, N and all $M = N + dk$, with $k \in \mathbb{N}$, a
cloning map which minimizes global as well as
local errors is given in terms of the unitary


$$U : \mathcal{H}_+^{\otimes N} \to \mathcal{H}_+^{\otimes M}, \quad U |n_0, \ldots, n_{d-1}\rangle = |n_0 + k, \ldots, n_{d-1} + k\rangle \qquad [16]$$


where $|n_0, \ldots, n_{d-1}\rangle$, $n_j \in \mathbb{N}$, denotes the number
basis of $\mathcal{H}_+^{\otimes N}$ associated with the distinguished
basis $|j\rangle$ of $\mathcal{H}$.
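
The action of this map on the number basis is simple bookkeeping and can be checked directly. The sketch below labels basis vectors of the symmetric subspace by occupation tuples and assumes the map shifts every occupation number by k; the helper names are hypothetical.

```python
from itertools import product
from math import comb


def number_basis(n, d):
    """All occupation tuples (n_0, ..., n_{d-1}) with n_0 + ... + n_{d-1} = n."""
    return [occ for occ in product(range(n + 1), repeat=d) if sum(occ) == n]


def shift_occupations(occupation, k):
    """Image label of |n_0, ..., n_{d-1}> under the shift n_j -> n_j + k."""
    if k < 0 or any(n < 0 for n in occupation):
        raise ValueError("occupation numbers and k must be non-negative")
    return tuple(n + k for n in occupation)


d, n_in, k = 3, 2, 1
m_out = n_in + d * k
inputs = number_basis(n_in, d)
images = [shift_occupations(occ, k) for occ in inputs]

print(len(inputs), comb(n_in + d - 1, n_in))   # size of the input number basis
print(all(sum(img) == m_out for img in images))  # every image has M particles
print(len(set(images)) == len(images))           # distinct labels stay distinct
```

Since distinct basis labels are sent to distinct basis labels with total occupation $M = N + dk$, the map embeds the N-particle symmetric subspace isometrically into the M-particle one.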


Cloning Finitely Many States 


If X is a finite set of pure states, a general solution 
is not available, but there are several important 
partial results. The easiest situation arises if the 
elements of X are mutually orthogonal pure states. 
In this case, ideal cloning is possible in terms of an 
appropriately chosen unitary. If the states are 
linearly independent but nonorthogonal, ideal 
cloning is possible as well if we consider probabil- 
istic cloning machines (Duan and Guo 1998); that 
is, there is a nonvanishing probability that the 
machine fails and does not produce any clones at 
all (this means T is not unital). Optimal cloning 
(with deterministic operations) of two nonorthogonal
qubit states $\rho_j = |\psi_j\rangle\langle\psi_j|$, $j = 1, 2$, is considered
for all N, M in Bruß et al. (1998) and Chefles and
Barnett (1999) (using averaged global fidelity as
the figure of merit). The crucial observation in this
case is that the optimal clones are pure, that is,
$T_*(\rho_j^{\otimes N}) = |\Psi_j\rangle\langle\Psi_j|$, and that the $\Psi_j$ lie in the
subspace spanned by the (unattainable) ideal
clones $\psi_j^{\otimes M}$.


Universal Mixed-State Cloning 


$X = \mathcal{S}(\mathcal{H})$ means that absolutely nothing is known a
priori about the input state $\rho$. If the distance
measure $\delta$ is U(d) and permutation invariant
(which is the case for all possible choices discussed
in the section “The distance measure”), the analysis
from the section “Covariant cloning maps” shows
that a universal and symmetric minimizer exists. An
explicit solution, however, is not known, and even
the physically most appropriate choice for $\delta$ is
unclear. In contrast to the pure-state case, this is a
serious question, because the set of optimal cloners
is, in this case, much more sensitive to changes in $\delta$.
In particular, correlations among the clones become
crucial, and it is very likely that local and global
figures of merit lead to very different solutions. To
emphasize this difference, an operation which
minimizes only local errors is sometimes called
“broadcasting,” rather than cloning. A related
problem with (at least) partial solutions (“purification”)
will be discussed in the section “Purification.”


Cloning of Gaussian States


If the Hilbert space is infinite dimensional, the restriction
to a reasonably small set X of preferred states is
crucial, because otherwise the search for minimizers
becomes hopeless. A physically relevant class with nice
mathematical properties is formed by the Gaussian states
and, in particular, the coherent states. Cloning of the latter has been
studied in Cerf et al. (2005) for the case N = 1 (and M
arbitrary). As in the section “Covariant cloning maps,”
it can be shown that the search for optimal cloners can
be restricted to those which are covariant with respect
to phase space translations. This simplifies the problem
significantly and leads to the result that the global error
is minimized by Gaussian cloning maps, while in the
local case the best cloner is non-Gaussian.


Asymmetric Cloning 


In all examples discussed up to now, we have
considered symmetric cloners, that is, the quality of
all clones is measured with equal weight. Alternatively,
we can look for asymmetric cloners which produce
clones with different quality and ask for the trade-off
between them. This problem was first discussed in Cerf
(2000) and later in Iblisdir et al. (2005). It can be
regarded as a constrained optimization problem, where
the error of the first M' < M clones should be
minimized under the constraint that the error of the
rest is bounded by a fixed value. In Iblisdir et al. (2005),
it is conjectured that for pure input states and local
errors the optimal solution to this problem is given by


$$T_*(\rho) = V^* \bigl(\rho \otimes \mathbb{1}^{\otimes (M-N)}\bigr) V \qquad [17]$$


where V is a linear combination of projections in the
commutant of $\{U^{\otimes M} \mid U \in \mathrm{U}(\mathcal{H})\}$. This conjecture is
true (at least) for qubits in the case 1—n+ 1 and 
114 n. 


Related Problems 


Instead of cloning, we can also try to approximate
other impossible machines by channels which
operate on multiple inputs. To this end, we only
have to replace the figure of merit [6] by


$$\Delta_{1,\theta}(T) = \sup_{\rho \in X,\, j} \bigl|1 - F\bigl(\operatorname{tr}_j T_*(\rho^{\otimes N}), \theta(\rho)\bigr)\bigr| \qquad [18]$$


where $\theta : \mathcal{S}(\mathcal{H}) \to \mathcal{S}(\mathcal{H})$ is a (possibly nonlinear)
functional which describes the task we want to
approximate. The generalization $\Delta_{\mathrm{all},\theta}$ of $\Delta_{\mathrm{all}}$ can be
given similarly. If $\theta$ has the appropriate continuity
and symmetry properties, the discussion in the
section “General properties” applies completely,
that is, we can assume covariance and permutation
invariance, and we can consider operations which
use state estimation in an intermediate step.


Purification 


Consider N quantum systems, all originally prepared in
the same pure state $\sigma$, and then subsequently exposed
to the same (known) decoherence process, described by
a depolarizing channel R. The task of purification is to
produce M output systems which approximate the
original pure input state as well as possible. Hence,
the corresponding figure of merit arises with
$X = \{R(\sigma) \mid \sigma \text{ pure}\}$ and $\theta(\rho) = R^{-1}(\rho)$. This problem is
discussed for qubits in Cirac et al. (1999), Keyl and
Werner (2001) and D’Ariano et al. (2005). The
optimal purifier can be given explicitly for all N, M in
terms of irreducible SU(2) representations. Surprisingly,
it turns out that the output purity can be
improved even if the number of outputs, M, is larger
than the number of available input systems, N
(although N should be large enough). If we measure
purity in terms of local errors, it can be shown that, in
the limit $N \to \infty$, perfectly purified qubits can be
produced at an infinite rate (i.e., the number of output
systems per input system can become infinite). However,
we have to pay for this result with extremely large
correlations between the output systems. Therefore, the
global error does not disappear asymptotically, if we
insist on a nonvanishing rate.


Universal Not 


“Universal not” (UNOT) is an operation which
sends each pure state $\sigma$ to its orthocomplement. This
is a positive but not a completely positive operation.
Hence, it cannot be performed by any physical
device. However, we can try to approximate it by a
cloning map T operating on N input systems. The
corresponding figure of merit [18] arises if X is the
set of pure states and $\theta(\rho) = \mathbb{1} - \rho$. In Bužek et al.
(1999), it is shown that the optimal solution to this
problem (for all N and M) is to estimate and
reprepare as described in the section “Relationships
with quantum state estimation.” Approximating
UNOT is, therefore, significantly more difficult
than (pure-state) cloning, where the optimal solution
is always (for finite M) better than estimation.


See also: Channels in Quantum Information Theory;
Compact Groups and Their Representations; Positive
Maps on C*-algebras.


The purpose of this article is to introduce some of the
main ideas of optimal transportation theory. A lot
more can be found in Villani's book (Villani 2003), in
a somewhat similar spirit. Supplementary information
is also available in Ambrosio et al. (2005), Evans and
Gangbo (1999), and Rüschendorf and Rachev (1990).


Transportation Maps

Let us start with a rather abstract definition:


Definition 1 Let X and Y be two topological
spaces with Borel probability measures $\alpha$ and $\beta$,
respectively. We say that a Borel map $T: X \to Y$ is a
transportation map between $(X, \alpha)$ and $(Y, \beta)$ if, for
each Borel subset A of Y,


$$\int_{T(x) \in A} \alpha(dx) = \int_{y \in A} \beta(dy)$$


It is customary to say that T pushes forward $\alpha$ to
$\beta$, or to say that $\beta$ is the image of $\alpha$ by T. An abstract
measure-theoretic result asserts that there is always
such a transportation map T, as soon as $\alpha$ has no
atom (i.e., the $\alpha$ measure of any point $x \in X$ is zero).

A more concrete situation is when $X = \Omega_0$, $Y = \Omega_1$,
where $\Omega_0$ and $\Omega_1$ are two smooth bounded open
subsets of the d-dimensional Euclidean space $\mathbb{R}^d$. In
such a case, a classical result, due to Moser and
improved by Dacorogna and Moser (1990), reads:


Theorem 1 Let $\Omega_0$ and $\Omega_1$ be two smooth bounded
open sets in $\mathbb{R}^d$. Let $\rho_0 > 0$ and $\rho_1 > 0$ be two
smooth functions on $\mathbb{R}^d$ such that


$$\int_{\Omega_0} \rho_0(x)\,dx = \int_{\Omega_1} \rho_1(y)\,dy = 1$$


Then there is a smooth transportation map T
between $(\Omega_0, \rho_0(x)dx)$ and $(\Omega_1, \rho_1(y)dy)$. Furthermore,
T is an orientation-preserving diffeomorphism
and solves the Jacobian equation:


$$\rho_1(T(x)) \det(DT(x)) = \rho_0(x), \quad \forall x \in \Omega_0 \qquad [1]$$
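
In one dimension the Jacobian equation reduces to $\rho_1(T(x))\,T'(x) = \rho_0(x)$ and can be checked numerically. The sketch below uses an assumed example that is not from the text: $\rho_0 = 1$ and $\rho_1(y) = 2y$ on the unit interval, with $T(x) = \sqrt{x}$. Since $\rho_1$ vanishes at the endpoint 0, this pair lies slightly outside the hypotheses of the theorem and serves only to illustrate the equation and the push-forward property.

```python
from math import sqrt


def rho0(x):
    """Source density: uniform on (0, 1)."""
    return 1.0


def rho1(y):
    """Target density on (0, 1); integrates to 1."""
    return 2.0 * y


def transport(x):
    """Monotone map pushing rho0(x)dx forward to rho1(y)dy."""
    return sqrt(x)


def jacobian_residual(x, h=1e-6):
    """rho1(T(x)) * T'(x) - rho0(x), with T' from a central difference."""
    dT = (transport(x + h) - transport(x - h)) / (2.0 * h)
    return rho1(transport(x)) * dT - rho0(x)


def pushed_mass(a, b, samples=100000):
    """rho0-mass of {x : a <= T(x) <= b}, estimated by a midpoint rule."""
    hits = sum(1 for i in range(samples)
               if a <= transport((i + 0.5) / samples) <= b)
    return hits / samples


print(jacobian_residual(0.3))
print(pushed_mass(0.2, 0.5), 0.5**2 - 0.2**2)  # compare with the rho1-mass of [0.2, 0.5]
```

The second line compares the mass transported into an interval with the target mass of that interval, which is exactly the defining property of a transportation map.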


Transportation Maps with Convex 
Potentials 


An important property of Moser’s construction,
which we did not state, is the possibility of
prescribing the restriction of T along the boundary
$\partial\Omega_0$. If one does not care about this latter property,
one can improve Theorem 1 as follows (Caffarelli
1992):


Theorem 2 Assume further that $\Omega_1$ is a uniformly
strictly convex set. Then, there is a transportation
map T with a smooth convex potential, namely


$$T(x) = D\Phi(x), \quad \forall x \in \Omega_0$$


for some smooth convex function $\Phi$ defined on
$\mathbb{R}^d$ and strictly convex on $\Omega_0$. In addition, among
all Borel maps T transporting $(\Omega_0, \rho_0(x)dx)$ to
$(\Omega_1, \rho_1(y)dy)$, $D\Phi$ is the unique map that minimizes


$$\int_{\Omega_0} |T(x) - x|^2 \rho_0(x)\,dx \qquad [2]$$


where $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^d$.


Because of its characterization, $T = D\Phi$ is often
called the “optimal transportation map” with respect
to the “transportation cost” [2]. Notice that, because
of the Jacobian equation [1], $\Phi$ automatically is a
classical solution to the Monge–Ampère equation:


$$\rho_1(D\Phi(x)) \det(D^2\Phi(x)) = \rho_0(x), \quad \forall x \in \Omega_0 \qquad [3]$$

(The Monge–Ampère equation is a famous geometric
PDE, related to the search for hypersurfaces
with prescribed Gaussian curvature.) The main gain
with respect to Moser’s construction is the property
that the optimal map T has, at each $x \in \Omega_0$, a
Jacobian matrix $DT(x) = D^2\Phi(x)$ which is a positive-definite
symmetric matrix. This property was
first exploited by McCann (1997) and later by many
authors (see Villani (2003) for many references)
to prove a large series of geometric and functional
inequalities. A very fine example can be found in
Barthe (1998). Let us just consider, as an elementary
illustration, a short and sharp proof of the isoperimetric
inequality using the optimal transportation
map.


A Proof of the Isoperimetric Inequality 
Using Optimal Transportation Maps 


Let us recall the isoperimetric inequality:

Theorem 3 Let $\Omega$ be a smooth bounded open
subset in $\mathbb{R}^d$. Then

$$|\partial\Omega| \ge d\,|B_1|^{1/d}\,|\Omega|^{1-1/d}$$


holds true, where $B_1$ is the unit ball in $\mathbb{R}^d$, and $|\Omega|$ and
$|\partial\Omega|$, respectively, denote the d-dimensional volume
of $\Omega$ and the (d − 1)-dimensional Hausdorff measure
of the boundary $\partial\Omega$. In addition, the inequality
becomes an equality if and only if $\Omega$ is a ball.


To prove this result, let us define densities:


$$\rho_0(x) = \frac{1}{|\Omega|}, \quad x \in \Omega, \qquad \rho_1(y) = \frac{1}{|B_1|}, \quad y \in B_1$$


and consider the associated optimal transportation
map $D\Phi$ from $(\Omega, \rho_0(x)dx)$ to $(B_1, \rho_1(y)dy)$. From
the Monge–Ampère equation,

$$\rho_1(D\Phi(x)) \det(D^2\Phi(x)) = \rho_0(x)$$


we get:

$$\det(D^2\Phi(x)) = \frac{|B_1|}{|\Omega|}, \quad x \in \Omega \qquad [4]$$


Since the range of $D\Phi$ on $\Omega$ is the unit ball $B_1$, we
have


$$I := \int_{\partial\Omega} D\Phi(x) \cdot n(x)\, d\sigma(x) \le \int_{\partial\Omega} d\sigma(x) = |\partial\Omega|$$

where $n(x)$ and $d\sigma(x)$, respectively, denote the outward
unit normal and the (d − 1)-dimensional
Hausdorff measure along $\partial\Omega$. Using the divergence
theorem, we also have:


$$I = \int_{\Omega} \Delta\Phi(x)\,dx$$


where $\Delta\Phi(x) = \operatorname{trace}(D^2\Phi(x))$ is the Laplacian of $\Phi$.
From the arithmetic-geometric mean inequality, we know that,
for any symmetric matrix $A \ge 0$,


$$(\det A)^{1/d} \le \frac{1}{d} \operatorname{trace}(A)$$


holds true, with equality if and only if A is equal to
the identity matrix multiplied by a non-negative
scalar factor. Thus,


$$I \ge d \int_{\Omega} \bigl(\det D^2\Phi(x)\bigr)^{1/d}\,dx = d\,|B_1|^{1/d}\,|\Omega|^{1-1/d}$$


(because of [4]). So, we have obtained the isoperimetric
inequality:


$$|\partial\Omega| \ge d\,|B_1|^{1/d}\,|\Omega|^{1-1/d}$$


Let us now consider the case when this inequality
becomes an equality. Then, necessarily, for each $x \in \Omega$,
$A = D^2\Phi(x)$ satisfies $\det A = (\operatorname{trace}(A)/d)^d$ and,
therefore, must be the identity matrix multiplied by a
scalar factor $\lambda > 0$, possibly depending on x. Because
of [4], the determinant of $D^2\Phi(x)$ is constant over $\Omega$.
Thus, $\lambda > 0$ must be constant. It follows that
$D\Phi(x) = \lambda(x - a)$, for some point a in $\mathbb{R}^d$. Therefore,
$\Omega$ must be the ball centered at a of radius $1/\lambda$.
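
The inequality and its equality case are easy to check numerically for simple shapes. The following sketch compares the boundary measure of a cube and of a ball with the bound $d\,|B_1|^{1/d}\,|\Omega|^{1-1/d}$, using the standard formula $|B_1| = \pi^{d/2}/\Gamma(d/2+1)$; the helper names are hypothetical.

```python
from math import gamma, pi, isclose


def unit_ball_volume(d):
    """Volume |B_1| of the unit ball in R^d."""
    return pi ** (d / 2) / gamma(d / 2 + 1)


def isoperimetric_bound(volume, d):
    """Right-hand side d * |B_1|**(1/d) * |Omega|**(1 - 1/d)."""
    return d * unit_ball_volume(d) ** (1 / d) * volume ** (1 - 1 / d)


def cube(side, d):
    """(volume, boundary measure) of a d-dimensional cube."""
    return side ** d, 2 * d * side ** (d - 1)


def ball(radius, d):
    """(volume, boundary measure) of a d-dimensional ball."""
    volume = unit_ball_volume(d) * radius ** d
    area = d * unit_ball_volume(d) * radius ** (d - 1)
    return volume, area


for d in (2, 3, 5):
    cube_volume, cube_area = cube(1.5, d)
    ball_volume, ball_area = ball(0.7, d)
    print(d,
          cube_area > isoperimetric_bound(cube_volume, d),          # strict for the cube
          isclose(ball_area, isoperimetric_bound(ball_volume, d)))  # equality for the ball
```

The cube always has strictly more boundary than the bound requires, while the ball saturates it up to rounding.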


Monge's Optimal Transportation Problem 


Theorem 2 is one of the numerous avatars of the so-called
optimal transportation theory that goes back to
Monge's mass transfer problem, which he addressed in
1781 in the “Mémoire sur la théorie des déblais et des
remblais,” and was completely renewed by Kantorovich
in the 1940s (see, e.g., Rüschendorf and Rachev (1990)).
Let us quote a typical result, similar to
Theorem 2, but without regularity assumptions on the
data (see Brenier and Caffarelli (1992)):


Theorem 4 Let $\rho_0$ be a non-negative Lebesgue
integrable function on $\mathbb{R}^d$, such that


$$\int_{\mathbb{R}^d} \rho_0(x)\,dx = 1$$


Then for any Borel probability measure $\rho_1(dy)$ with
compact support on $\mathbb{R}^d$, there is a unique map T
transporting $\rho_0(x)dx$ to $\rho_1(dy)$, which minimizes


$$\int_{\mathbb{R}^d} |T(x) - x|^2 \rho_0(x)\,dx$$


where $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^d$. In
addition, there is a Lipschitz continuous convex
function $\Phi$ defined on $\mathbb{R}^d$ such that $T(x) = D\Phi(x)$
for $\rho_0$ almost every $x \in \mathbb{R}^d$, which implies:


$$\int_{\mathbb{R}^d} f(D\Phi(x))\,\rho_0(x)\,dx = \int_{\mathbb{R}^d} f(y)\,\rho_1(dy)$$


for all continuous functions f on $\mathbb{R}^d$.


Theorem 2, which can be interpreted as a regularity result with respect to Theorem 4, is the main output of Caffarelli's regularity theory for transportation maps with convex potentials (Caffarelli 1992). Caffarelli's analysis starts with a proof that $\Phi$ actually is a weak solution of the Monge-Ampère equation [3] in the sense of Alexandrov and is strictly convex. Then, Caffarelli shows that $D^2\Phi$ is Hölder continuous, as soon as $\rho_0$ and $\rho_1$ are Hölder continuous.

Notice that the convexity assumption for $\Omega_1$ is crucial to ensure the regularity of the convex potential. Caffarelli provided counter-examples when $\Omega_1$ is made of two separate balls attached together by a sufficiently thin pipe.

Surprisingly enough, results such as Theorem 4 
are related to concrete applications in, for example, 
astrophysics, image processing, etc. (Frisch et al. 
2002, Haker and Tannenbaum 2003). 


The Kantorovich Optimal Transportation 
Problem 


The Monge optimal transportation problem can be solved using the Kantorovich duality method, based on the key concept of "generalized transportation maps," also called "transportation plans" or "doubly stochastic measures." The abstract definition is:

Definition 2 Let $X$ and $Y$ be two topological spaces with Borel probability measures $\alpha$ and $\beta$, respectively. We say that a Borel probability measure $\mu$ on $X \times Y$ is a generalized transportation map, or a transportation plan, if its marginals are, respectively, $\alpha$ and $\beta$, namely

$$\int_{x \in A,\, y \in Y} \mu(dx, dy) = \int_{x \in A} \alpha(dx), \qquad \int_{x \in X,\, y \in B} \mu(dx, dy) = \int_{y \in B} \beta(dy) \qquad [5]$$

for all Borel subsets $A$ and $B$ of $X$ and $Y$, respectively.


The Monge-Kantorovich (MK) optimal transportation problem amounts, given a "transportation cost," that is, a continuous function $c : X \times Y \to \mathbb{R}$, to finding a minimizer for

$$I_{MK} = \inf \int c(x, y)\, \mu(dx, dy) \qquad [6]$$

where $\mu$ is required to be a transportation plan between $(X, \alpha)$ and $(Y, \beta)$. Notice that this problem is convex (and can be seen as an infinite-dimensional linear program) and its dual problem can be easily computed (using, e.g., Rockafellar's theorem in convex analysis and assuming, for simplicity, that both $X$ and $Y$ are compact).


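Theorem 5 We have

$$I_{MK} = \sup_{a,b} \left\{ \int a(x)\, \alpha(dx) + \int b(y)\, \beta(dy) \right\} \qquad [7]$$

where $(a, b)$ is any pair of continuous functions, defined on $X$ and $Y$, respectively, and subject to:

$$a(x) + b(y) \le c(x, y), \quad \forall x \in X,\ \forall y \in Y$$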


Of course, each transportation map $T$, in the sense of Definition 1, can be seen as a transportation plan $\mu$ in the Kantorovich framework, just by setting

$$\mu(dx, dy) = \delta(y - T(x))\, \alpha(dx)$$

which means

$$\int_{x \in A,\, y \in B} \mu(dx, dy) = \int_{x \in A,\, T(x) \in B} \alpha(dx)$$

for all Borel subsets $A$ and $B$ of $X$ and $Y$, respectively. Then, we have

$$\int c(x, y)\, \mu(dx, dy) = \int c(x, T(x))\, \alpha(dx)$$

So, the MK problem can be seen as a "relaxed" version of the "classical" optimal transportation problem à la Monge:

$$I_M = \inf \int c(x, T(x))\, \alpha(dx) \qquad [8]$$

where $T$ is required to be a transportation map between $(X, \alpha)$ and $(Y, \beta)$. Indeed, we have $I_{MK} \le I_M$. It turns out that, in many important situations, there is no gap between these two values, which makes the MK problem a perfectly convenient convex substitute for the original, nonconvex, Monge transportation problem. This is, in particular, the case in the situation considered in Theorem 4, when the cost function is just

$$c(x, y) = |x - y|^2$$

or, more generally, $c(x, y) = k(x - y)$, where $k$ is a uniformly strictly convex function. A typical result is:


Theorem 5 Let $\rho_0$ be a non-negative Lebesgue integrable function on $\mathbb{R}^d$, with unit integral, and $\rho_1(dy)$ be a Borel probability measure with compact support on $\mathbb{R}^d$. Let $k$ be a uniformly strictly convex function on $\mathbb{R}^d$. Then the MK problem

$$I_{MK} = \inf_{\mu} \int k(y - x)\, \mu(dx, dy)$$

where $\mu$ is required to be a transportation plan between $\rho_0(x)\,dx$ and $\rho_1(dy)$ on $\mathbb{R}^d$, has a unique solution of the form

$$\mu(dx, dy) = \delta(y - T(x))\, \rho_0(x)\,dx$$

where $T$ is the unique minimizer of the Monge problem:

$$I_M = \inf \int k(T(x) - x)\, \rho_0(x)\,dx$$

among all transportation maps $T$ between $\rho_0(x)\,dx$ and $\rho_1(dy)$ on $\mathbb{R}^d$. In addition, $I_{MK} = I_M$.
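A discrete analogue may help fix ideas. The sketch below is an illustration under simplifying assumptions that are not in the theorem: both measures are sums of $n$ atoms of mass $1/n$ on the real line, so that transportation maps are permutations, and the cost is quadratic. It compares a brute-force search over all maps with the monotone (sorted) matching, which is known to be optimal for convex costs in one dimension. The function names are illustrative.

```python
import itertools

def transport_cost(xs, ys, perm):
    """Quadratic cost of sending xs[i] to ys[perm[i]], each atom carrying mass 1/n."""
    n = len(xs)
    return sum((ys[perm[i]] - xs[i]) ** 2 for i in range(n)) / n

def brute_force_monge(xs, ys):
    """Minimize the cost over all permutations, i.e. all transport maps between atoms."""
    n = len(xs)
    return min(itertools.permutations(range(n)),
               key=lambda p: transport_cost(xs, ys, p))

def monotone_map(xs, ys):
    """Match the i-th smallest source atom with the i-th smallest target atom."""
    order_x = sorted(range(len(xs)), key=lambda i: xs[i])
    order_y = sorted(range(len(ys)), key=lambda j: ys[j])
    perm = [0] * len(xs)
    for i, j in zip(order_x, order_y):
        perm[i] = j
    return tuple(perm)

xs = [0.3, -1.2, 2.5, 0.9, -0.4]
ys = [1.1, 4.0, -2.0, 0.2, 2.7]
best_cost = transport_cost(xs, ys, brute_force_monge(xs, ys))
mono_cost = transport_cost(xs, ys, monotone_map(xs, ys))
```

In this toy setting the monotone matching plays the role of the gradient of a convex function: in one dimension, a map is the derivative of a convex potential exactly when it is nondecreasing.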


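Proof for Theorem 5 (Sketch) For simplicity, we assume that $\rho_0$ and $\rho_1$ are both compactly supported in a ball $B$ in $\mathbb{R}^d$ and we limit ourselves to the simplest cost function $k(x) = |x|^2/2$. We first denote by $M$ the set of all Borel regular probability measures $\nu$ on $B \times B$ having $\rho_0(x)\,dx$ and $\rho_1(dy)$ as marginals, which means

$$\int_{B \times B} f(x)\, \nu(dx, dy) = \int_B f(x) \rho_0(x)\,dx, \qquad \int_{B \times B} f(y)\, \nu(dx, dy) = \int_B f(y)\, \rho_1(dy)$$

for all continuous functions $f$ on $\mathbb{R}^d$. Since the marginals of $\nu$ are fixed, minimizing $\int |y - x|^2/2\; \nu(dx, dy)$ amounts to maximizing $\int x \cdot y\; \nu(dx, dy)$. From Theorem 7, we deduce:

$$\max_{\nu \in M} \int_{B \times B} x \cdot y\, \nu(dx, dy) = \inf \int_B [\Phi(x) \rho_0(x) + \Psi(x) \rho_1(x)]\,dx$$

where the infimum is taken over all pairs $(\Phi, \Psi)$ of continuous functions on $B$ satisfying

$$\Phi(x) + \Psi(y) \ge x \cdot y, \quad \forall x \in B,\ \forall y \in B$$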


Then, it can be established that the infimum is attained by a pair $(\Phi, \Psi)$ such that $\Phi$ is the restriction of a Lipschitz continuous convex function defined on $\mathbb{R}^d$, and for $\rho_0(x)\,dx$ almost every point of $\mathbb{R}^d$, $\Psi$ coincides with the Legendre-Fenchel transform of $\Phi$,

$$LF(\Phi)(y) = \sup_{x \in \mathbb{R}^d} (x \cdot y - \Phi(x))$$

Moreover, if $\nu = \nu_{opt} \in M$ maximizes $\int_{B \times B} x \cdot y\, \nu(dx, dy)$, then

$$\Phi(x) + \Psi(y) = x \cdot y$$

holds for $\nu_{opt}$-almost every $(x, y) \in \mathbb{R}^d \times \mathbb{R}^d$. Using well-known properties of the Legendre-Fenchel transform in convex analysis, one deduces that $\nu_{opt}$ is necessarily of the form

$$\nu_{opt}(dx, dy) = \delta(y - D\Phi(x))\, \rho_0(x)\,dx$$

which implies

$$\int_{\mathbb{R}^d \times \mathbb{R}^d} f(y)\, \nu_{opt}(dx, dy) = \int_{\mathbb{R}^d} f(D\Phi(x)) \rho_0(x)\,dx$$

for all continuous functions $f$ on $\mathbb{R}^d$ and completes the proof, since the second marginal of $\nu_{opt}$ is $\rho_1(dy)$.


The Wasserstein Distance 


Optimal transportation theory is strongly related to the geometric analysis of probability measures. For simplicity, let us just consider the space Prob($B$) of all Borel probability measures $\rho$ supported by some fixed ball $B$ in $\mathbb{R}^d$. This space is compact for the weak topology of measures. An equivalent definition of this topology is provided by the distance $d$, naturally attached to the MK problem:

$$d(\rho_0, \rho_1) = \inf \left( \int |x - y|^2\, \mu(dx, dy) \right)^{1/2} \qquad [9]$$

where $\mu$ is required to be a transportation plan between $\rho_0$ and $\rho_1$ on $B$. (Of course, more general convex functions $k$ can be used to define the cost function.) It has become popular to call this distance (or its generalizations for various $k$) the Wasserstein distance. It turns out that Prob($B$) equipped with this distance has a formal Riemannian structure (Otto 2001, Ambrosio et al. 2005). For instance, given two probability measures $\rho_0(x)\,dx$ and $\rho_1(x)\,dx$, we can define a "shortest path" $t \mapsto \rho(t, \cdot) \in$ Prob($B$) such that $\rho(0) = \rho_0$, $\rho(1) = \rho_1$, just by setting:

$$\rho(t, dx) = \int_B \delta(a + (D\Phi(a) - a)t - x)\, \rho_0(a)\,da, \quad \forall t \in [0, 1]$$

where $D\Phi$ is the optimal transportation map between $\rho_0$ and $\rho_1$ on $B$. This idea, which is somewhat related to the geometric analysis of hydrodynamics and various concepts of generalized flows (Arnold and Khesin 1998, Brenier), was successfully used by McCann (1997) and Otto (2001). In particular, the concept of convexity along these geodesic paths on Prob($B$) has been pointed out by McCann (1997) to be a crucial tool for new proofs of geometric and functional inequalities. Otto, and other contributors (see Ambrosio et al. (2005) for a comprehensive discussion), observed that many important parabolic or dissipative evolution PDEs can be described as "gradient flows" (or "steepest descent") of such functionals, with respect to the Wasserstein metric.
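The distance and the shortest path can be illustrated on a toy case. The sketch below assumes two empirical measures on the real line with the same number of equally weighted atoms, for which the optimal plan pairs sorted atoms (an assumption of this illustration, not a statement from the text). It computes the distance and the atoms of $\rho(t, \cdot)$, and lets one check that distances along the path scale linearly in $t$, as expected of a geodesic.

```python
import math

def wasserstein2_1d(xs, ys):
    """Quadratic Wasserstein distance between two equal-size, equal-weight samples in 1D."""
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

def displacement_interpolation(xs, ys, t):
    """Atoms a + (T(a) - a) t of the interpolated measure, with T the monotone map."""
    return [a + (b - a) * t for a, b in zip(sorted(xs), sorted(ys))]

xs = [0.3, -1.2, 2.5, 0.9, -0.4]
ys = [1.1, 4.0, -2.0, 0.2, 2.7]
t = 0.25
d01 = wasserstein2_1d(xs, ys)
rho_t = displacement_interpolation(xs, ys, t)
d0t = wasserstein2_1d(xs, rho_t)   # equals t * d01
dt1 = wasserstein2_1d(rho_t, ys)   # equals (1 - t) * d01
```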
Introduction


The exponential function, the logarithm, the trigo- 
nometric functions, and various other functions are 
often used in mathematics and physics. They are 
transcendental functions in the sense that they 
cannot be obtained by a finite number of operations 
as a solution of an algebraic (polynomial) equation. 
Typically, they are obtained by a Taylor series 
expansion. Many other higher transcendental functions arise in mathematical physics, often as solutions of differential equations. A precise knowledge of the behavior of such functions, their relation with other functions, addition, multiplication and composition properties, and representations as an infinite series or as an integral often sheds a lot of light on the problem in which they arise. If they are sufficiently useful to a large audience, then they usually get a name and are called special functions. In what follows, we describe a few of these special functions of one variable, but clearly this is just the tip of the iceberg. Many other special functions exist and we refer to the classical tables of Abramowitz and Stegun (1964) and the Bateman manuscript project (Erdélyi et al. 1953-55) for more special functions. More recently, numerous q-extensions of special functions have been introduced (see q-Special Functions).


Gamma and Beta Function 


The gamma function is defined by

$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt, \quad \Re z > 0 \qquad [1]$$

It satisfies the functional equation $\Gamma(z+1) = z\Gamma(z)$ and since $\Gamma(1) = 1$ we have $\Gamma(n+1) = n!$ for $n \in \mathbb{N}$. The gamma function therefore extends the factorial function for integers to complex numbers. The functional equation

$$\Gamma(z)\Gamma(1-z) = \frac{\pi}{\sin \pi z} \qquad [2]$$

allows one to continue the gamma function analytically to $\Re z < 0$, and the gamma function becomes an analytic function in the complex plane, except for simple poles at 0 and at all the negative integers. The residue of $\Gamma(z)$ at $z = -n$ is equal to $(-1)^n/n!$.
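Legendre's duplication formula is

$$\Gamma(2z) = \frac{2^{2z-1}}{\sqrt{\pi}}\, \Gamma(z)\Gamma(z + 1/2) \qquad [3]$$

from which one can obtain the special value $\Gamma(1/2) = \sqrt{\pi}$. Finally, two useful infinite product representations are

$$\Gamma(z) = \lim_{n \to \infty} \frac{n!\, n^z}{z(z+1)\cdots(z+n)}$$

and

$$\frac{1}{\Gamma(z)} = z e^{\gamma z} \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n}$$

where $\gamma$ is Euler's constant:

$$\gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \log n \right) = 0.5772156649\ldots \qquad [4]$$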

The beta function is a function of two variables given by

$$B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt, \quad \Re x > 0,\ \Re y > 0 \qquad [5]$$

Clearly it satisfies $B(x, y) = B(y, x)$ and it is related to the gamma function by

$$B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)} \qquad [6]$$
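These identities are easy to test numerically for real arguments. The sketch below uses `math.gamma` from the Python standard library and a plain midpoint rule for the beta integral; the sample points and the helper names are arbitrary choices made for this illustration.

```python
import math

def beta_via_gamma(x, y):
    """B(x, y) from the relation with the gamma function."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def beta_via_integral(x, y, n=20000):
    """Midpoint-rule approximation of the integral defining B(x, y)."""
    h = 1.0 / n
    return h * sum(((k + 0.5) * h) ** (x - 1) * (1 - (k + 0.5) * h) ** (y - 1)
                   for k in range(n))

z = 0.3
# Functional equation Gamma(z + 1) = z Gamma(z)
recurrence_error = abs(math.gamma(z + 1) - z * math.gamma(z))
# Reflection formula Gamma(z) Gamma(1 - z) = pi / sin(pi z)
reflection_error = abs(math.gamma(z) * math.gamma(1 - z)
                       - math.pi / math.sin(math.pi * z))
# Legendre's duplication formula
duplication_error = abs(math.gamma(2 * z)
                        - 2 ** (2 * z - 1) / math.sqrt(math.pi)
                        * math.gamma(z) * math.gamma(z + 0.5))
# Beta integral against the gamma quotient
beta_error = abs(beta_via_integral(2.5, 1.5) - beta_via_gamma(2.5, 1.5))
half_value_error = abs(math.gamma(0.5) - math.sqrt(math.pi))
```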


The gamma and beta functions are quite useful in probability theory. One of the most common probability distributions on the positive real line is the gamma distribution

$$\Pr(X \le x) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)} \int_0^x t^{\alpha-1} e^{-t/\beta}\,dt, \quad x > 0$$

The case $\alpha = 3/2$ is the Maxwell-Boltzmann distribution. The most common probability distribution on the interval [0, 1] is the beta distribution (with parameters $a, b > 0$)

$$\Pr(X \le x) = \frac{1}{B(a, b)} \int_0^x t^{a-1}(1-t)^{b-1}\,dt$$

where $0 \le x \le 1$.
The psi function is the logarithmic derivative of the gamma function

$$\psi(z) = \frac{\Gamma'(z)}{\Gamma(z)} \qquad [7]$$

It is meromorphic with simple poles at 0 and at the negative integers. Special values are $\psi(1) = -\gamma$ and

$$\psi(n+1) = -\gamma + \sum_{k=1}^{n} \frac{1}{k}$$

where $\gamma$ is Euler's constant. These can be obtained from the functional equation

$$\psi(z) = \psi(z+1) - \frac{1}{z}$$

Bessel Functions 


Bessel's differential equation is

$$x^2 y'' + x y' + (x^2 - \nu^2) y = 0 \qquad [8]$$

where derivatives are with respect to $x$ and $\nu$ is a complex number. This differential equation has a regular singularity at $x = 0$ and an irregular singularity at $x = \infty$. The standard method of finding a solution in the neighborhood of a regular singularity gives the solution

$$J_\nu(x) = (x/2)^{\nu} \sum_{k=0}^{\infty} \frac{(-1)^k (x/2)^{2k}}{k!\, \Gamma(k + \nu + 1)}$$

and $J_{-\nu}(x)$ is another solution (if $\nu \notin \mathbb{Z}$). The function $J_\nu$ is called the "Bessel function of the first kind" and $\nu$ is the "order" of the Bessel function. The series $x^{-\nu} J_\nu(x)$ is an entire function of the variable $x$. The function

$$Y_\nu(x) = \frac{J_\nu(x) \cos(\nu\pi) - J_{-\nu}(x)}{\sin(\nu\pi)}$$

is also a solution of Bessel's differential equation and is known as the "Bessel function of the second kind of order $\nu$." Two other solutions that are often used are

$$H_\nu^{(1)}(x) = J_\nu(x) + i Y_\nu(x), \qquad H_\nu^{(2)}(x) = J_\nu(x) - i Y_\nu(x)$$

which are the first and second "Hankel functions."
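Bessel functions appear if one solves the wave equation in cylindrical or spherical coordinates, using separation of variables. The Helmholtz equation $\nabla^2 F + k^2 F = 0$ in cylindrical coordinates $\rho, \phi, z$ is

$$\frac{\partial^2 F}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial F}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2 F}{\partial \phi^2} + \frac{\partial^2 F}{\partial z^2} + k^2 F = 0$$

and if we look for a solution of the form $f(\rho)g(\phi)h(z)$, then this leads to a differential equation for $f$ of the form

$$\frac{d^2 f}{d\rho^2} + \frac{1}{\rho}\frac{df}{d\rho} + \left(k^2 - \alpha^2 - \frac{\nu^2}{\rho^2}\right) f = 0$$

where $\alpha$ and $\nu$ are separation constants. The general solution is $f(\rho) = Z_\nu(\rho\sqrt{k^2 - \alpha^2})$, where $Z_\nu$ is any of the Bessel functions given above or linear combinations of them. In spherical coordinates $r, \theta, \phi$ the Helmholtz equation is

$$\frac{\partial^2 F}{\partial r^2} + \frac{2}{r}\frac{\partial F}{\partial r} + \frac{1}{r^2}\frac{\partial^2 F}{\partial \theta^2} + \frac{\cot\theta}{r^2}\frac{\partial F}{\partial \theta} + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 F}{\partial \phi^2} + k^2 F = 0$$

and for a solution of the form $f(r)g(\theta)h(\phi)$ one obtains a differential equation for $f$ of the form

$$\frac{1}{r}\frac{d^2 (r f)}{dr^2} + \left[k^2 - \frac{\nu(\nu+1)}{r^2}\right] f = 0$$

with general solution $f(r) = Z_{\nu+1/2}(kr)/\sqrt{r}$.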
Bessel functions have very simple differentiation formulas:

$$\frac{d}{dz}\left[z^{\nu} J_\nu(z)\right] = z^{\nu} J_{\nu-1}(z), \qquad \frac{d}{dz}\left[z^{-\nu} J_\nu(z)\right] = -z^{-\nu} J_{\nu+1}(z)$$

The first formula can be seen as a lowering operation, the second as a raising operation. Some integral representations are

$$J_\nu(z) = \frac{(z/2)^{\nu}}{\sqrt{\pi}\, \Gamma(\nu + 1/2)} \int_0^{\pi} \sin^{2\nu}\theta\, \cos(z\cos\theta)\, d\theta = \frac{(z/2)^{\nu}}{\sqrt{\pi}\, \Gamma(\nu + 1/2)} \int_{-1}^{1} (1 - x^2)^{\nu - 1/2} \cos zx\, dx$$

which hold for $\Re\nu > -1/2$. For real $\nu$ the Bessel function $J_\nu$ has infinitely many real zeros, and when $\nu > -1$, then all the zeros are real. All the zeros are simple (except possibly at the origin). Each of the functions $J_\nu(z)$, $Y_\nu(z)$, $H_\nu^{(1)}(z)$, or $H_\nu^{(2)}(z)$ satisfies the recurrence relation

$$z a_{\nu-1}(z) + z a_{\nu+1}(z) = 2\nu a_\nu(z)$$

and the differential-recurrence relation

$$a_{\nu-1}(z) - a_{\nu+1}(z) = 2 a_\nu'(z)$$
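The power series for $J_\nu$ and the two recurrence relations can be checked with a few lines of code. This is a minimal sketch for real $x > 0$ and real orders for which $k + \nu + 1$ is never a pole of the gamma function; the closed form $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$ used for comparison is a standard identity that is not stated in the text above, and the derivative is approximated by a central difference.

```python
import math

def bessel_j(nu, x, terms=40):
    """Partial sum of the power series for J_nu(x), for real x > 0."""
    half = x / 2.0
    return sum(
        (-1) ** k * half ** (2 * k + nu)
        / (math.factorial(k) * math.gamma(k + nu + 1))
        for k in range(terms)
    )

x, nu = 2.0, 1.5
closed_form = math.sqrt(2.0 / (math.pi * x)) * math.sin(x)
half_order_error = abs(bessel_j(0.5, x) - closed_form)

# Recurrence: x J_{nu-1}(x) + x J_{nu+1}(x) = 2 nu J_nu(x)
recurrence_residual = abs(x * bessel_j(nu - 1, x) + x * bessel_j(nu + 1, x)
                          - 2 * nu * bessel_j(nu, x))

# Differential recurrence: J_{nu-1}(x) - J_{nu+1}(x) = 2 J_nu'(x)
h = 1e-5
derivative = (bessel_j(nu, x + h) - bessel_j(nu, x - h)) / (2 * h)
diff_recurrence_residual = abs(bessel_j(nu - 1, x) - bessel_j(nu + 1, x)
                               - 2 * derivative)
```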
Modified Bessel Functions

The modified Bessel equation is

$$x^2 y'' + x y' - (x^2 + \nu^2) y = 0 \qquad [9]$$

Clearly $J_\nu(ix)$ is a solution of this equation. The "modified Bessel function of the first kind" is defined as

$$I_\nu(x) = e^{-\nu\pi i/2} J_\nu(x e^{\pi i/2}), \quad -\pi < \arg x < \pi/2 \qquad [10]$$

so that

$$I_\nu(x) = \sum_{k=0}^{\infty} \frac{(x/2)^{\nu + 2k}}{k!\, \Gamma(\nu + k + 1)}$$

If $\nu$ is not an integer, then $I_\nu(x)$ and $I_{-\nu}(x)$ are two linearly independent solutions of [9], and when $\nu = n$ is an integer one has $I_n(x) = I_{-n}(x)$. The "modified Bessel function of the second kind" is defined by

$$K_\nu(x) = \frac{\pi}{2 \sin \nu\pi} \left[ I_{-\nu}(x) - I_\nu(x) \right]$$

Some special cases of modified Bessel functions are

$$I_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\, \sinh x, \qquad I_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\, \cosh x$$

and

$$K_{1/2}(x) = \sqrt{\frac{\pi}{2x}}\, e^{-x}$$


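One has the integral representations

$$K_\nu(z) = \int_0^{\infty} e^{-z\cosh x} \cosh \nu x\, dx$$

and

$$I_\nu(z) = \frac{(z/2)^{\nu}}{\sqrt{\pi}\, \Gamma(\nu + 1/2)} \int_{-1}^{1} (1 - x^2)^{\nu - 1/2} e^{-zx}\, dx$$

whenever $\Re\nu > -1/2$. The "Airy functions" are given by

$$\mathrm{Ai}(z) = \frac{\sqrt{z}}{3} \left[ I_{-1/3}(\zeta) - I_{1/3}(\zeta) \right] = \frac{1}{\pi}\sqrt{\frac{z}{3}}\, K_{1/3}(\zeta)$$

$$\mathrm{Bi}(z) = \sqrt{\frac{z}{3}} \left[ I_{-1/3}(\zeta) + I_{1/3}(\zeta) \right]$$

where $\zeta = 2z^{3/2}/3$. They are both solutions of Airy's differential equation

$$y''(z) - z y(z) = 0$$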


Hypergeometric Series 


A power series $\sum_{n=0}^{\infty} c_n z^n$ is said to be hypergeometric when the ratio $c_{n+1}/c_n$ is a rational function of the index $n$. Most series that one finds in calculus textbooks are hypergeometric series and some of them define important special functions. When

$$\frac{c_{n+1}}{c_n} = \frac{(n + a_1)(n + a_2) \cdots (n + a_p)}{(n + b_1)(n + b_2) \cdots (n + b_q)(n + 1)}$$

then we write the corresponding series as

$${}_pF_q\left( \begin{matrix} a_1, \ldots, a_p \\ b_1, \ldots, b_q \end{matrix} ; z \right) = \sum_{n=0}^{\infty} \frac{(a_1)_n \cdots (a_p)_n}{(b_1)_n \cdots (b_q)_n} \frac{z^n}{n!} \qquad [11]$$

where $(a)_n = a(a+1)(a+2) \cdots (a+n-1)$, with $(a)_0 = 1$, is the rising factorial or Pochhammer symbol. When $p$ and $q$ are small, one also uses the notation ${}_pF_q(a_1, \ldots, a_p; b_1, \ldots, b_q; z)$ where a semicolon (;) is used to separate the parameters in the numerator from the parameters in the denominator and also to separate the parameters from the variable $z$. Some special cases are:

• the exponential series

$${}_0F_0(-; -; z) = \sum_{n=0}^{\infty} \frac{z^n}{n!} = e^z$$

• the binomial series

$${}_1F_0(-\alpha; -; -z) = \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n = (1 + z)^{\alpha}$$

• the logarithmic function

$$z\, {}_2F_1(1, 1; 2; z) = \sum_{n=0}^{\infty} \frac{z^{n+1}}{n+1} = -\log(1 - z)$$

• the Bessel function

$$(z/2)^{\nu}\, {}_0F_1(-; \nu + 1; -z^2/4) = \Gamma(\nu + 1) J_\nu(z)$$

For generic values of the parameters, we see that the hypergeometric series converges everywhere in the complex plane when $q \ge p$, it converges for $|z| < 1$ when $p = q + 1$, and for $p > q + 1$ it only converges at $z = 0$. When one of the numerator parameters is a negative integer, say $a_j = -m$, then the series is terminating and defines a polynomial of degree $m$. None of the denominator parameters is allowed to be a negative integer $-m$, unless there is a numerator parameter which is a negative integer $-k$ with $k < m$. For $q \ge p$, the hypergeometric series therefore defines an entire function which is the corresponding hypergeometric function. For $p = q + 1$, the hypergeometric series only converges in the open unit disk, but sometimes it can be continued analytically to a larger domain in the complex plane. The analytic continuation of the hypergeometric series is then called the hypergeometric function. Take for example the geometric series: it is clear that the hypergeometric series converges in the open unit disk, but the corresponding hypergeometric function is defined in the whole complex plane with a simple pole at $z = 1$. The logarithmic function $-\log(1 - z)$ has a hypergeometric series in the open unit disk, but it can be continued analytically to the complex plane with a cut along $[1, \infty)$ and a branch point at $z = 1$.
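Because a hypergeometric series is defined by the ratio of consecutive coefficients, its partial sums are straightforward to compute. The sketch below builds each term from the previous one using that ratio and checks the exponential, binomial and logarithmic special cases in floating point, plus one terminating case. The function name `hyp_series`, the sample values and the number of terms are choices made for this illustration.

```python
import math

def hyp_series(a_params, b_params, z, terms=200):
    """Partial sum of pFq(a_1..a_p; b_1..b_q; z), built from the ratio c_{n+1}/c_n."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        ratio = z / (n + 1)
        for a in a_params:
            ratio *= n + a
        for b in b_params:
            ratio /= n + b
        term *= ratio
    return total

z, alpha = 0.3, 2.5
exp_error = abs(hyp_series([], [], z) - math.exp(z))
binomial_error = abs(hyp_series([-alpha], [], -z) - (1 + z) ** alpha)
log_error = abs(z * hyp_series([1, 1], [2], z) + math.log(1 - z))

# A negative integer numerator parameter terminates the series:
# 1F0(-3; -; 2) is the polynomial (1 - z)^3 evaluated at z = 2.
polynomial_value = hyp_series([-3], [], 2.0)
```

The terminating case also shows why such a series makes sense outside the unit disk: once a term vanishes, all later terms vanish too.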


Gauss Hypergeometric Function 


The most famous hypergeometric function is the Gauss hypergeometric function defined for $|z| < 1$ by the hypergeometric series

$${}_2F_1(a, b; c; z) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n} \frac{z^n}{n!} \qquad [12]$$

which is often denoted by $F(a, b; c; z)$. It is a solution of the hypergeometric equation

$$z(1 - z) y''(z) + [c - (a + b + 1)z] y'(z) - ab\, y(z) = 0 \qquad [13]$$

and this solution is regular at $z = 0$. Obviously, ${}_2F_1(a, b; c; z) = {}_2F_1(b, a; c; z)$. The six functions ${}_2F_1(a \pm 1, b; c; z)$, ${}_2F_1(a, b \pm 1; c; z)$, and ${}_2F_1(a, b; c \pm 1; z)$ are called contiguous to ${}_2F_1(a, b; c; z)$ and there are 15 linear relations (with coefficients which are linear functions of $z$) between ${}_2F_1(a, b; c; z)$ and any two contiguous functions. Two of these relations are

$$(2a - c - az + bz) F(a, b; c; z) + (c - a) F(a - 1, b; c; z) + a(z - 1) F(a + 1, b; c; z) = 0$$

and

$$c(a - (c - b)z) F(a, b; c; z) - ac(1 - z) F(a + 1, b; c; z) + (c - a)(c - b) z F(a, b; c + 1; z) = 0$$


Euler gave the integral representation

$${}_2F_1(a, b; c; z) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c - b)} \int_0^1 x^{b-1} (1 - x)^{c-b-1} (1 - zx)^{-a}\, dx \qquad [14]$$

for $\Re c > \Re b > 0$. This allows one to find the analytic continuation from the open unit disk to the complex plane. A useful result is the Gauss summation formula

$${}_2F_1(a, b; c; 1) = \frac{\Gamma(c)\Gamma(c - a - b)}{\Gamma(c - a)\Gamma(c - b)}, \quad \Re(c - a - b) > 0$$

The special case for a terminating series is known as the Chu-Vandermonde sum

$${}_2F_1(-n, a; c; 1) = \frac{(c - a)_n}{(c)_n}$$

Pfaff's transformation is

$${}_2F_1(a, b; c; z) = (1 - z)^{-a}\, {}_2F_1\left(a, c - b; c; \frac{z}{z - 1}\right)$$

and Euler's transformation is

$${}_2F_1(a, b; c; z) = (1 - z)^{c-a-b}\, {}_2F_1(c - a, c - b; c; z)$$
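The summation and transformation formulas can be tested directly from the series. In the sketch below the Chu-Vandermonde sum, taken in its standard form ${}_2F_1(-n, a; c; 1) = (c - a)_n/(c)_n$, is verified exactly in rational arithmetic, and the Pfaff and Euler transformations are verified in floating point at one sample point inside the unit disk. All parameter values are arbitrary.

```python
from fractions import Fraction

def pochhammer(a, n):
    """Rising factorial (a)_n = a (a+1) ... (a+n-1), with (a)_0 = 1."""
    result = 1
    for k in range(n):
        result = result * (a + k)
    return result

def hyp2f1(a, b, c, z, terms):
    """Partial sum of 2F1(a, b; c; z), each term obtained from the previous one."""
    total, term = 0, 1
    for k in range(terms):
        total += term
        term = term * (a + k) * (b + k) * z / ((c + k) * (k + 1))
    return total

# Chu-Vandermonde: the series terminates, so n + 1 terms give the exact value.
n, a, c = 5, Fraction(3, 2), Fraction(7, 3)
chu_lhs = hyp2f1(-n, a, c, 1, n + 1)
chu_rhs = pochhammer(c - a, n) / pochhammer(c, n)

# Pfaff and Euler transformations at a sample point (floating point).
a, b, c, z = 0.5, 1.25, 2.0, 0.3
lhs = hyp2f1(a, b, c, z, 60)
pfaff_rhs = (1 - z) ** (-a) * hyp2f1(a, c - b, c, z / (z - 1), 60)
euler_rhs = (1 - z) ** (c - a - b) * hyp2f1(c - a, c - b, c, z, 60)
```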


Confluent Hypergeometric Function 


The hypergeometric series ${}_1F_1(a; c; z)$ defines an entire function in the complex plane and satisfies the differential equation

$$z y''(z) + (c - z) y'(z) - a y(z) = 0 \qquad [15]$$

This hypergeometric series (and the differential equation) are formally obtained from ${}_2F_1(a, b; c; z/b)$ by letting $b \to \infty$, which gives a confluence of two of the singularities at $z = \infty$. This is the reason why the differential equation [15] is known as the confluent hypergeometric equation. The solution

$$\Phi(a, c; z) = {}_1F_1(a; c; z) \qquad [16]$$

is called a confluent hypergeometric function, and a second linearly independent solution of [15] is $z^{1-c}\Phi(a - c + 1, 2 - c; z)$. The function

$$\Psi(a, c; z) = \frac{\Gamma(1 - c)}{\Gamma(a - c + 1)}\, \Phi(a, c; z) + \frac{\Gamma(c - 1)}{\Gamma(a)}\, z^{1-c}\, \Phi(a - c + 1, 2 - c; z) \qquad [17]$$


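is therefore also a solution of eqn [15]. The following integral representations hold:

$$\Phi(a, c; z) = \frac{\Gamma(c)}{\Gamma(a)\Gamma(c - a)} \int_0^1 e^{zx} x^{a-1} (1 - x)^{c-a-1}\, dx$$

whenever $\Re c > \Re a > 0$, and

$$\Psi(a, c; z) = \frac{1}{\Gamma(a)} \int_0^{\infty} e^{-zx} x^{a-1} (1 + x)^{c-a-1}\, dx$$

whenever $\Re a > 0$.

The "Whittaker functions" are defined as

$$M_{\kappa,\mu}(z) = e^{-z/2} z^{c/2}\, \Phi(a, c; z), \qquad W_{\kappa,\mu}(z) = e^{-z/2} z^{c/2}\, \Psi(a, c; z)$$

with $\kappa = -a + c/2$ and $\mu = (c - 1)/2$. They are solutions of the Whittaker equation

$$y''(z) + \left( -\frac{1}{4} + \frac{\kappa}{z} + \frac{1 - 4\mu^2}{4z^2} \right) y(z) = 0$$

The "parabolic cylinder functions" are also confluent hypergeometric functions. They are given by

$$D_\nu(z) = 2^{\nu/2} e^{-z^2/4}\, \Psi(-\nu/2, 1/2; z^2/2) = 2^{(\nu-1)/2} e^{-z^2/4} z\, \Psi((1 - \nu)/2, 3/2; z^2/2)$$

When $\nu = n$ is a non-negative integer, one finds Hermite polynomials

$$H_n(z) = 2^{n/2} e^{z^2/2} D_n(\sqrt{2}\, z)$$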


Classical Orthogonal Polynomials 


A family of polynomials $\{p_n(x), n \in \mathbb{N}\}$, where $p_n$ has degree $n$, is orthogonal on the real line if there is a positive measure $\mu$ on the real line for which

$$\int p_n(x) p_m(x)\, d\mu(x) = h_n\, \delta_{m,n} \qquad [18]$$

with $h_n > 0$.


Usually the measure $\mu$ is absolutely continuous, in which case $d\mu(x) = w(x)\,dx$ with $w$ a non-negative density function on the real line, or $\mu$ is discrete and supported on a finite or at most countable set. Any family of orthogonal polynomials satisfies a "three-term recurrence relation"

$$x p_n(x) = A_n p_{n+1}(x) + B_n p_n(x) + C_n p_{n-1}(x) \qquad [19]$$

with $C_n A_{n-1} > 0$ for every $n \ge 1$. For the monic polynomials $P_n(x) = p_n(x)/k_n$, with $k_n = 1/(A_0 A_1 A_2 \cdots A_{n-1})$, this relation becomes

$$P_{n+1}(x) = (x - b_n) P_n(x) - a_n^2 P_{n-1}(x)$$

with $b_n = B_n$ and $a_n^2 = A_{n-1} C_n$. This recurrence relation gives rise to a tridiagonal matrix

$$J = \begin{pmatrix} b_0 & a_1 & 0 & 0 & 0 & \cdots \\ a_1 & b_1 & a_2 & 0 & 0 & \cdots \\ 0 & a_2 & b_2 & a_3 & 0 & \cdots \\ 0 & 0 & a_3 & b_3 & a_4 & \cdots \\ \vdots & \vdots & \ddots & \ddots & \ddots & \ddots \end{pmatrix}$$

which is formally symmetric and which is called the "Jacobi matrix." The spectral measure of this operator, acting on $\ell_2(\mathbb{N})$, is equal to the orthogonality measure $\mu$ whenever this symmetric operator can be extended in a unique way to a self-adjoint operator. If this is not possible in a unique way (a situation which can occur for unbounded operators only), then every self-adjoint extension of $J$ gives rise to a spectral measure which can be used for the orthogonality conditions [18]. In this case, there are infinitely many positive measures which can be used in the orthogonality relations and all these measures have the same moments

$$m_n = \int_{\mathbb{R}} x^n\, d\mu(x)$$
R 


Some families of orthogonal polynomials have additional properties which are quite useful in many practical and physical applications, such as the following:

• The derivatives $p_n'$ are again a family of orthogonal polynomials (Hahn property).

• The polynomials $p_n$ satisfy a second-order linear differential equation of the form

$$\sigma(x) y''(x) + \tau(x) y'(x) = \lambda_n y(x)$$

where $\sigma$ is a polynomial of degree at most 2, $\tau$ is a polynomial of degree 1, both independent of $n$, and $\lambda_n$ is a real number (Bochner property).

• The polynomials can be obtained by a Rodrigues formula

$$w(x) p_n(x) = C_n \frac{d^n}{dx^n} \left( w(x) \sigma^n(x) \right)$$

where $w$ is a non-negative function and $\sigma$ a polynomial of degree at most 2 (Hildebrand property).

There are three families of orthogonal polynomials on the real line which have these three properties, and each of these three properties characterizes these families. These are the Hermite polynomials, the Laguerre polynomials, and the Jacobi polynomials. In a more general situation when the orthogonality relation is described by a linear functional and the functional is not required to be positive, one has an additional family of Bessel polynomials. The densities $w(x)$ for these families all satisfy a first-order differential equation $[\sigma(x) w(x)]' = \tau(x) w(x)$, where $\sigma$ is a polynomial of degree at most 2 and $\tau$ a polynomial of degree 1. This equation is known as the "Pearson equation."


Hermite Polynomials 


Hermite polynomials $H_n(x)$ are orthogonal with respect to the normal density $w(x) = e^{-x^2}$:

$$\int_{-\infty}^{\infty} H_n(x) H_m(x) e^{-x^2}\, dx = \sqrt{\pi}\, 2^n n!\, \delta_{m,n}$$

Observe that the density satisfies $w' = -2xw$ so that $\sigma = 1$ and $\tau(x) = -2x$. The recurrence relation is

$$H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x)$$

and the polynomials satisfy the second-order differential equation

$$y''(x) - 2x y'(x) + 2n y(x) = 0$$

The functions $\psi_n(x) = e^{-x^2/2} H_n(x)$ satisfy the differential equation

$$\psi_n''(x) + (2n + 1 - x^2)\psi_n(x) = 0$$

The derivatives satisfy $H_n'(x) = 2n H_{n-1}(x)$ (lowering operation) and one also has $[e^{-x^2} H_n(x)]' = -e^{-x^2} H_{n+1}(x)$ (raising operation). The Rodrigues formula is

$$e^{-x^2} H_n(x) = (-1)^n \frac{d^n}{dx^n} e^{-x^2}$$

The polynomials can be written as a hypergeometric series

$$H_n(x) = (2x)^n\, {}_2F_0(-n/2, -(n-1)/2; -; -1/x^2)$$

or alternatively as

$$H_n(x) = n! \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{(-1)^k (2x)^{n-2k}}{k!\,(n - 2k)!}$$

Their generating function is

$$\sum_{n=0}^{\infty} H_n(x) \frac{t^n}{n!} = \exp(2xt - t^2)$$

Hermite polynomials are relevant for the analysis of the quantum harmonic oscillator, and the lowering and raising operators there correspond to the annihilation and creation operators, respectively.
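The recurrence relation gives a simple way to generate Hermite polynomials exactly. The sketch below stores a polynomial as a list of integer coefficients (constant term first, a convention chosen here), builds $H_n$ from $H_{n+1} = 2xH_n - 2nH_{n-1}$ with $H_0 = 1$ and $H_1 = 2x$, and compares the result with the explicit sum and with the lowering relation $H_n' = 2nH_{n-1}$.

```python
import math

def hermite_coeffs(n):
    """Coefficients [c_0, c_1, ...] of H_n from the three-term recurrence."""
    prev, cur = [1], [0, 2]          # H_0 = 1, H_1 = 2x
    if n == 0:
        return prev
    for m in range(1, n):
        nxt = [0] + [2 * c for c in cur]      # 2x * H_m
        for i, c in enumerate(prev):
            nxt[i] -= 2 * m * c               # minus 2m * H_{m-1}
        prev, cur = cur, nxt
    return cur

def hermite_explicit(n):
    """Coefficients of H_n from the explicit sum over k = 0 .. floor(n/2)."""
    coeffs = [0] * (n + 1)
    for k in range(n // 2 + 1):
        power = n - 2 * k
        coeffs[power] = ((-1) ** k * 2 ** power * math.factorial(n)
                         // (math.factorial(k) * math.factorial(power)))
    return coeffs

def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]

h4 = hermite_coeffs(4)               # 16 x^4 - 48 x^2 + 12
lowering_ok = all(
    derivative(hermite_coeffs(n)) == [2 * n * c for c in hermite_coeffs(n - 1)]
    for n in range(1, 9)
)
```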


Laguerre Polynomials 


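Laguerre polynomials $L_n^{\alpha}(x)$ are for $\alpha > -1$ orthogonal with respect to the gamma density $w(x) = x^{\alpha} e^{-x}$ on $[0, \infty)$:

$$\int_0^{\infty} L_n^{\alpha}(x) L_m^{\alpha}(x) x^{\alpha} e^{-x}\, dx = \frac{\Gamma(n + \alpha + 1)}{n!}\, \delta_{m,n}$$

The Pearson equation is $[xw]' = (\alpha + 1 - x)w$ so that $\sigma(x) = x$ and $\tau(x) = \alpha + 1 - x$. The recurrence relation is

$$(n + 1) L_{n+1}^{\alpha}(x) = (2n + \alpha + 1 - x) L_n^{\alpha}(x) - (n + \alpha) L_{n-1}^{\alpha}(x)$$

and the differential equation is

$$x y''(x) + (\alpha + 1 - x) y'(x) + n y(x) = 0$$

The functions $\psi_n(x) = x^{\alpha/2} e^{-x/2} L_n^{\alpha}(x)$ satisfy

$$(x \psi_n')' + \left( n + \frac{\alpha + 1}{2} - \frac{x}{4} - \frac{\alpha^2}{4x} \right) \psi_n = 0$$

Differentiation has the effect that

$$[L_n^{\alpha}(x)]' = -L_{n-1}^{\alpha+1}(x)$$

and

$$[x^{\alpha} e^{-x} L_n^{\alpha}(x)]' = (n + 1) x^{\alpha-1} e^{-x} L_{n+1}^{\alpha-1}(x)$$

The Rodrigues formula is

$$x^{\alpha} e^{-x} L_n^{\alpha}(x) = \frac{1}{n!} \frac{d^n}{dx^n} \left( x^{n+\alpha} e^{-x} \right)$$

The hypergeometric expression is

$$n!\, L_n^{\alpha}(x) = (\alpha + 1)_n\, {}_1F_1(-n; \alpha + 1; x)$$

and the generating function is

$$\sum_{n=0}^{\infty} L_n^{\alpha}(x) t^n = (1 - t)^{-\alpha-1} \exp\left( -\frac{xt}{1 - t} \right)$$

Laguerre polynomials occur in the radial part of the eigenfunctions of the hydrogen atom.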
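The recurrence relation and the hypergeometric expression can be compared exactly in rational arithmetic. This is a small sketch with arbitrary sample values of $\alpha$ and $x$, starting the recurrence from $L_0^{\alpha} = 1$ and $L_1^{\alpha}(x) = \alpha + 1 - x$; the function names are illustrative.

```python
from fractions import Fraction

def laguerre_recurrence(n, alpha, x):
    """L_n^alpha(x) from (m+1) L_{m+1} = (2m + alpha + 1 - x) L_m - (m + alpha) L_{m-1}."""
    prev, cur = Fraction(1), alpha + 1 - x
    if n == 0:
        return prev
    for m in range(1, n):
        prev, cur = cur, ((2 * m + alpha + 1 - x) * cur - (m + alpha) * prev) / (m + 1)
    return cur

def laguerre_hypergeometric(n, alpha, x):
    """L_n^alpha(x) = (alpha + 1)_n / n! * 1F1(-n; alpha + 1; x), a terminating sum."""
    total, term = Fraction(0), Fraction(1)
    for k in range(n + 1):
        total += term
        term = term * (-n + k) * x / ((alpha + 1 + k) * (k + 1))
    prefactor = Fraction(1)
    for k in range(n):
        prefactor = prefactor * (alpha + 1 + k) / (k + 1)
    return prefactor * total

alpha, x = Fraction(1, 2), Fraction(3, 4)
values = [laguerre_recurrence(n, alpha, x) for n in range(8)]
```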
Jacobi Polynomials


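Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$ are orthogonal for the beta density $w(x) = (1-x)^{\alpha}(1+x)^{\beta}$ on $[-1, 1]$ whenever $\alpha > -1$ and $\beta > -1$:

$$\int_{-1}^{1} P_n^{(\alpha,\beta)}(x) P_m^{(\alpha,\beta)}(x) (1-x)^{\alpha} (1+x)^{\beta}\, dx = \frac{2^{\alpha+\beta+1}}{2n + \alpha + \beta + 1} \frac{\Gamma(n + \alpha + 1)\Gamma(n + \beta + 1)}{\Gamma(n + \alpha + \beta + 1)\, n!}\, \delta_{m,n}$$

The Pearson equation is $[(1 - x^2) w]' = [\beta - \alpha - (\alpha + \beta + 2)x] w$ and the differential equation is

$$(1 - x^2) y''(x) + [\beta - \alpha - (\alpha + \beta + 2)x] y'(x) + n(n + \alpha + \beta + 1) y(x) = 0$$

Differentiation has the effect

$$\left[ P_n^{(\alpha,\beta)}(x) \right]' = \frac{n + \alpha + \beta + 1}{2}\, P_{n-1}^{(\alpha+1,\beta+1)}(x)$$

and

$$\left[ (1 - x)^{\alpha} (1 + x)^{\beta} P_n^{(\alpha,\beta)}(x) \right]' = -2(n + 1) (1 - x)^{\alpha-1} (1 + x)^{\beta-1} P_{n+1}^{(\alpha-1,\beta-1)}(x)$$

The Rodrigues formula is

$$(1 - x)^{\alpha} (1 + x)^{\beta} P_n^{(\alpha,\beta)}(x) = \frac{(-1)^n}{2^n n!} \frac{d^n}{dx^n} \left[ (1 - x)^{n+\alpha} (1 + x)^{n+\beta} \right]$$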

In terms of hypergeometric series, one has

$$P_n^{(\alpha,\beta)}(x) = \frac{(\alpha + 1)_n}{n!}\, {}_2F_1\left( \begin{matrix} -n, n + \alpha + \beta + 1 \\ \alpha + 1 \end{matrix} ; \frac{1 - x}{2} \right)$$

Observe that one has $P_n^{(\alpha,\beta)}(-x) = (-1)^n P_n^{(\beta,\alpha)}(x)$. Special cases of the Jacobi polynomials are as follows:

• The "Legendre polynomials" $P_n(x) = P_n^{(0,0)}(x)$. They appear when the Laplacian is separated in spherical coordinates as functions of the polar angle $\theta$, for which $x = \cos\theta$.

• The "Chebyshev polynomials" of the first kind

$$T_n(x) = P_n^{(-1/2,-1/2)}(x) / P_n^{(-1/2,-1/2)}(1)$$

and of the second kind

$$U_n(x) = (n + 1) P_n^{(1/2,1/2)}(x) / P_n^{(1/2,1/2)}(1)$$

These functions are more easily written by using the change of variable $x = \cos\theta$ and then $T_n(\cos\theta) = \cos n\theta$ and $U_n(\cos\theta) = \sin(n + 1)\theta / \sin\theta$.

• The "Gegenbauer polynomials" or ultraspherical polynomials are Jacobi polynomials with equal parameters:

$$C_n^{\lambda}(x) = \frac{(2\lambda)_n}{(\lambda + 1/2)_n} P_n^{(\lambda - 1/2, \lambda - 1/2)}(x)$$

Gegenbauer polynomials are involved in the angular or spatial part of the wave function of physical systems in a central potential in both position and momentum space, and in the spatial part of the wave function of hydrogenic systems in momentum space, as well as in the eigenfunctions of several quantum-mechanical potentials, such as the relativistic harmonic oscillator.
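The trigonometric form of the Chebyshev polynomials is easy to confirm numerically. The sketch below relies on the standard three-term recurrences $T_{n+1} = 2xT_n - T_{n-1}$ and $U_{n+1} = 2xU_n - U_{n-1}$, which are not stated in the text above and are used here as an assumption of the illustration, and compares the results with $\cos n\theta$ and $\sin(n+1)\theta/\sin\theta$ at one sample angle.

```python
import math

def chebyshev_t(n, x):
    """T_n(x) from T_0 = 1, T_1 = x and T_{m+1} = 2x T_m - T_{m-1}."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

def chebyshev_u(n, x):
    """U_n(x) from U_0 = 1, U_1 = 2x and U_{m+1} = 2x U_m - U_{m-1}."""
    prev, cur = 1.0, 2 * x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

theta = 0.7
x = math.cos(theta)
t_errors = [abs(chebyshev_t(n, x) - math.cos(n * theta)) for n in range(11)]
u_errors = [abs(chebyshev_u(n, x) - math.sin((n + 1) * theta) / math.sin(theta))
            for n in range(11)]
```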


Other Classical Orthogonal Polynomials 


Instead of restricting attention to the differential operator $D = d/dx$, one can also use the (forward) difference operator $\Delta$ for which $\Delta f(x) = f(x+1) - f(x)$, the divided difference operator which acts on $f$ as $\Delta f(\lambda(x))/\Delta\lambda(x)$ with a quadratic function $\lambda$, or certain q-difference operators, and look for orthogonal polynomials that satisfy difference equations in the variable $x$. Together with the three-term recurrence relation (in the degree $n$), one then has families of polynomials satisfying a bispectral problem. For the difference operator and the divided difference operator, this gives several important families of orthogonal polynomials which all have a hypergeometric representation. These hypergeometric polynomials are usually listed in a table, and each level indicates the number of parameters and/or the order of the hypergeometric function. This table is known as Askey's table and is given in Figure 1. The extension with q-difference operators involves basic hypergeometric series and q-extensions of classical orthogonal polynomials.

"Charlier polynomials" $C_n(x; a)$ are orthogonal with respect to the Poisson distribution

$$\sum_{k=0}^{\infty} C_n(k; a) C_m(k; a) \frac{a^k}{k!} = a^{-n} e^{a} n!\, \delta_{m,n}$$

The recurrence relation is

$$a C_{n+1}(x; a) + (x - n - a) C_n(x; a) + n C_{n-1}(x; a) = 0$$

and the second-order difference equation is

$$a y(x+1) + (n - x - a) y(x) + x y(x-1) = 0$$
Figure 1 Askey's table.


The forward difference operator has the effect $\Delta C_n(x; a) = -(n/a)\, C_{n-1}(x; a)$ and the backward difference operator $\nabla f(x) = f(x) - f(x-1)$ has the effect $\nabla[a^x/x!\; C_n(x; a)] = a^x/x!\; C_{n+1}(x; a)$. The hypergeometric representation is $C_n(x; a) = {}_2F_0(-n, -x; -; -1/a)$. Observe that the variable $x$ appears as a parameter of the hypergeometric series.
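Since the ${}_2F_0$ series terminates, Charlier polynomials can be evaluated exactly. The sketch below checks, in rational arithmetic, the three-term recurrence and the forward difference relation in the forms $aC_{n+1} + (x - n - a)C_n + nC_{n-1} = 0$ and $\Delta C_n = -(n/a)C_{n-1}$, and checks the Poisson orthogonality relation in floating point after truncating the sum at $k = 80$. The truncation point and the sample value of $a$ are assumptions of this illustration.

```python
import math
from fractions import Fraction

def charlier(n, x, a):
    """C_n(x; a) = 2F0(-n, -x; -; -1/a), a terminating sum of n + 1 terms."""
    total, term = Fraction(0), Fraction(1)
    for k in range(n + 1):
        total += term
        term = term * (-n + k) * (-x + k) * Fraction(-1) / (a * (k + 1))
    return total

a = Fraction(3, 2)
x = Fraction(7, 3)

recurrence_residuals = [
    a * charlier(n + 1, x, a) + (x - n - a) * charlier(n, x, a)
    + n * charlier(n - 1, x, a)
    for n in range(1, 6)
]
difference_residuals = [
    charlier(n, x + 1, a) - charlier(n, x, a) + Fraction(n) / a * charlier(n - 1, x, a)
    for n in range(1, 6)
]

def poisson_inner_product(n, m, a, cutoff=80):
    """Truncated sum of C_n(k) C_m(k) a^k / k! over k = 0 .. cutoff."""
    return sum(float(charlier(n, k, a) * charlier(m, k, a))
               * float(a) ** k / math.factorial(k) for k in range(cutoff + 1))

norm_2 = poisson_inner_product(2, 2, a)
expected_norm_2 = float(a) ** -2 * math.exp(float(a)) * math.factorial(2)
cross_term = poisson_inner_product(2, 3, a)
```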

"Krawtchouk polynomials" $K_n(x; p, N)$ are orthogonal with respect to the binomial distribution:

$$\sum_{k=0}^{N} K_n(k; p, N) K_m(k; p, N) \binom{N}{k} p^k (1 - p)^{N-k} = \frac{(-1)^n n!}{(-N)_n} \left( \frac{1 - p}{p} \right)^n \delta_{m,n}$$

where $N$ is a positive integer and $0 < p < 1$. They are given by $K_n(x; p, N) = {}_2F_1(-n, -x; -N; 1/p)$ and correspond to Meixner polynomials for which the parameter $\beta$ is a negative integer.
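For a finite family such as this one the orthogonality relation is a finite sum and can be verified exactly. The sketch below evaluates the terminating ${}_2F_1$ in rational arithmetic and compares the full Gram matrix with the norms $\frac{(-1)^n n!}{(-N)_n}\left(\frac{1-p}{p}\right)^n$, the normalization assumed in this illustration. The values $N = 4$ and $p = 1/3$ are arbitrary.

```python
from fractions import Fraction
from math import comb, factorial

def krawtchouk(n, x, p, N):
    """K_n(x; p, N) = 2F1(-n, -x; -N; 1/p) for 0 <= n <= N, as a terminating sum."""
    total, term = Fraction(0), Fraction(1)
    for k in range(n + 1):
        total += term
        if k < n:  # avoid touching the zero denominator (-N + k) when k = N
            term = term * (-n + k) * (-x + k) / ((-N + k) * (k + 1) * p)
    return total

def gram_entry(n, m, p, N):
    """Sum over k of K_n(k) K_m(k) times the binomial weight."""
    return sum(krawtchouk(n, k, p, N) * krawtchouk(m, k, p, N)
               * comb(N, k) * p ** k * (1 - p) ** (N - k) for k in range(N + 1))

def expected_norm(n, p, N):
    """(-1)^n n! / (-N)_n * ((1 - p) / p)^n."""
    rising = Fraction(1)
    for j in range(n):
        rising *= -N + j
    return (-1) ** n * factorial(n) / rising * ((1 - p) / p) ** n

N, p = 4, Fraction(1, 3)
gram = [[gram_entry(n, m, p, N) for m in range(N + 1)] for n in range(N + 1)]
```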

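"Meixner polynomials" $M_n(x; \beta, c)$ are orthogonal with respect to the negative binomial distribution (Pascal distribution)

$$\sum_{k=0}^{\infty} M_n(k; \beta, c) M_m(k; \beta, c) \frac{(\beta)_k c^k}{k!} = \frac{c^{-n} n!}{(\beta)_n (1 - c)^{\beta}}\, \delta_{m,n}$$

where $\beta > 0$ and $0 < c < 1$. They are given by $M_n(x; \beta, c) = {}_2F_1(-n, -x; \beta; 1 - 1/c)$.

"Meixner-Pollaczek polynomials" $P_n^{\lambda}(x; \phi)$ are orthogonal on $(-\infty, \infty)$:

$$\int_{-\infty}^{\infty} P_n^{\lambda}(x; \phi) P_m^{\lambda}(x; \phi) e^{(2\phi - \pi)x} |\Gamma(\lambda + ix)|^2\, dx = \frac{2\pi\, \Gamma(n + 2\lambda)}{(2 \sin\phi)^{2\lambda} n!}\, \delta_{m,n}$$

where $\lambda > 0$ and $0 < \phi < \pi$. The appropriate difference operator $\delta$ has an imaginary shift $\delta f(x) = f(x + i/2) - f(x - i/2)$ and one has $\delta P_n^{\lambda}(x; \phi) = 2i \sin\phi\, P_{n-1}^{\lambda + 1/2}(x; \phi)$. They are given by

$$P_n^{\lambda}(x; \phi) = \frac{(2\lambda)_n}{n!} e^{in\phi}\, {}_2F_1\left( \begin{matrix} -n, \lambda + ix \\ 2\lambda \end{matrix} ; 1 - e^{-2i\phi} \right)$$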


"Hahn and dual Hahn polynomials" are orthogonal on a finite set of points. Hahn polynomials are given by

$$Q_n(x; \alpha, \beta, N) = {}_3F_2\left( \begin{matrix} -n, n + \alpha + \beta + 1, -x \\ \alpha + 1, -N \end{matrix} ; 1 \right)$$

and their orthogonality is with respect to a hypergeometric distribution on $\{0, 1, \ldots, N\}$. The appropriate difference operator is the (forward) difference operator $\Delta$. They are related to the 3-j symbols or Wigner coefficients that arise when considering angular momenta in two quantum systems. Dual Hahn polynomials are given by

$$R_n(\lambda(x); \gamma, \delta, N) = {}_3F_2\left( \begin{matrix} -n, -x, x + \gamma + \delta + 1 \\ \gamma + 1, -N \end{matrix} ; 1 \right)$$

where $\lambda(x) = x(x + \gamma + \delta + 1)$. They are obtained from the Hahn polynomials by interchanging the roles of $n$ and $x$. They are orthogonal on the set $\{\lambda(0), \lambda(1), \ldots, \lambda(N)\}$. The appropriate difference operator is the divided difference operator which acts on $f$ as $\Delta f(\lambda(x))/\Delta\lambda(x)$.

"Continuous Hahn and dual Hahn polynomials" are orthogonal on the real line. The continuous Hahn polynomials are

$$p_n(x; a, b, c, d) = i^n \frac{(a + c)_n (a + d)_n}{n!}\, {}_3F_2\left( \begin{matrix} -n, n + a + b + c + d - 1, a + ix \\ a + c, a + d \end{matrix} ; 1 \right)$$

and the appropriate difference operator is the difference operator $\delta$ with imaginary shift. The continuous dual Hahn polynomials are

$$S_n(x^2; a, b, c) = (a + b)_n (a + c)_n\, {}_3F_2\left( \begin{matrix} -n, a + ix, a - ix \\ a + b, a + c \end{matrix} ; 1 \right)$$

and the appropriate difference operator is the divided difference operator which acts on $f$ as $\delta f(x^2)/\delta x^2$.
“Wilson polynomials” are the most general system of hypergeometric polynomials satisfying a bispectral problem. All the other classical orthogonal polynomials can be obtained from them by taking appropriate parameters or as limiting cases. They are given by

$$\frac{W_n(x^2;a,b,c,d)}{(a+b)_n (a+c)_n (a+d)_n} = {}_4F_3\!\left(\begin{matrix} -n,\ n+a+b+c+d-1,\ a+ix,\ a-ix \\ a+b,\ a+c,\ a+d \end{matrix};\ 1\right),$$

and for $\Re(a,b,c,d) > 0$ (with nonreal parameters appearing in conjugate pairs) they are orthogonal on the positive real line with respect to the weight function

$$w(x) = \left| \frac{\Gamma(a+ix)\,\Gamma(b+ix)\,\Gamma(c+ix)\,\Gamma(d+ix)}{\Gamma(2ix)} \right|^2.$$
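As one illustration of a limiting case, letting $d \to \infty$ in $W_n(x^2;a,b,c,d)/(a+d)_n$ gives the continuous dual Hahn polynomial $S_n(x^2;a,b,c) = (a+b)_n(a+c)_n\, {}_3F_2(-n, a+ix, a-ix; a+b, a+c; 1)$, because $(n+a+b+c+d-1)_k/(a+d)_k \to 1$ term by term. The sketch below checks this numerically for one large value of $d$; the function names and the sample parameters are assumptions made for illustration only.

```python
import math


def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = 1
    for j in range(k):
        result *= a + j
    return result


def terminating_series(n, upper, lower):
    """Sum_{k=0}^{n} (-n)_k prod(upper)_k / (prod(lower)_k k!), argument 1."""
    total = 0
    for k in range(n + 1):
        term = pochhammer(-n, k) / math.factorial(k)
        for a in upper:
            term *= pochhammer(a, k)
        for b in lower:
            term /= pochhammer(b, k)
        total += term
    return total


def wilson(n, x, a, b, c, d):
    """W_n(x^2; a, b, c, d) from the terminating 4F3 series."""
    prefactor = pochhammer(a + b, n) * pochhammer(a + c, n) * pochhammer(a + d, n)
    series = terminating_series(
        n, [n + a + b + c + d - 1, a + 1j * x, a - 1j * x], [a + b, a + c, a + d])
    return prefactor * series


def continuous_dual_hahn(n, x, a, b, c):
    """S_n(x^2; a, b, c) from the terminating 3F2 series."""
    prefactor = pochhammer(a + b, n) * pochhammer(a + c, n)
    return prefactor * terminating_series(n, [a + 1j * x, a - 1j * x], [a + b, a + c])


if __name__ == "__main__":
    n, x, a, b, c = 2, 0.7, 0.5, 1.0, 1.5
    d = 1.0e6  # large but finite stand-in for the limit d -> infinity
    scaled_wilson = wilson(n, x, a, b, c, d) / pochhammer(a + d, n)
    print(abs(scaled_wilson - continuous_dual_hahn(n, x, a, b, c)))
```

The printed difference shrinks roughly like $1/d$, so increasing `d` gives a quick visual check that the limit is being approached.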


“Racah polynomials” can be obtained from Wilson polynomials when the parameters are such that one of $a+b$, $a+c$, or $a+d$ is a negative integer $-N$. They are given by

$$R_n(\lambda(x);\alpha,\beta,\gamma,\delta) = {}_4F_3\!\left(\begin{matrix} -n,\ n+\alpha+\beta+1,\ -x,\ x+\gamma+\delta+1 \\ \alpha+1,\ \beta+\delta+1,\ \gamma+1 \end{matrix};\ 1\right),$$

where $\alpha+1 = -N$ or $\beta+\delta+1 = -N$ or $\gamma+1 = -N$, and $N$ is a non-negative integer. They are orthogonal on the finite set $\{\lambda(0), \lambda(1), \ldots, \lambda(N)\}$, where $\lambda(x) = x(x+\gamma+\delta+1)$. They arise as 6-j symbols in the coupling of three angular momenta.


See also: Combinatorics: Overview; Compact Groups
and their Representations; Integrable Systems:
Overview; Painlevé Equations; q-Special Functions;
Random Matrix Theory in Physics; Separation of
Variables for Differential Equations.
T. D. Lee and C. N. Yang, “Statistical Theory of Equations of State and Phase Transitions. II. Lattice Gas and Ising Model”, Phys. Rev. (2) 87 (1952), 410-419; T. Asano, “Theorems on the partition functions of the Heisenberg ferromagnets”, J. Phys. Soc. Japan 29 (1970), 350-359; D. Ruelle, “Extension of the Lee-Yang circle theorem”, Phys. Rev. Lett. 26 (1971), 303-304; D. Ruelle, “Characterization of Lee-Yang polynomials”, Annals of Mathematics 171 (2010), 589-603.
Encyclopedia of Mathematical Physics, edited by Jean-Pierre Françoise, Gregory L. Naber, and Tsou Sheung Tsun (Elsevier, 2006).
